
Basic Parallelisation in Bash

I have recently started writing some quite CPU-intensive code, and since we have a nice cluster here (without any management software on it) I decided it would be best to take advantage of the number of cores on its machines. Actually, this also works nicely on my desktop, which has four cores (and 4GB of RAM). Essentially, I'm lazy: this runs my pipeline in parallel for different sources by sending different jobs off to different cores (up to the maximum number that you specify), and when they finish it runs the next few, and so on until they are all done. No longer do I have to wait around for jobs to finish and waste time by missing the moment they do. I also no longer need lots of screen sessions or loads of terminals open... bliss...

Here is my basic Bash script:

#!/bin/bash
#By Samuel George; www.krioma.net
#Original: 27/02/2010
#My scripts run in sub directories, replace with your list of commands to run.
#nullglob leaves the array empty (rather than holding a literal "*/") if there are no directories
shopt -s nullglob
array=(*/)
len=${#array[@]}
#maximum number of jobs to run at once
maxjobs=2
jobnumber=0
#loop until every entry in array has been started
while [ "$jobnumber" -lt "$len" ]; do
    jobsrunning=0
    #start jobs up till maximum and then wait for them to finish before continuing.
    while [ "$jobsrunning" -lt "$maxjobs" ] && [ "$jobnumber" -lt "$len" ]; do
        #go into the directory and run - this is an oddity of my processing
        #replace with your own functions
        #run.sh is looked up on your PATH; use ./run.sh if each directory has its own copy
        #the subshell means there is no need to cd back afterwards
        ( cd "${array[$jobnumber]}" && run.sh ) &
        #add to counter so that you know how many jobs are running at once.
        jobsrunning=$(( jobsrunning + 1 ))
        #keep a running total, such that the script will loop over all the jobs you want running
        jobnumber=$(( jobnumber + 1 ))
    done
    #block until every job in this batch has finished
    wait
    #keep a listing of where you are.
    echo "$jobnumber"
done

Stick this in a shell script, chmod 700 file.sh and job done.
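
If you would rather not hard-code maxjobs, one option is to derive it from the number of cores on the machine. This is only a minimal sketch, not part of the original script: it assumes getconf is installed and understands the _NPROCESSORS_ONLN variable (the case on typical Linux and macOS systems), and it falls back to a single job otherwise.

```shell
#!/bin/bash
# Sketch: pick maxjobs from the number of online cores.
# Assumption: getconf supports _NPROCESSORS_ONLN on this system.
maxjobs=$(getconf _NPROCESSORS_ONLN 2>/dev/null)

# Fall back to 1 if the lookup failed or returned something non-numeric.
case "$maxjobs" in
    ''|*[!0-9]*) maxjobs=1 ;;
esac

# Guard against a zero result, which would stop any job from starting.
[ "$maxjobs" -ge 1 ] || maxjobs=1

echo "maxjobs=$maxjobs"
```

On a shared cluster node you may want a smaller number than this so that other people's jobs still get a look in.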

There is definitely room for improvement here. For example, this code waits until all of the processes started in the inner loop have finished; ideally you'd want it to start a new job as soon as the first one finishes. Watch this space... well, that might not happen, since my tasks all take about the same time to finish.
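
The improvement described above could be sketched roughly as follows. It relies on `wait -n`, which needs Bash 4.3 or later and returns as soon as any one background job exits. The names run_pool and run_in_dir are hypothetical helpers made up for this illustration, and run_in_dir assumes each directory holds its own run.sh.

```shell
#!/bin/bash
# Sketch of a rolling job pool. Requires bash 4.3+ for `wait -n`.
# run_pool is a hypothetical helper, not part of the original script.
# Usage: run_pool MAXJOBS COMMAND ITEM...
# Runs COMMAND once per ITEM, keeping at most MAXJOBS running at a time.
run_pool() {
    local maxjobs=$1 cmd=$2
    shift 2
    local running=0 item
    for item in "$@"; do
        # Pool is full: block until any one job exits, then reuse its slot.
        if [ "$running" -ge "$maxjobs" ]; then
            wait -n
            running=$(( running - 1 ))
        fi
        "$cmd" "$item" &
        running=$(( running + 1 ))
    done
    # Wait for whatever is still running at the end.
    wait
}

# Hypothetical stand-in for the per-directory work:
# enter the directory (in a subshell) and run its run.sh.
run_in_dir() {
    ( cd "$1" && ./run.sh )
}

# Example call, commented out so the sketch has no side effects:
# run_pool 4 run_in_dir */
```

With jobs that all take about the same time this behaves much like the batch version; the difference only shows when one job in a batch runs much longer than the rest.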


About

This page contains a single entry from the blog posted on February 28, 2010 5:04 AM.
