slurm
The Pittsburgh Supercomputing Center (PSC) uses Slurm for job scheduling. It's an alternative to PBS or Torque. We interact with Slurm using sbatch and squeue.
Resources
See
Usage
From a “head” node (i.e. after ssh bridges2.psc.edu), interactively run:
export SUBJECT=ABCD
sbatch -o $logfile -e $logfile -J $jobname my_batch_job.bash

# can also use --export to explicitly pass the variables used by the batch script
sbatch --export=ALL,SUBJECT=ABCD my_batch_job.bash
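Once a job is submitted, squeue shows where it sits in the queue and scancel removes it. A minimal sketch follows; the job ID 1234567 is only illustrative, sbatch prints the real one at submit time.

# list your own pending and running jobs
squeue -u $USER

# cancel a job by its numeric job id
scancel 1234567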
Before launching a job, you need to know
- how long the job will take (“walltime”). Overestimate.
  - if you underestimate, the job will be killed by the scheduler before it finishes
  - the higher your estimate, the longer it'll take your job to leave the queue and start running
  - 1000s of very short jobs will also be penalized by the scheduler
- how many cores (forks, threads, tasks) to use. This sets the CPU-hour “billing.”
  - You're charged walltime × requested cores. Even if you don't use the cores, you're blocking others from them. See the example after this list.
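As a rough worked example of that billing (the numbers are made up): a job requesting 8 cores for a 4-hour walltime is billed 8 × 4 = 32 CPU-hours, whether or not all 8 cores stay busy. Both settings can be given directly on the command line; my_batch_job.bash is the same example script used above.

# request 4 hours of walltime and 8 tasks on one node
sbatch --time=4:00:00 --nodes=1 --ntasks-per-node=8 my_batch_job.bash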
The script given to sbatch should contain special #SBATCH comments for any Slurm node settings not specified on the command line. Usually this is the expected runtime and the number of CPU cores. The script should use environment variables instead of input arguments/options. Here we use $SUBJECT in the script and export SUBJECT= before submitting with sbatch.
#!/bin/bash
#SBATCH --partition=RM-shared
#SBATCH --ntasks-per-node=1
#SBATCH --nodes=1
#SBATCH --time=08:00:00
# above allocates 8 hours
# Acceptable time formats include:
# "minutes", "minutes:seconds", "hours:minutes:seconds",
# "days-hours", "days-hours:minutes", "days-hours:minutes:seconds"

# example command using the exported "$SUBJECT" environment variable
long_running_process /path/to/$SUBJECT
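Options passed to sbatch on the command line override the matching #SBATCH directives in the script, so the same script can be reused with a different walltime for a quick test. A minimal sketch, with illustrative values:

# override the in-script 8-hour limit with a 30-minute limit
export SUBJECT=ABCD
sbatch --time=30:00 --export=ALL,SUBJECT=ABCD my_batch_job.bash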