slurm
The Pittsburgh Supercomputing Center uses slurm for job scheduling. It is an alternative to PBS/Torque. We interact with slurm mainly through sbatch (submit jobs) and squeue (inspect the queue).
Resources
See the PSC user guide for the full slurm documentation.
Usage
  export SUBJECT=ABCD
  sbatch -o $logfile -e $logfile -J $jobname my_batch_job.bash
  # can also use --export to explicitly pass variables used by the batch script
  sbatch --export=ALL,SUBJECT=ABCD my_batch_job.bash
Before launching a job, you need to know:
- how long the job will take ("walltime"). Overestimate.
  - if you underestimate, the scheduler will kill the job before it finishes
  - the higher your estimate, the longer your job will wait in the queue before starting
  - thousands of very short jobs will also be penalized by the scheduler
- how many cores (forks, threads, tasks) to use. This determines the CPU-hour "billing."
  - you are charged walltime × requested cores; even if you don't use the cores, you are blocking others from them
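As a concrete example of the billing rule (with hypothetical numbers, not PSC rates): an 8-hour job that requests 4 cores is charged 32 core-hours whether or not all 4 cores do useful work.

  # hypothetical numbers: 8 hours of walltime with 4 requested cores
  walltime_hours=8
  cores=4
  billed=$((walltime_hours * cores))
  echo "$billed core-hours"   # walltime * requested cores
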
Use sbatch
to submit a bash script containing special comment lines (#SBATCH directives) that set slurm job options (expected runtime, number of CPU cores). Use environment variables to pass settings into the script.
  #!/bin/bash
  #SBATCH --partition=RM-shared
  #SBATCH --ntasks-per-node=1
  #SBATCH --nodes=1
  #SBATCH --time=8:00:00
  # the line above allocates 8 hours; note that "8:00" alone would mean
  # 8 minutes ("minutes:seconds"), not 8 hours.
  # Acceptable time formats include:
  # "minutes", "minutes:seconds", "hours:minutes:seconds",
  # "days-hours", "days-hours:minutes", "days-hours:minutes:seconds"
  
  # example command using the exported "$SUBJECT" environment variable
  long_running_process /path/to/$SUBJECT
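Because "8:00" parses as minutes:seconds while "8:00:00" is hours:minutes:seconds, it is easy to request far less walltime than intended. A small sanity-check helper (hypothetical, not part of slurm) that converts an H:M:S --time value to total minutes before you submit:

  # hypothetical helper: convert an H:M:S slurm --time value to minutes
  to_minutes() {
    IFS=: read -r h m s <<< "$1"
    # 10# forces base-10 so fields like "09" are not parsed as octal
    echo $(( 10#$h * 60 + 10#$m ))
  }
  to_minutes 8:00:00   # 480 minutes (8 hours)
  to_minutes 0:8:00    # 8 minutes
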