Common Slurm Scheduler Commands

Below are common commands used when working with the Slurm job scheduler on the FBRI clusters.

sacct	used to report job or job step accounting information about active or completed jobs.
salloc	used to allocate resources for a job in real time. Typically this is used to allocate resources and spawn a shell. The shell is then used to execute srun commands to launch parallel tasks.
sattach	used to attach standard input, output, and error plus signal capabilities to a currently running job or job step. One can attach to and detach from jobs multiple times.
sbatch	used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.
sbcast	used to transfer a file from local disk to local disk on the nodes allocated to a job. This can be used to effectively use diskless compute nodes or provide improved performance relative to a shared file system.
scancel	used to cancel a pending or running job or job step. It can also be used to send an arbitrary signal to all processes associated with a running job or job step.
scontrol	administrative tool used to view and/or modify Slurm state. Note that many scontrol commands can only be executed as user root.
sinfo	reports the state of partitions and nodes managed by Slurm. It has a wide variety of filtering, sorting, and formatting options.
smap	reports state information for jobs, partitions, and nodes managed by Slurm, but graphically displays the information to reflect network topology.
squeue	reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.
srun	used to submit a job for execution or initiate job steps in real time. srun has a wide variety of options to specify resource requirements, including: minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (amount of memory, disk space, required features, etc.). A job can contain multiple job steps executing sequentially or in parallel on independent or shared nodes within the job's node allocation.
strigger	used to set, get or view event triggers. Event triggers include things such as nodes going down or jobs approaching their time limit.
sview	graphical user interface to get and update state information for jobs, partitions, and nodes managed by Slurm.
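
As a rough illustration of how several of these commands fit together, the sketch below writes a small batch script, submits it with sbatch, and then checks on it with squeue and sacct. The file name hello_job.sh, the job name, and the resource values are assumptions chosen for the example, not FBRI defaults; partition and account options, if your cluster requires them, are left out. The script only attempts a submission when sbatch is actually available on the machine.

```shell
#!/bin/bash
# Write a minimal batch script. All names and resource values here are
# illustrative assumptions; adjust them for your own work.
cat > hello_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --output=hello_%j.out

# Each srun call inside a batch script launches one job step.
srun hostname
EOF

if command -v sbatch >/dev/null 2>&1; then
    # --parsable makes sbatch print only the job ID (optionally followed
    # by ";cluster-name"), which is easier to capture in a variable.
    if submit_out=$(sbatch --parsable hello_job.sh); then
        job_id=${submit_out%%;*}
        echo "Submitted job ${job_id}"

        # Show the job while it is pending or running.
        squeue -j "${job_id}"

        # Show accounting information for the job.
        sacct -j "${job_id}" --format=JobID,JobName,State,Elapsed

        # To cancel the job instead, uncomment the next line.
        # scancel "${job_id}"
    else
        echo "sbatch reported an error; the job was not submitted"
    fi
else
    echo "sbatch not found; run this on a cluster login node"
fi
```

The %j in the --output value is replaced by the job ID, so each submission writes to its own output file.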

Click the link below to download a Slurm command summary PDF.

SLURM-Summary.pdf


Jed Krisch