Karolina - Job Submission and Execution¶
Introduction¶
Slurm workload manager is used to allocate and access Karolina cluster's resources. This page describes Karolina cluster's specific Slurm settings and usage. General information about Slurm usage at IT4Innovations can be found at Slurm Job Submission and Execution.
Partition Information¶
Partitions/queues on the system:
$ sinfo -s
PARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST
qcpu* up 2-00:00:00 1/717/0/718 cn[001-718]
qcpu_biz up 2-00:00:00 1/717/0/718 cn[001-718]
qcpu_exp up 1:00:00 1/719/0/720 cn[001-720]
qcpu_free up 18:00:00 1/717/0/718 cn[001-718]
qcpu_long up 6-00:00:00 1/717/0/718 cn[001-718]
qcpu_preempt up 12:00:00 1/717/0/718 cn[001-718]
qgpu up 2-00:00:00 0/70/0/70 acn[01-70]
qgpu_big up 12:00:00 71/1/0/72 acn[01-72]
qgpu_biz up 2-00:00:00 0/70/0/70 acn[01-70]
qgpu_exp up 1:00:00 0/72/0/72 acn[01-72]
qgpu_free up 18:00:00 0/70/0/70 acn[01-70]
qgpu_preempt up 12:00:00 0/70/0/70 acn[01-70]
qfat up 2-00:00:00 0/1/0/1 sdf1
qviz up 8:00:00 0/2/0/2 viz[1-2]
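The properties of a particular partition (time limit, default resources, node list) can also be queried with the standard Slurm scontrol command, for example:
$ scontrol show partition qcpu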
For more information about Karolina's queues, see this page.
A graphical representation of cluster usage, partitions, nodes, and jobs can be found at https://extranet.it4i.cz/rsweb/karolina.
On the Karolina cluster:
- all CPU queues/partitions provide full node allocation: whole nodes (all node resources) are allocated to a job;
- other queues/partitions (gpu, fat, viz) provide partial node allocation: a job's resources (CPUs, memory) are separated from other jobs and dedicated to that job.
Partial node allocation and security
Division of nodes means that if two users allocate a portion of the same node, they can see each other's running processes. If this arrangement is a concern for you, consider allocating a whole node.
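For example, on the qgpu partition a whole node corresponds to all eight GPUs of one node, which (as described in Using GPU Queues below) can be requested with:
#SBATCH --gpus 8
#SBATCH --nodes 1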
Using CPU Queues¶
Access standard compute nodes.
Whole nodes are allocated. Use the --nodes option to specify the number of requested nodes. There is no need to specify the number of cores or the memory size.
#!/usr/bin/bash
#SBATCH --job-name MyJobName
#SBATCH --account PROJECT-ID
#SBATCH --partition qcpu
#SBATCH --time 12:00:00
#SBATCH --nodes 8
...
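Assuming the script above is saved as, for example, myjob.sh (the file name is arbitrary), it can be submitted and monitored with the standard Slurm commands:
$ sbatch myjob.sh
$ squeue --user $USER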
Using GPU Queues¶
Nodes per job limit
Because we are still fine-tuning and setting optimal Slurm parameters, we have temporarily limited the maximum number of nodes per job on the qgpu and qgpu_biz partitions to 16.
Access GPU accelerated nodes. Every GPU accelerated node is divided into eight parts; each part contains one GPU, 16 CPU cores, and the corresponding memory. By default, only one part, i.e. 1/8 of the node (one GPU with the corresponding CPU cores and memory), is allocated. There is no need to specify the number of cores or the memory size; on the contrary, doing so is undesirable. Some restrictions are in place to ensure a fair division and efficient use of node resources.
#!/usr/bin/bash
#SBATCH --job-name MyJobName
#SBATCH --account PROJECT-ID
#SBATCH --partition qgpu
#SBATCH --time 12:00:00
...
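Within the job, only the allocated part of the node is visible to your processes. As a quick check, you can for example list the allocated GPU(s) from the job script (this assumes the NVIDIA tools are on the default path of the accelerated nodes):
# list only the GPU(s) allocated to this job
nvidia-smi -L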
To allocate more GPUs, use the --gpus option. The default behavior is to allocate enough nodes to satisfy the resources requested via the --gpus option without delaying the start of the job.
The following code requests four GPUs; the scheduler can allocate from one up to four nodes, depending on the actual cluster state (i.e. GPU availability), to fulfil the request.
#SBATCH --gpus 4
The following code requests 16 GPUs; the scheduler can allocate from two up to sixteen nodes, depending on the actual cluster state (i.e. GPU availability), to fulfil the request.
#SBATCH --gpus 16
To allocate GPUs within one node, you have to specify the --nodes option.
The following code requests four GPUs on exactly one node.
#SBATCH --gpus 4
#SBATCH --nodes 1
The following code requests 16 GPUs on exactly two nodes.
#SBATCH --gpus 16
#SBATCH --nodes 2
Alternatively, you can use the --gpus-per-node option. For multi-node allocations, only the value 8 is allowed, to prevent fragmenting nodes.
The following code requests 16 GPUs on exactly two nodes.
#SBATCH --gpus-per-node 8
#SBATCH --nodes 2
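Putting the options together, a complete batch script requesting two whole GPU nodes (16 GPUs) might look like this (a sketch based on the directives above; job name, project ID, and time are placeholders):
#!/usr/bin/bash
#SBATCH --job-name MyJobName
#SBATCH --account PROJECT-ID
#SBATCH --partition qgpu
#SBATCH --time 12:00:00
#SBATCH --nodes 2
#SBATCH --gpus-per-node 8
...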
Using Fat Queue¶
Access the data analytics (aka fat) node. The fat node is divided into 32 parts; each part contains one socket/processor (24 cores) and the corresponding memory. By default, only one part, i.e. 1/32 of the node (one processor and the corresponding memory), is allocated.
To allocate the requested amount of memory, use the --mem option; the corresponding CPUs will be allocated as well. The fat node has about 22.5 TB of memory available for jobs.
#!/usr/bin/bash
#SBATCH --job-name MyJobName
#SBATCH --account PROJECT-ID
#SBATCH --partition qfat
#SBATCH --time 2:00:00
#SBATCH --mem 6TB
...
You can also specify CPU-oriented options (like --cpus-per-task); an appropriate amount of memory will then be allocated to the job.
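For example, requesting 48 cores (two of the 24-core processors) allocates the memory of the corresponding parts along with them (the value below is only illustrative):
#SBATCH --partition qfat
#SBATCH --cpus-per-task 48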
To allocate a whole fat node, use the --exclusive option:
#SBATCH --exclusive
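An exclusive fat node job then has all 32 processors (768 cores) and the full ~22.5 TB of memory at its disposal; for example (job name, project ID, and time are placeholders):
#!/usr/bin/bash
#SBATCH --job-name MyJobName
#SBATCH --account PROJECT-ID
#SBATCH --partition qfat
#SBATCH --time 2:00:00
#SBATCH --exclusive
...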
Using Viz Queue¶
Access visualization nodes. Every visualization node is divided into eight parts. By default, only one part, i.e. 1/8 of the node, is allocated.
$ salloc -A PROJECT-ID -p qviz
To allocate a whole visualization node, use the --exclusive option:
$ salloc -A PROJECT-ID -p qviz --exclusive
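salloc grants the allocation and starts a shell; the node assigned to the job can be found, for example, in the SLURM_JOB_NODELIST environment variable:
$ echo $SLURM_JOB_NODELIST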