Karolina - Job Submission and Execution

Introduction

Slurm workload manager is used to allocate and access Karolina cluster's resources. This page describes Karolina cluster's specific Slurm settings and usage. General information about Slurm usage at IT4Innovations can be found at Slurm Job Submission and Execution.

Partition Information

Partitions/queues on the system:

$ sinfo -s
PARTITION    AVAIL  TIMELIMIT   NODES(A/I/O/T) NODELIST
qcpu*           up 2-00:00:00      1/717/0/718 cn[001-718]
qcpu_biz        up 2-00:00:00      1/717/0/718 cn[001-718]
qcpu_exp        up    1:00:00      1/719/0/720 cn[001-720]
qcpu_free       up   18:00:00      1/717/0/718 cn[001-718]
qcpu_long       up 6-00:00:00      1/717/0/718 cn[001-718]
qcpu_preempt    up   12:00:00      1/717/0/718 cn[001-718]
qgpu            up 2-00:00:00        0/70/0/70 acn[01-70]
qgpu_big        up   12:00:00        71/1/0/72 acn[01-72]
qgpu_biz        up 2-00:00:00        0/70/0/70 acn[01-70]
qgpu_exp        up    1:00:00        0/72/0/72 acn[01-72]
qgpu_free       up   18:00:00        0/70/0/70 acn[01-70]
qgpu_preempt    up   12:00:00        0/70/0/70 acn[01-70]
qfat            up 2-00:00:00          0/1/0/1 sdf1
qviz            up    8:00:00          0/2/0/2 viz[1-2]
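
Details of a particular partition, such as its time limit and node list, can be inspected with scontrol; for example, for the qcpu partition:

$ scontrol show partition qcpu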

For more information about Karolina's queues, see this page.

A graphical representation of cluster usage, partitions, nodes, and jobs can be found at https://extranet.it4i.cz/rsweb/karolina

On the Karolina cluster:

  • all CPU queues/partitions provide full node allocation; whole nodes (all node resources) are allocated to a job.
  • other queues/partitions (gpu, fat, viz) provide partial node allocation; a job's resources (CPU, memory) are separated from other jobs and dedicated to that job.

Partial node allocation and security

Partial node allocation means that if two users allocate portions of the same node, they can see each other's running processes. If this is a concern for you, consider allocating a whole node, as sketched below.
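
For example, on the qgpu partition a whole node corresponds to all eight GPUs of a single node (see Using GPU Queues below); a minimal sketch:

#SBATCH --partition qgpu
#SBATCH --gpus 8
#SBATCH --nodes 1

On the qfat and qviz partitions, a whole node can be requested with the --exclusive option, as shown in the respective sections below.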

Using CPU Queues

Access standard compute nodes. Whole nodes are allocated, so use the --nodes option to specify the number of requested nodes. There is no need to specify the number of cores or the memory size.

#!/usr/bin/bash
#SBATCH --job-name MyJobName
#SBATCH --account PROJECT-ID
#SBATCH --partition qcpu
#SBATCH --time 12:00:00
#SBATCH --nodes 8
...
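
The script can then be submitted with sbatch and the job monitored with squeue; myjob.sh is a placeholder name for the script above:

$ sbatch myjob.sh
$ squeue -u $USER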

Using GPU Queues

Nodes per job limit

Because we are still in the process of fine-tuning and setting optimal parameters for Slurm, we have temporarily limited the maximum number of nodes per job on the qgpu and qgpu_biz partitions to 16.

Access GPU accelerated nodes. Every GPU accelerated node is divided into eight parts; each part contains one GPU, 16 CPU cores, and the corresponding memory. By default, only one part, i.e. 1/8 of the node (one GPU with its CPU cores and memory), is allocated. There is no need to specify the number of cores or the memory size; in fact, doing so is undesirable. Restrictions are in place to ensure fair division and efficient use of node resources.

#!/usr/bin/bash
#SBATCH --job-name MyJobName
#SBATCH --account PROJECT-ID
#SBATCH --partition qgpu
#SBATCH --time 12:00:00
...

To allocate more GPUs, use the --gpus option. By default, the scheduler allocates enough nodes to satisfy the resources requested by the --gpus option without delaying the start of the job.

The following code requests four GPUs; the scheduler may allocate anywhere from one to four nodes, depending on the actual cluster state (i.e. GPU availability), to fulfil the request.

#SBATCH --gpus 4

The following code requests 16 GPUs; the scheduler may allocate anywhere from two to sixteen nodes, depending on the actual cluster state (i.e. GPU availability), to fulfil the request.

#SBATCH --gpus 16

To allocate GPUs within a given number of nodes, also specify the --nodes option.

The following code requests four GPUs on exactly one node.

#SBATCH --gpus 4
#SBATCH --nodes 1

The following code requests 16 GPUs on exactly two nodes.

#SBATCH --gpus 16
#SBATCH --nodes 2

Alternatively, you can use the --gpus-per-node option. For multi-node allocations, only the value 8 is allowed, to prevent fragmenting nodes.

The following code requests 16 GPUs on exactly two nodes.

#SBATCH --gpus-per-node 8
#SBATCH --nodes 2

Using Fat Queue

Access the data analytics (fat) node. The fat node is divided into 32 parts; each part contains one socket/processor (24 cores) and the corresponding memory. By default, only one part, i.e. 1/32 of the node (one processor and its memory), is allocated.

To allocate the requested amount of memory, use the --mem option; the corresponding CPUs will be allocated automatically. The fat node has about 22.5 TB of memory available for jobs.

#!/usr/bin/bash
#SBATCH --job-name MyJobName
#SBATCH --account PROJECT-ID
#SBATCH --partition qfat
#SBATCH --time 2:00:00
#SBATCH --mem 6TB
...

You can also specify CPU-oriented options (such as --cpus-per-task); the appropriate amount of memory will then be allocated to the job, as in the sketch below.
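
A minimal sketch, assuming that requesting 48 CPUs per task (two of the 24-core processors) also grants the memory belonging to those two node parts:

#!/usr/bin/bash
#SBATCH --job-name MyJobName
#SBATCH --account PROJECT-ID
#SBATCH --partition qfat
#SBATCH --time 2:00:00
# two processors' worth of cores; memory follows the CPU request
#SBATCH --cpus-per-task 48
...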

To allocate a whole fat node, use the --exclusive option:

#SBATCH --exclusive

Using Viz Queue

Access visualization nodes. Every visualization node is divided into eight parts. By default, only one part, i.e. 1/8 of the node, is allocated.

$ salloc -A PROJECT-ID -p qviz

To allocate a whole visualization node, use the --exclusive option:

$ salloc -A PROJECT-ID -p qviz --exclusive