
Allocation of vnodes on qgpu

Introduction

The qgpu queue on Karolina takes advantage of the division of nodes into vnodes. Each accelerated node, equipped with two 64-core processors and eight GPU cards, is treated as eight vnodes, each containing 16 CPU cores and 1 GPU card. Vnodes can be allocated to jobs individually: by defining the resource list precisely at job submission, you can allocate as many vnodes (and therefore GPU cards) as you need.

Vnodes and Security

The division of nodes into vnodes was implemented to be as secure as possible, but it is still a "multi-user mode": if two users allocate portions of the same node, they can see each other's running processes. If this is unacceptable for your workload, consider allocating a whole node.

Selection Statement and Chunks

Requested resources are specified using a selection statement:

-l select=[<N>:]<chunk>[+[<N>:]<chunk> ...]

N specifies the number of chunks; if not specified, N = 1.
chunk declares the value of each resource in a set of resources that are allocated to a job as a unit.

  • A chunk is seen by MPI as one node.
  • Multiple chunks are seen as multiple nodes.
  • The maximum chunk size is equal to the size of a full physical node (8 GPU cards, 128 cores).

The default chunk for the qgpu queue is configured to contain 1 GPU card and 16 CPU cores, i.e. ncpus=16:ngpus=1.

  • ncpus specifies the number of CPU cores
  • ngpus specifies the number of GPU cards
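To get a feel for how a selection statement translates into resources, the following sketch estimates the totals requested by -l select=<N>:ngpus=<G>. It assumes that 16 CPU cores accompany every GPU card, as in the default chunk; the function name qgpu_totals is hypothetical and not part of PBS.

```shell
# Hypothetical helper (not part of PBS): estimate the totals requested by
# "-l select=<N>:ngpus=<G>" on qgpu.
# Assumption: 16 CPU cores accompany each GPU card, as in the default chunk.
qgpu_totals() {
    chunks=${1:-1}   # N, defaults to 1 as in the selection statement
    ngpus=${2:-1}    # GPU cards per chunk, defaults to 1 on qgpu
    gpus=$((chunks * ngpus))
    echo "gpus=${gpus} cores=$((gpus * 16))"
}

qgpu_totals 1 1   # the default chunk
qgpu_totals 2 4   # two chunks with four GPU cards each
```

The second call corresponds to one full physical node's worth of resources (8 GPU cards, 128 cores), split into two chunks.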

Allocating Single GPU

A single GPU can be allocated in an interactive session using

qsub -q qgpu -A OPEN-00-00 -l select=1 -I

or simply

qsub -q qgpu -A OPEN-00-00 -I

In this case, the ngpus parameter is optional, since it defaults to 1. You can verify your allocation either in PBS using the qstat command, or by checking the allocated GPU cards listed in the CUDA_VISIBLE_DEVICES variable:

$ qstat -F json -f $PBS_JOBID | grep exec_vnode
    "exec_vnode":"(acn53[0]:ncpus=16:ngpus=1)"

$ echo $CUDA_VISIBLE_DEVICES
GPU-8772c06c-0e5e-9f87-8a41-30f1a70baa00

The output shows that you have been allocated vnode acn53[0].
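If you only need the number of allocated GPU cards rather than their identifiers, the comma-separated list can be counted. The snippet below is a minimal sketch; count_gpus is a hypothetical helper name, and it assumes the variable holds a plain comma-separated list as in the output above.

```shell
# Hypothetical helper: count the entries in a comma-separated device list
# such as the value of CUDA_VISIBLE_DEVICES.
count_gpus() {
    list=$1
    if [ -z "$list" ]; then
        echo 0
        return 0
    fi
    # Put one entry per line, then count the lines.
    printf '%s\n' "$list" | tr ',' '\n' | wc -l | tr -d ' '
}

# Inside a job this prints the number of GPU cards visible to the session.
count_gpus "${CUDA_VISIBLE_DEVICES:-}"
```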

Allocating Single Accelerated Node

Security tip

Allocating a whole node prevents other users from seeing your running processes.

A single accelerated node can be allocated in an interactive session using

qsub -q qgpu -A OPEN-00-00 -l select=8 -I

Setting select=8 automatically allocates a whole accelerated node and sets mpiprocs. For N full nodes, set select to N × 8. Note, however, that your job may wait in the queue for some time if the required number of full nodes is not available.
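The N × 8 rule can be sketched with shell arithmetic. The helper below only prints the command instead of submitting it; full_node_qsub is a hypothetical name, and OPEN-00-00 is the placeholder project ID used throughout this page.

```shell
# Hypothetical helper: print the qsub command for N whole accelerated nodes.
# Each accelerated node consists of 8 vnodes, so select = N * 8.
full_node_qsub() {
    nodes=$1
    echo "qsub -q qgpu -A OPEN-00-00 -l select=$((nodes * 8)) -I"
}

full_node_qsub 1
full_node_qsub 2
```

Printing the command first makes it easy to check the select value before submitting the job.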

Allocating Multiple GPUs

Security risk

If two users allocate a portion of the same node, they can see each other's running processes. When required for security reasons, consider allocating a whole node.

Again, the following examples use only the selection statement, so no additional settings are required.

qsub -q qgpu -A OPEN-00-00 -l select=2 -I

In this example, two chunks are allocated on the same node if possible.

qsub -q qgpu -A OPEN-00-00 -l select=16 -I

This example allocates two whole accelerated nodes.

Multiple vnodes within the same chunk can be allocated using the ngpus parameter. For example, to allocate 2 vnodes in interactive mode, run

qsub -q qgpu -A OPEN-00-00 -l select=1:ngpus=2:mpiprocs=2 -I

Remember to set mpiprocs equal to ngpus so that a matching number of MPI processes is spawned.

To verify the allocation:

$ qstat -F json -f $PBS_JOBID | grep exec_vnode
    "exec_vnode":"(acn53[0]:ncpus=16:ngpus=1+acn53[1]:ncpus=16:ngpus=1)"

$ echo $CUDA_VISIBLE_DEVICES | tr ',' '\n'
GPU-8772c06c-0e5e-9f87-8a41-30f1a70baa00
GPU-5e88c15c-e331-a1e4-c80c-ceb3f49c300e
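One way to avoid a mismatch between ngpus and mpiprocs is to generate the selection statement from a single value. This is only a sketch; select_statement is a hypothetical helper name, not a PBS command.

```shell
# Hypothetical helper: build a selection statement in which mpiprocs
# always matches ngpus.
select_statement() {
    chunks=$1   # number of chunks (N)
    ngpus=$2    # GPU cards per chunk
    echo "select=${chunks}:ngpus=${ngpus}:mpiprocs=${ngpus}"
}

# Print the command for one chunk with two GPU cards.
echo "qsub -q qgpu -A OPEN-00-00 -l $(select_statement 1 2) -I"
```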

The number of chunks to allocate is specified in the select parameter. For example, to allocate 2 chunks, each with 4 GPUs, run

qsub -q qgpu -A OPEN-00-00 -l select=2:ngpus=4:mpiprocs=4 -I

To verify the allocation:

$ cat > print-cuda-devices.sh <<EOF
#!/bin/bash
echo \$CUDA_VISIBLE_DEVICES
EOF

$ chmod +x print-cuda-devices.sh
$ ml OpenMPI/4.1.4-GCC-11.3.0
$ mpirun ./print-cuda-devices.sh | tr ',' '\n' | sort | uniq
GPU-0910c544-aef7-eab8-f49e-f90d4d9b7560
GPU-1422a1c6-15b4-7b23-dd58-af3a233cda51
GPU-3dbf6187-9833-b50b-b536-a83e18688cff
GPU-3dd0ae4b-e196-7c77-146d-ae16368152d0
GPU-93edfee0-4cfa-3f82-18a1-1e5f93e614b9
GPU-9c8143a6-274d-d9fc-e793-a7833adde729
GPU-ad06ab8b-99cd-e1eb-6f40-d0f9694601c0
GPU-dc0bc3d6-e300-a80a-79d9-3e5373cb84c9
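Instead of counting the lines by eye, the check can be automated. The sketch below reads device lists from standard input and compares the number of unique GPU identifiers with the expected total (chunks × ngpus); check_gpu_count is a hypothetical helper name, and the sample input uses made-up identifiers.

```shell
# Hypothetical helper: compare the number of unique GPU IDs read from
# standard input with the expected total.
check_gpu_count() {
    expected=$1
    # Split comma-separated lists, deduplicate, and count non-empty lines.
    actual=$(tr ',' '\n' | sort -u | grep -c . || true)
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: ${actual} unique GPUs"
    else
        echo "MISMATCH: expected ${expected}, found ${actual}"
        return 1
    fi
}

# Sample input with made-up IDs: two processes reporting the same two cards.
printf 'GPU-a,GPU-b\nGPU-a,GPU-b\n' | check_gpu_count 2
```

Inside the job from the example above, the equivalent call would be mpirun ./print-cuda-devices.sh | check_gpu_count 8.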
