Capacity Computing

Introduction

In many cases, it is useful to submit a large number (>100) of computational jobs to the Slurm queue system. Running many small jobs is one of the most effective ways to execute embarrassingly parallel calculations, achieving the best runtime, throughput, and computer utilization.

However, executing a huge number of jobs via the Slurm queue may strain the system. This strain may result in slow response to commands, inefficient scheduling, and overall degradation of performance and user experience for all users.

Note

If you wish to schedule more than 100 jobs at a time, follow one of the procedures below.

You can use HyperQueue when running a huge number of jobs. HyperQueue can help efficiently load balance a large number of jobs amongst available computing nodes.
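
To make the idea of load balancing concrete, the following is a minimal conceptual sketch in Python. It is not HyperQueue's actual algorithm or API. The function name `balance_tasks`, the task durations, and the worker count are all hypothetical and chosen only for illustration. It assumes task durations are known in advance and assigns each task to whichever worker currently has the least accumulated work.

```python
import heapq


def balance_tasks(durations, n_workers):
    """Greedy sketch: give each task to the currently least-loaded worker.

    `durations` is a list of estimated task run times (hypothetical values),
    and `n_workers` is the number of available workers (e.g. compute nodes).
    Returns a list with one list of task indices per worker.
    """
    # Min-heap of (accumulated load, worker index); the least-loaded worker pops first.
    heap = [(0, worker) for worker in range(n_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_workers)]

    # Placing the longest tasks first usually yields a more even final load.
    order = sorted(range(len(durations)), key=lambda i: -durations[i])
    for task_id in order:
        load, worker = heapq.heappop(heap)
        assignment[worker].append(task_id)
        heapq.heappush(heap, (load + durations[task_id], worker))
    return assignment


if __name__ == "__main__":
    # 1000 hypothetical tasks with made-up durations (arbitrary units).
    durations = [1 + (i * 7) % 10 for i in range(1000)]
    plan = balance_tasks(durations, n_workers=4)
    for worker, tasks in enumerate(plan):
        total = sum(durations[t] for t in tasks)
        print(f"worker {worker}: {len(tasks)} tasks, total load {total}")
```

In this sketch, the scheduler sees only a handful of workers rather than a thousand individual jobs, which is the general reason a task-level tool can reduce strain on the queue. A real tool typically balances dynamically as tasks finish, without needing duration estimates.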