Resources Allocation Policy
Job Queue Policies
Resources are allocated to jobs in a fair-share fashion, subject to constraints set by the queue and the resources available to the Project. The fair-share policy ensures that individual users may consume approximately equal amounts of resources per week. Detailed information is in the Job scheduling section. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following table provides an overview of the queue partitioning:
Check the queue status at https://extranet.it4i.cz/rsweb/salomon/
|queue||active project||project resources||nodes||min ncpus||priority||authorization||walltime (default / max)|
|qexp Express queue||no||none required||32 nodes, max 8 per user||24||150||no||1 / 1h|
|qprod Production queue||yes||> 0||1006 nodes, max 86 per job||24||0||no||24 / 48h|
|qlong Long queue||yes||> 0||256 nodes, max 40 per job, only non-accelerated nodes allowed||24||0||no||72 / 144h|
|qmpp Massive parallel queue||yes||> 0||1006 nodes||24||0||yes||2 / 4h|
|qfat UV2000 queue||yes||> 0||1 (uv1)||8||0||yes||24 / 48h|
|qfree Free resource queue||yes||none required||752 nodes, max 86 per job||24||-1024||no||12 / 12h|
|qviz Visualization queue||yes||none required||2 (with NVIDIA Quadro K5000)||4||150||no||1 / 8h|
The qfree queue is not free of charge. Normal accounting applies. However, it allows for utilization of free resources once a Project has exhausted all its allocated computational resources. This does not apply to Director's Discretion (DD) projects by default, but may be allowed upon request.
- qexp, the Express queue: This queue is dedicated to testing and running very small jobs. It is not required to specify a project to enter qexp. There are 2 nodes always reserved for this queue (without accelerator); a maximum of 8 nodes are available via qexp for a particular user. The nodes may be allocated on a per-core basis. No special authorization is required to use it. The maximum runtime in qexp is 1 hour.
- qprod, the Production queue: This queue is intended for normal production runs. An active project with nonzero remaining resources must be specified to enter qprod. All nodes may be accessed via the qprod queue, but only 86 per job. Full nodes, 24 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprod is 48 hours.
- qlong, the Long queue: This queue is intended for long production runs. An active project with nonzero remaining resources must be specified to enter qlong. Only 336 nodes without acceleration may be accessed via the qlong queue. Full nodes, 24 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qlong is 144 hours (three times the standard qprod time, 3 * 48 h).
- qmpp, the Massive parallel queue: This queue is intended for massively parallel runs. An active project with nonzero remaining resources must be specified to enter qmpp. All nodes may be accessed via the qmpp queue. Full nodes, 24 cores per node, are allocated. The queue runs with medium priority. The maximum runtime in qmpp is 4 hours. A PI needs to explicitly ask support for authorization to enter the queue for all users associated with her/his Project.
- qfat, the UV2000 queue: This queue is dedicated to accessing the fat SGI UV2000 SMP machine. The machine (uv1) has 112 Intel IvyBridge cores at 3.3 GHz and 3.25 TB RAM (8 cores and 128 GB RAM are dedicated to the system). A PI needs to explicitly ask support for authorization to enter the queue for all users associated with her/his Project.
- qfree, the Free resource queue: The qfree queue is intended for utilization of free resources after a Project has exhausted all its allocated computational resources. This does not apply to DD projects by default; DD projects have to request permission to use qfree after exhausting their computational resources. An active project must be specified to enter the queue, but no remaining resources are required. Consumed resources will be accounted to the Project. Only 178 nodes without accelerator may be accessed from this queue. Full nodes, 24 cores per node, are allocated. The queue runs with very low priority and no special authorization is required to use it. The maximum runtime in qfree is 12 hours.
- qviz, the Visualization queue: Intended for pre-/post-processing using OpenGL accelerated graphics. Currently, when accessing the node, each user gets 4 CPU cores allocated, and thus approximately 73 GB of RAM and 1/7 of the GPU capacity (the default "chunk"). If more GPU power or RAM is required, it is recommended to allocate more chunks (with 4 cores each), up to one whole node per user, so that all 28 cores, 512 GB RAM, and the whole GPU are exclusive. This is currently also the maximum allowed allocation per user. One hour of work is allocated by default; the user may ask for 2 hours maximum.
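The qviz chunk arithmetic can be illustrated with a short Python sketch. It is not part of the cluster tooling: the function name `qviz_allocation` is hypothetical, and it assumes that RAM and GPU share scale linearly with the number of chunks, which the text states only for one chunk and for a whole node.

```python
from fractions import Fraction

NODE_CORES = 28      # cores on a visualization node, as stated in the qviz description
NODE_RAM_GB = 512    # RAM on a visualization node, as stated in the qviz description
CORES_PER_CHUNK = 4  # size of the default "chunk"


def qviz_allocation(chunks):
    """Estimate cores, RAM and GPU share for a number of qviz chunks.

    Assumption: RAM and GPU share grow linearly with the chunk count.
    """
    max_chunks = NODE_CORES // CORES_PER_CHUNK  # 7 chunks fill one node
    if not 1 <= chunks <= max_chunks:
        raise ValueError(f"chunks must be between 1 and {max_chunks}")
    share = Fraction(chunks, max_chunks)
    return {
        "cores": chunks * CORES_PER_CHUNK,
        "ram_gb": float(NODE_RAM_GB * share),
        "gpu_share": share,
    }


one = qviz_allocation(1)
print(one["cores"], round(one["ram_gb"]), one["gpu_share"])  # 4 73 1/7
```

Requesting 7 chunks in this sketch corresponds to the whole node: 28 cores, 512 GB RAM, and the full GPU.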
To access a node with a Xeon Phi co-processor, the user needs to specify that in the job submission select statement.
The default job wall-clock time is the first value in the walltime column of the table above (half the maximum time for qprod, qlong, qmpp, and qfat). Longer wall time limits can be set manually, see examples.
Jobs that exceed the reserved wall-clock time (Req'd Time) get killed automatically. The wall-clock time limit can be changed for queued jobs (state Q) using the qalter command, but it cannot be changed for a running job (state R).
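The default and maximum wall-clock limits can be captured in a small lookup. The following is a minimal Python sketch rather than any official tool: the names `WALLTIME_HOURS` and `effective_walltime` are hypothetical, and the values are copied from the walltime column of the queue table above.

```python
# Hypothetical helper; values copied from the walltime column of the
# queue table, as (default, maximum) in hours.
WALLTIME_HOURS = {
    "qexp": (1, 1),
    "qprod": (24, 48),
    "qlong": (72, 144),
    "qmpp": (2, 4),
    "qfat": (24, 48),
    "qfree": (12, 12),
    "qviz": (1, 8),
}


def effective_walltime(queue, requested_hours=None):
    """Return the wall-clock limit (in hours) a job would run under.

    Without an explicit request the queue default applies; a request
    above the queue maximum is rejected with ValueError.
    """
    default, maximum = WALLTIME_HOURS[queue]
    if requested_hours is None:
        return default
    if requested_hours <= 0:
        raise ValueError("walltime must be positive")
    if requested_hours > maximum:
        raise ValueError(
            f"{queue} allows at most {maximum} h, got {requested_hours} h"
        )
    return requested_hours


print(effective_walltime("qprod"))       # 24
print(effective_walltime("qlong", 100))  # 100
```

A check like this before submission helps avoid requesting a limit the queue will not accept; a job that runs past its reserved time is killed regardless.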
Salomon users may check current queue configuration at https://extranet.it4i.cz/rsweb/salomon/queues.
Check the status of jobs, queues and compute nodes at https://extranet.it4i.cz/rsweb/salomon/
Display the queue status on Salomon:
$ qstat -q
The PBS allocation overview may also be obtained using the rspbs command.
$ rspbs
Usage: rspbs [options]

Options:
  --version                   show program's version number and exit
  -h, --help                  show this help message and exit
  --get-server-details        Print server
  --get-queues                Print queues
  --get-queues-details        Print queues details
  --get-reservations          Print reservations
  --get-reservations-details  Print reservations details
  --get-nodes                 Print nodes of PBS complex
  --get-nodeset               Print nodeset of PBS complex
  --get-nodes-details         Print nodes details
  --get-jobs                  Print jobs
  --get-jobs-details          Print jobs details
  --get-jobs-check-params     Print jobid, job state, session_id, user, nodes
  --get-users                 Print users of jobs
  --get-allocated-nodes       Print allocated nodes of jobs
  --get-allocated-nodeset     Print allocated nodeset of jobs
  --get-node-users            Print node users
  --get-node-jobs             Print node jobs
  --get-node-ncpus            Print number of ncpus per node
  --get-node-allocated-ncpus  Print number of allocated ncpus per node
  --get-node-qlist            Print node qlist
  --get-node-ibswitch         Print node ibswitch
  --get-user-nodes            Print user nodes
  --get-user-nodeset          Print user nodeset
  --get-user-jobs             Print user jobs
  --get-user-jobc             Print number of jobs per user
  --get-user-nodec            Print number of allocated nodes per user
  --get-user-ncpus            Print number of allocated ncpus per user
  --get-qlist-nodes           Print qlist nodes
  --get-qlist-nodeset         Print qlist nodeset
  --get-ibswitch-nodes        Print ibswitch nodes
  --get-ibswitch-nodeset      Print ibswitch nodeset
  --summary                   Print summary
  --get-node-ncpu-chart       Obsolete. Print chart of allocated ncpus per node
  --server=SERVER             Use given PBS server
  --state=STATE               Only for given job state
  --jobid=JOBID               Only for given job ID
  --user=USER                 Only for given user
  --node=NODE                 Only for given node
  --nodestate=NODESTATE       Only for given node state (affects only --get-node* --get-qlist-* --get-ibswitch-* actions)
  --incl-finished             Include finished jobs
Resource Accounting Policy
Wall-Clock Core-Hours WCH
The wall-clock core-hours (WCH) are the basic metric of computer utilization time. 1 wall-clock core-hour is defined as 1 processor core allocated for 1 hour of wall-clock time. Allocating a full node (16 cores Anselm, 24 cores Salomon) for 1 hour amounts to 16 wall-clock core-hours (Anselm) or 24 wall-clock core-hours (Salomon).
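As a worked illustration of the definition, the following Python sketch computes WCH for whole-node allocations. The names `CORES_PER_NODE` and `wall_clock_core_hours` are hypothetical; the per-node core counts are the ones given in the paragraph above.

```python
# Cores per full node, as stated in the text above.
CORES_PER_NODE = {"Anselm": 16, "Salomon": 24}


def wall_clock_core_hours(system, nodes, hours):
    """WCH for `nodes` full nodes held for `hours` of wall-clock time."""
    if nodes < 0 or hours < 0:
        raise ValueError("nodes and hours must be non-negative")
    return CORES_PER_NODE[system] * nodes * hours


print(wall_clock_core_hours("Anselm", 1, 1))   # 16
print(wall_clock_core_hours("Salomon", 1, 1))  # 24
print(wall_clock_core_hours("Salomon", 4, 2))  # 192
```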
Normalized Core-Hours NCH
The resources subject to accounting are the normalized core-hours (NCH). The normalized core-hours are obtained from WCH by applying a normalization factor F:
\[ NCH = F \cdot WCH \]
All jobs are accounted in normalized core-hours, using factor F valid at the time of the execution:
|system||F||validity|
|Salomon||1.00||2017-09-11 to 2018-06-01|
|Anselm||0.65||2017-09-11 to 2018-06-01|
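The normalization can be sketched in a few lines of Python. This is an illustration only: the names `F_FACTORS` and `normalized_core_hours` are hypothetical, and the factors are the ones listed above, which the table gives as valid from 2017-09-11 to 2018-06-01.

```python
# Normalization factors F from the table above
# (listed as valid 2017-09-11 to 2018-06-01).
F_FACTORS = {"Salomon": 1.00, "Anselm": 0.65}


def normalized_core_hours(system, wch):
    """Apply NCH = F * WCH for the given system."""
    return F_FACTORS[system] * wch


# One full node for one hour: 24 WCH on Salomon, 16 WCH on Anselm.
print(f"{normalized_core_hours('Salomon', 24):.2f}")  # 24.00
print(f"{normalized_core_hours('Anselm', 16):.2f}")   # 10.40
```

With these factors, an hour on a full Anselm node is charged 10.4 NCH, compared with 24 NCH for a full Salomon node.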
The accounting runs whenever the computational cores are allocated via the PBS Pro workload manager (the qsub command), regardless of whether the cores are actually used for any calculation.
Allocations are requested and granted in normalized core-hours (NCH).
Whenever the term core-hour is used in this documentation, we mean the normalized core-hour, NCH.
The normalized core-hours were introduced to treat systems of different age on an equal footing. The normalized core-hour is an accounting tool to discount the legacy systems. Before 2017-09-11, all F factors were 1.0. In the future, the F factors will be updated as new systems are installed. The F factors are expected to only decrease over time.
See examples in the Job submission and execution section.
Check how many core-hours have been consumed. The command it4ifree is available on cluster login nodes.
$ it4ifree

Projects I am participating in
==============================
PID           Days left    Total    Used WCHs    Used NCHs    WCHs by me    NCHs by me     Free
----------  -----------  -------  -----------  -----------  ------------  ------------  -------
OPEN-XX-XX          323        0      5169947      5169947         50001         50001  1292555

Projects I am Primarily Investigating
=====================================
PID         Login         Used WCHs    Used NCHs
----------  ----------  -----------  -----------
OPEN-XX-XX  user1            376670       376670
            user2           4793277      4793277

Legend
======
WCH = Wall-clock Core Hour
NCH = Normalized Core Hour
The it4ifree command is part of the it4i.portal.clients package, available at https://pypi.python.org/pypi/it4i.portal.clients