CI/CD¶
Introduction¶
Continuous Integration (CI) is the practice of automatically executing a compilation script and a set of test cases to ensure that the integrated codebase is in a workable state. Integration is often followed by Continuous Benchmarking (CB), which evaluates the impact of a code change on application performance, and Continuous Deployment (CD), which distributes a new version of the developed code.
IT4I offers its users the possibility to set up CI for their projects and execute their dedicated CI jobs in a customizable virtual environment (Docker containers), or to test the code directly on computational nodes of the production HPC clusters (Karolina, Barbora) and the Complementary systems. The Complementary systems make it possible to run tests on emerging, non-traditional, and highly specialized hardware architectures. They consist of computational nodes built on Intel Sapphire Rapids + HBM, NVIDIA Grace CPU, IBM Power10, A64FX, and many more.
Continuous Integration Infrastructure Deployed at IT4I¶
IT4Innovations maintains a GitLab server (code.it4i.cz), which has built-in support for CI/CD. It provides a set of GitLab runners. A runner is an application that executes the jobs specified in a project's CI/CD pipeline. A pipeline consists of jobs grouped into stages. Stages run in sequence, while all jobs in a stage can run in parallel.
Detailed documentation about GitLab CI/CD is available here.
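As a rough illustration of how stages and jobs relate, the sketch below defines two stages. The job names and the make targets are hypothetical and not taken from any IT4I project, and the jobs omit the keywords needed to run on the HPC clusters. The single build job runs first; the two test jobs start only after it succeeds and may then run in parallel.

```yaml
# Stages run in the order listed here.
stages:
  - build
  - test

# Hypothetical job in the first stage.
compile:
  stage: build
  script:
    - make

# Two hypothetical jobs in the same stage; they can run in parallel.
unit-tests:
  stage: test
  script:
    - make test

integration-tests:
  stage: test
  script:
    - make integration-test
```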
Karolina, Barbora, and Complementary Systems¶
A unified solution lets all users execute their CI jobs on Karolina, Barbora, and the Complementary systems without the need to create their own project runners. For each of the HPC clusters, a GitLab instance runner has been deployed. The runners run on the login nodes, are visible to all projects on the IT4I GitLab server, and are shared by all users.
These runners use the Jacamar CI driver, an HPC-focused open-source CI/CD driver for GitLab runners. It allows a GitLab runner to interact directly with the job scheduler of a given cluster. One of the main benefits of this driver is its downscoping mechanism, which ensures that every command within a CI job is executed as the user who triggered the CI pipeline to which the job belongs.
For more information about the Jacamar CI driver, please visit the official documentation.
The execution of CI pipelines works as follows. First, a user of the IT4I GitLab server triggers a CI pipeline (for example, by pushing to a repository). Then, the jobs that make up the pipeline are sent to the corresponding runner running on the login node. Lastly, for every CI job, the runner clones the repository (or just fetches changes to an already cloned one, if there are any) and submits the job as a Slurm job to the corresponding HPC cluster using the sbatch command. After each job has executed, the runner reports the results back to the server and uploads the artifacts (if specified).
Note
The GitLab runners at Karolina and Barbora can submit (as Slurm jobs) and execute up to 32 CI jobs concurrently, while the runner at the Complementary systems can submit at most 16 jobs concurrently. Jobs above this limit are not submitted to the respective Slurm queue until a previous job has finished.
How to Set Up Continuous Integration for Your Project¶
To begin with, the CI pipeline of a project must be defined in a YAML file. The most common name of this file is .gitlab-ci.yml, and it should be located at the top level of the repository. For detailed information, see the tutorial on how to create your first pipeline. Additionally, the CI/CD YAML syntax reference lists all keywords that can be specified in the definitions of CI/CD pipelines and jobs.
Execution of CI Pipelines at the HPC Clusters¶
Every CI job in the project CI pipeline that is intended to be submitted as a Slurm job to one of the HPC clusters must have the following three keywords specified in its definition.
id_tokens, in which SITE_ID_TOKEN must be defined with aud set to the URL of the IT4I GitLab server.
id_tokens:
  SITE_ID_TOKEN:
    aud: https://code.it4i.cz/
tags, by which the appropriate runner for the CI job is selected. Exactly three tags must be specified in the tags clause of the CI job. Two of these are it4i and slurmjob. The third is the name of the target cluster: karolina, barbora, or compsys.
tags:
  - it4i
  - karolina  # or barbora, or compsys
  - slurmjob
variables, where the SCHEDULER_PARAMETERS variable must be specified. This variable should contain all the arguments that the developer wants to pass to the sbatch command when the CI job is submitted, such as project, queue, partition, etc. Some arguments are set automatically by the Jacamar CI driver: --wait, --job-name, and --output.
variables:
  SCHEDULER_PARAMETERS: '-A ... -p ... -N ...'
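Taken together, the three required keywords could be combined as in the following sketch. The job name, the project ID, the partition name, the node count, and the script commands are placeholders chosen for illustration, not values prescribed by IT4I; replace them with values valid for your project and the target cluster.

```yaml
# Hypothetical job; only id_tokens, tags, and variables follow the rules above.
build-and-test:
  id_tokens:
    SITE_ID_TOKEN:
      aud: https://code.it4i.cz/
  tags:
    - it4i
    - karolina
    - slurmjob
  variables:
    # Assumed values: PROJECT-ID and qcpu stand in for your own project and partition.
    SCHEDULER_PARAMETERS: '-A PROJECT-ID -p qcpu -N 1'
  script:
    # Placeholder commands executed inside the Slurm job.
    - make
    - make test
```

There is no need to add --wait, --job-name, or --output to SCHEDULER_PARAMETERS, since the Jacamar CI driver supplies them.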
Optionally, a custom build directory can also be specified. The deployed GitLab runners are configured to store all files and directories for the CI job in the home directory of the user who triggers the associated CI pipeline (the repository is also cloned there in a unique subpath). This behavior can be changed by specifying the CUSTOM_CI_BUILDS_DIR variable in the variables clause of the CI job.
variables:
  SCHEDULER_PARAMETERS: ...
  CUSTOM_CI_BUILDS_DIR: /path/to/custom/build/dir/
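When several jobs target different clusters, the shared settings do not have to be repeated in each job. One possible approach, sketched below, uses GitLab's hidden jobs and the extends keyword. The template name .it4i-slurm, the job names, the PROJECT-ID and PARTITION placeholders, and the make target are all hypothetical.

```yaml
# Hidden job (leading dot): it never runs itself and only serves as a template.
.it4i-slurm:
  id_tokens:
    SITE_ID_TOKEN:
      aud: https://code.it4i.cz/
  variables:
    # Placeholder values: substitute your own project ID and partition.
    SCHEDULER_PARAMETERS: '-A PROJECT-ID -p PARTITION -N 1'
    CUSTOM_CI_BUILDS_DIR: /path/to/custom/build/dir/

test-karolina:
  extends: .it4i-slurm
  tags: [it4i, karolina, slurmjob]
  script:
    - make test

test-barbora:
  extends: .it4i-slurm
  tags: [it4i, barbora, slurmjob]
  script:
    - make test
```

Each job lists all three tags itself because extends merges mappings such as variables but replaces arrays such as tags rather than combining them. If the clusters need different sbatch arguments, a job can override SCHEDULER_PARAMETERS in its own variables clause.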
A GitLab repository with examples of CI jobs can be found here.