Storage¶
The Karolina cluster provides two main shared filesystems, the HOME filesystem and the SCRATCH filesystem, and also has access to IT4Innovations' central PROJECT storage. All login and compute nodes access the same data on the shared filesystems. Compute nodes are also equipped with local (non-shared) scratch, RAM disk, and TMP filesystems.
Archiving¶
Shared filesystems should not be used as a backup for large amounts of data or for long-term data storage. The academic staff and students of research institutions in the Czech Republic can use the CESNET storage service, which is available via SSHFS.
HOME File System¶
The HOME filesystem is an HA cluster of two active-passive NFS servers. It contains users' home directories /home/username. Accessible capacity is 31 TB, shared among all users. Individual users are restricted by filesystem usage quotas set to 25 GB per user. Should 25 GB prove insufficient, contact support; the quota may be increased upon request.
Note
The HOME filesystem is intended for preparation, evaluation, processing and storage of data generated by active projects.
Files on the HOME filesystem are not deleted until the end of the user's lifecycle.
The filesystem is backed up, so that it can be restored in case of a catastrophic failure resulting in significant data loss. However, this backup is not intended to restore old versions of user data or to restore deleted files.
| HOME filesystem | |
| --- | --- |
| Mountpoint | /home/username |
| Capacity | 31 TB |
| Throughput | 1.93 GB/s write, 3.1 GB/s read |
| User space quota | 25 GB |
| User inodes quota | 500 k |
| Protocol | NFS |
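To check how much of the HOME quota is currently used, standard Linux tools are sufficient. A minimal sketch, assuming the NFS server exports quota information to the standard quota command (otherwise du still reports the space consumed):

```console
# Space consumed by your home directory
$ du -sh /home/$USER

# NFS quota usage, if the server exports quota information (human-readable sizes)
$ quota -s
```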
Configuration of the storage:
- 2x NFS server HPE ProLiant DL325 Gen10 Plus
    - 1x AMD EPYC 7302P (3.0GHz/16-core/155W)
    - 8x 16GB (1x16GB) Dual Rank x8 DDR4-3200 CAS-22-22-22
    - 2x 240GB SATA 6G Read Intensive SFF (2.5in) SC SSD (HW RAID1)
    - 1x Smart Array E208i-a SR Gen10 (No Cache) 12G SAS Modular LH Controller
    - 1x HPE SN1100Q 16Gb Dual Port Fibre Channel Host Bus Adapter
    - 1x Intel I350-T4 Ethernet 1Gb 4-port BASE-T OCP3 Adapter
    - ILO5
    - 1x InfiniBand HDR100/Ethernet 100Gb 2-port QSFP56 PCIe4 x16 MCX653106A-ECAT Adapter
    - 2x 500W Flex Slot Platinum Hot Plug Low Halogen Power Supply Kit
    - OS: Red Hat Enterprise Linux Server
- 1x Storage array HPE MSA 2060 16Gb Fibre Channel SFF Storage
    - 1x Base MSA 2060 SFF Storage Drive Enclosure
    - 22x MSA 1.92TB SAS 12G SFF (2.5in) M2 SSD
    - 1x MSA 16Gb Short Wave Fibre Channel SFP+ 4-pack Transceiver
    - Dual-controller, 4x 16Gb FC host interface
    - LAN connectivity 2x 1Gb/s
    - Redundant, hot-swap power supplies
SCRATCH File System¶
The SCRATCH filesystem is realized as a parallel Lustre filesystem. It is accessible via the InfiniBand network and is available from all login and compute nodes. Extended ACLs are provided on the Lustre filesystems for sharing data with other users using fine-grained control. For basic information about Lustre, see the Understanding the Lustre Filesystems subsection of Barbora's storage documentation.
The SCRATCH filesystem is mounted in the /scratch/project/PROJECT_ID directory, created automatically with the PROJECT_ID project. Accessible capacity is 1000 TB, shared among all users. Users are restricted by PROJECT quotas set to 20 TB. The purpose of this quota is to prevent runaway programs from filling the entire filesystem and denying service to other users. Should 20 TB prove insufficient, contact support; the quota may be increased upon request.
To find out current SCRATCH quotas, use:
```console
[usr0123@login1.karolina ~]$ getent group OPEN-XX-XX
open-xx-xx:*:1234:user1,...,usern
[usr0123@login1.karolina ~]$ lfs quota -p 1234 /scratch/
Disk quotas for prj 1234 (pid 1234):
     Filesystem      kbytes  quota        limit  grace    files  quota     limit  grace
      /scratch/ 14356700796      0  19531250000      -    82841      0  20000000      -
```
Note
The SCRATCH filesystem is intended for temporary scratch data generated during the calculation, as well as for high-performance access to input and output files. All I/O-intensive jobs must use the SCRATCH filesystem as their working directory.
Users are advised to save the necessary data from the SCRATCH filesystem to the HOME filesystem after the calculations and to clean up the scratch files.
Warning
Files on the SCRATCH filesystem that are not accessed for more than 90 days will be automatically deleted.
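A typical cleanup workflow is therefore to copy the needed results back to the HOME filesystem and then check which files are approaching the 90-day limit. A minimal sketch using standard tools; the project name OPEN-XX-XX and the results subdirectory are illustrative:

```console
# Copy results from scratch back to HOME (illustrative paths)
$ rsync -a /scratch/project/OPEN-XX-XX/results/ /home/$USER/results/

# List files not accessed for more than 60 days, i.e. candidates for the automatic cleanup
$ lfs find /scratch/project/OPEN-XX-XX -type f -atime +60
```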
| SCRATCH filesystem | |
| --- | --- |
| Mountpoint | /scratch |
| Capacity | 1361 TB |
| Throughput | 730.9 GB/s write, 1198.3 GB/s read |
| PROJECT quota | 20 TB |
| PROJECT inodes quota | 20 M |
| Default stripe size | 1 MB |
| Default stripe count | 1 |
| Protocol | Lustre |
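The default striping (1 MB stripe size, stripe count 1) works well for small and medium files; large files accessed in parallel may benefit from a higher stripe count. A minimal sketch using the standard Lustre tools, with an illustrative project path and striping values:

```console
# Show the current striping of a project directory (illustrative path)
$ lfs getstripe /scratch/project/OPEN-XX-XX

# New files created in this subdirectory will be striped across 8 OSTs with a 4 MB stripe size
$ mkdir /scratch/project/OPEN-XX-XX/big_io
$ lfs setstripe -c 8 -S 4M /scratch/project/OPEN-XX-XX/big_io
```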
Configuration of the storage:
- 1x SMU - ClusterStor 2U/24 System Management Unit Storage Controller
    - 5x Cray ClusterStor 1.6TB NVMe x4 Lanes Mixed Use SFF (2.5in) U.2 with Carrier
    - 2x Cray ClusterStor InfiniBand HDR/Ethernet 200Gb 1-port QSFP PCIe4 Adapter (Mellanox ConnectX-6)
- 1x MDU - ClusterStor 2U/24 Metadata Unit Storage Controller
    - 24x Cray ClusterStor 1.6TB NVMe x4 Lanes Mixed Use SFF (2.5in) U.2 with Carrier
    - 2x Cray ClusterStor InfiniBand HDR/Ethernet 200Gb 1-port QSFP PCIe4 Adapter (Mellanox ConnectX-6)
- 24x SSU-F - ClusterStor 2U24 Scalable Storage Unit Flash Storage Controller
    - 24x Cray ClusterStor 3.2TB NVMe x4 Lanes Mixed Use SFF (2.5in) U.2 with Carrier
    - 4x Cray ClusterStor InfiniBand HDR/Ethernet 200Gb 1-port QSFP PCIe4 Adapter (Mellanox ConnectX-6)
- 2x LMN - Aruba 6300M 48-port 1GbE
    - Aruba X371 12VDC 250W 100-240VAC Power-to-Port Power Supply
PROJECT File System¶
The PROJECT data storage is a central storage for projects' and users' data at IT4Innovations that is accessible from all clusters. For more information, see the PROJECT Data Storage section.
Disk Usage and Quota Commands¶
For more information about disk usage and user quotas, see the Barbora's storage section.
Extended ACLs¶
Extended ACLs provide another security mechanism besides the standard POSIX ACLs, which are defined by three entries (for owner/group/others). Extended ACLs have more than the three basic entries: they also contain a mask entry and may contain any number of named user and named group entries.
ACLs on a Lustre file system work exactly like ACLs on any Linux file system. They are manipulated with the standard tools in the standard manner.
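For example, read access to a directory can be granted to another project member with the standard setfacl and getfacl tools; the user name and path below are illustrative:

```console
# Grant the hypothetical user "user2" recursive read access to a results directory
$ setfacl -R -m u:user2:rX /scratch/project/OPEN-XX-XX/results

# Inspect the resulting entries, including the mask entry
$ getfacl /scratch/project/OPEN-XX-XX/results
```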
For more information, see the Access Control List section of the documentation.
Local Filesystems¶
TMP¶
Each node is equipped with a local /tmp directory with a few GB of capacity. The /tmp directory should be used for working with small temporary files. Old files in the /tmp directory are automatically purged.
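A common pattern is to create a uniquely named private subdirectory for a job's temporary files and to remove it before the job ends; a minimal sketch (the directory name template is illustrative):

```console
# Create a private temporary directory on the node-local /tmp
$ TMPDIR=$(mktemp -d /tmp/${USER}_job.XXXXXX)

# ... work with small temporary files in $TMPDIR ...

# Clean up before the job finishes; /tmp is local to each node
$ rm -rf "$TMPDIR"
```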
Summary¶
| Mountpoint | Usage | Protocol | Net Capacity | Throughput | Limitations | Access | Services |
| --- | --- | --- | --- | --- | --- | --- | --- |
| /home | home directory | NFS | 31 TB | 1.93 GB/s write, 3.1 GB/s read | Quota 25 GB | Compute and login nodes | backed up |
| /scratch | cluster shared jobs' data | Lustre | 1361 TB | 730.9 GB/s write, 1198.3 GB/s read | Quota 20 TB | Compute and login nodes | files older than 90 days removed |
| /tmp | local temporary files | local | ------ | ------- | none | Compute and login nodes | auto purged |