Prometheus exporter for performance metrics from Slurm.
created at Aug. 4, 2020, 8:41 p.m.
A Prometheus exporter for cgroup-level metrics.
created at April 23, 2021, 3:16 p.m.
Prometheus exporter for use with the Lustre parallel filesystem
created at Feb. 9, 2021, 10:45 a.m.
GoSlurmMailer - drop-in replacement for the default Slurm MailProg. Delivers Slurm job messages to various destinations.
created at May 16, 2022, 11:55 a.m.
Local filesystem registry for containers (intended for HPC) using Lmod or Environment Modules. Works for users and admins.
created at April 2, 2021, 8:52 p.m.
The source code for the openondemand.org website
created at Jan. 5, 2021, 9:53 p.m.
A coherent collection of Ansible roles for easily deploying clusters of nodes.
created at June 26, 2019, 8:18 p.m.
NVIDIA GPU metrics exporter for Prometheus leveraging DCGM
created at Aug. 11, 2021, 3:40 p.m.
A powerful Python framework for writing and running portable regression tests and benchmarks for HPC systems.
created at April 25, 2017, 4:43 p.m.