Scalable cluster administration Python framework — Manage node sets, node groups and execute commands on cluster nodes in parallel.
updated at May 16, 2024, 8:36 p.m.
A coherent Ansible roles collection to simply deploy clusters of nodes.
updated at May 16, 2024, 8:12 p.m.
Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
updated at May 16, 2024, 2:33 p.m.
Pegasus Workflow Management System - Automate, recover, and debug scientific computations.
updated at May 15, 2024, 11:05 p.m.
Local filesystem registry for containers (intended for HPC) using Lmod or Environment Modules. Works for users and admins.
updated at May 12, 2024, 8:08 a.m.
A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC
updated at May 9, 2024, 3:01 a.m.
Prometheus exporter for use with the Lustre parallel filesystem
updated at April 29, 2024, 6:23 a.m.