@@ -1,8 +1,14 @@
 # platform-projects
 
-This directory is reserved for larger infrastructure platform topics and future case studies. The current implemented project is [infra-run](../infra-run/).
+This directory contains larger infrastructure platform topics and case studies. Most subdirectories are planning areas unless their own README says otherwise.
 
-Current subdirectories are intentionally light and should be read as planning areas unless their own README says otherwise:
+## Implemented platform projects
+
+- [hpc-slurm-ai-cluster](./hpc-slurm-ai-cluster/) - Slurm AI/HPC cluster automation covering Ansible-managed Slurm operations, GPU scheduling with GRES, cgroup enforcement, SlurmDBD accounting, QOS/fairshare/priority, node lifecycle operations, rolling upgrades, and health remediation.
+
+## Planning areas
+
+These subdirectories are intentionally light and should be read as planning areas unless their own README says otherwise:
 
 - `monitoring-zabbix`
 - `elk-log-analysis`