# platform-projects

This directory contains larger infrastructure platform topics and case studies. Most subdirectories are planning areas unless their own README says otherwise.

## Implemented platform projects

- [hpc-slurm-ai-cluster](./hpc-slurm-ai-cluster/) - Slurm AI/HPC cluster automation covering Ansible-managed Slurm operations, GPU scheduling with GRES, cgroup enforcement, SlurmDBD accounting, QOS/fairshare/priority, node lifecycle operations, rolling upgrades, and health remediation.

## Planning areas

These subdirectories are intentionally light and should be read as planning areas unless their own README says otherwise:

- `monitoring-zabbix`
- `elk-log-analysis`
- `storage`
- `clustering`
- `virtualization`

Planned platform topics are tracked in [ROADMAP.md](../ROADMAP.md).

Keep future additions operational: scope, topology, validation, limitations, and runbook links should matter more than diagrams or buzzwords.

For Codex-driven changes, use [AGENTS.md](../AGENTS.md) and the templates under [docs/codex](../docs/codex/).