# Troubleshooting Cases

## `IDLE+NOT_RESPONDING` after node maintenance

Symptoms: `sinfo` shows `idle*` or `scontrol show node` shows `IDLE+NOT_RESPONDING`.

Actions:

```bash
# Restart authentication and the Slurm daemons
# (slurmd runs on the compute node, slurmctld on the controller).
systemctl restart munge
systemctl restart slurmd
systemctl restart slurmctld

# Clear the node state; set NodeName= to the affected node.
# `|| true` tolerates state transitions that do not apply.
scontrol update NodeName= State=RESUME || true
scontrol update NodeName= State=UNDRAIN || true
scontrol update NodeName= State=IDLE || true
```

## Missing GPU TRES

Symptoms: `sacctmgr` fails with `no TRES known by type gres/gpu`.

Fix: add `AccountingStorageTRES=...,gres/gpu` to `slurm.conf`, restart or reconfigure Slurm, run a GPU job, and verify with `sacctmgr show tres`.

## SlurmDBD objects already exist

Symptoms: `sacctmgr` returns `Nothing new added` or `Already existing`.

Fix: make the Ansible tasks idempotent: attempt the change, tolerate the known existing-object messages, then normalize state with `modify`.
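
The attempt, tolerate, normalize pattern for existing SlurmDBD objects can be sketched in shell; the same check could be expressed in an Ansible task through `failed_when`. This is a minimal sketch, not the playbook's actual implementation: the function names `sacctmgr_result_ok` and `ensure_account`, and the choice of an account with a `Description` field, are hypothetical and used only for illustration.

```bash
#!/usr/bin/env bash

# Hypothetical helper: decide whether a sacctmgr result counts as success.
# $1 = exit code of the sacctmgr call, $2 = its combined stdout/stderr.
sacctmgr_result_ok() {
    local rc="$1" output="$2"
    # A clean exit is always acceptable.
    if [ "$rc" -eq 0 ]; then
        return 0
    fi
    # Tolerate only the known "object already exists" messages.
    case "$output" in
        *"Nothing new added"*|*"Already existing"*) return 0 ;;
    esac
    # Anything else is treated as a real failure.
    return 1
}

# Hypothetical wrapper: attempt the add, tolerate known messages,
# then normalize the object with `modify` so repeated runs converge.
ensure_account() {
    local name="$1" description="$2" output rc
    output="$(sacctmgr -i add account "$name" Description="$description" 2>&1)"
    rc=$?
    if ! sacctmgr_result_ok "$rc" "$output"; then
        printf '%s\n' "$output" >&2
        return 1
    fi
    sacctmgr -i modify account where name="$name" set Description="$description"
}
```

A call such as `ensure_account research "Research group"` (both values are placeholders) would then be safe to repeat. Matching only the two known messages keeps genuine errors, such as a lost connection to SlurmDBD, visible instead of silently ignoring every non-zero exit.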