Add Codex repository guidance and validation

2026-05-10 11:11:03 +00:00
parent 0d3905b8a1
commit a527022518
17 changed files with 935 additions and 23 deletions
@@ -0,0 +1,116 @@
 # AGENTS.md
 Guidance for Codex and other automated agents working in this repository.
 ## Purpose
 This repository is a Linux/Unix infrastructure engineering portfolio. It shows practical operational work: incident response, troubleshooting, safe Bash tooling, Ansible hardening examples, storage workflows, runbooks, and platform/lab notes.
 Treat it like internal operations tooling maintained by an infrastructure engineer. Preserve operational realism and avoid generic tutorial or template filler.
 ## Layout
 - `infra-run/` - core operational tooling, Ansible, Bash scripts, runbooks, examples, and operations docs.
 - `platform-projects/` - larger platform topics such as monitoring, storage, clustering, virtualization, and observability.
 - `labs/` - experimental/lab environments for Kubernetes, Terraform, networking, CI/CD, Docker, and related work.
 - `docs/codex/` - Codex workflow guidance, task templates, review checklist, and planning template.
 - `scripts/` - repository validation helpers.
 ## Inspect First
 Before editing, inspect the affected tree and nearby README files. Prefer:
 ```bash
 rg --files
 git status --short
 sed -n '1,220p' <file>
 ```
 Check existing style before introducing new structure. Keep changes small and reviewable.
 ## Validation
 Run the broad repo check when practical:
 ```bash
 ./scripts/validate-repo.sh
 ```
 Focused checks:
 ```bash
 ./scripts/check-bash.sh
 ./scripts/check-ansible.sh
 ./scripts/check-docs.sh
 ```
 Strict mode is opt-in and treats missing optional tools as failures:
 ```bash
 STRICT=1 ./scripts/validate-repo.sh
 ```
 Also run targeted checks for changed files, such as `bash -n`, `ansible-playbook --syntax-check`, or link checks when relevant.
 ## Bash Standards
 - Use `#!/usr/bin/env bash`.
 - Use `set -o errexit`, `set -o nounset`, and `set -o pipefail`.
 - Validate input before using it.
 - Handle missing commands clearly.
 - Default to read-only or dry-run behavior.
 - Require explicit `--execute` plus confirmation for destructive operations.
 - Use clear `OK`, `WARNING`, and `CRITICAL` output.
 - Exit codes: `0` OK, `1` operational issue, `2` invalid input or missing dependency.
 - Keep scripts readable; separate discovery, pre-check, change, post-check, and reporting when it helps.
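The standards above can be sketched as a small set of helpers. This is a minimal illustration, not code taken from this repository: the helper names `report`, `require_cmd`, and `validate_threshold` are hypothetical, and only the status labels and exit codes come from the list above.

```bash
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# Hypothetical helper: print one status line using the OK/WARNING/CRITICAL labels.
report() {
  local level="$1"
  shift
  case "$level" in
    OK | WARNING | CRITICAL) printf '%s: %s\n' "$level" "$*" ;;
    *) printf 'CRITICAL: unknown status level %s\n' "$level"; return 2 ;;
  esac
}

# Hypothetical helper: return 2 (missing dependency) when a command is absent.
require_cmd() {
  local cmd="$1"
  if ! command -v "$cmd" >/dev/null 2>&1; then
    report CRITICAL "required command not found: $cmd"
    return 2
  fi
}

# Hypothetical helper: validate a percentage threshold before using it.
# 10# forces base 10 so that input such as "08" is not parsed as octal.
validate_threshold() {
  local value="$1"
  if [[ ! "$value" =~ ^[0-9]+$ ]] || ((10#$value > 100)); then
    report CRITICAL "threshold must be an integer from 0 to 100: $value"
    return 2
  fi
}
```

A script built this way might call `require_cmd df || exit 2` during pre-checks and reserve exit code `1` for operational findings such as a breached threshold.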
 ## Ansible Standards
 - Keep playbooks short and roles simple.
 - Prefer modules over `shell` or `command`.
 - Use `shell` or `command` only when the module set cannot express the operation, and document why if risk is not obvious.
 - Preserve check-mode and diff-mode friendliness where possible.
 - Use handlers, tags, defaults, and validation tasks when they clarify operations.
 - Keep inventory under `inventory/hosts.yml`, `group_vars/`, and `host_vars/`.
 - Do not present selected hardening examples as complete compliance certification.
 ## Documentation Standards
 - Explain what exists, what is planned, and what is intentionally not supported.
 - Prefer runbook style: scope, pre-checks, execution guardrails, rollback thinking, post-checks, and evidence.
 - Avoid marketing language, fake enterprise wording, and tutorial bloat.
 - Update README files and `CHANGELOG.md` when adding meaningful behavior or structure.
 ## Safety Rules
 - Do not run destructive commands.
 - Do not rename large directories unless the benefit is clear and low-risk.
 - Do not hide validation failures.
 - Do not claim live production validation for sanitized examples.
 - Do not add secrets, real hostnames, customer identifiers, or private infrastructure details.
 - Do not turn placeholders into fake completed projects.
 ## PR and Review Expectations
 - State the operational risk of the change.
 - Include commands run and whether tools were missing.
 - Review scripts for dry-run behavior, input validation, dependency handling, and rollback path.
 - Review Ansible for idempotency, check-mode behavior, inventory targeting, tags, handlers, and module choice.
 - Keep diffs focused.
 ## Definition of Done
 - The change preserves the repository intent.
 - Relevant docs are updated.
 - Changed Bash scripts pass `bash -n`.
 - Available validation helpers were run.
 - Missing optional tools are reported.
 - Any remaining risk or follow-up is documented.
 ## Do Not
 - Do not add an "ultimate DevOps template" structure.
 - Do not replace working simple Bash with unnecessary abstractions.
 - Do not make examples appear production-certified.
 - Do not add destructive behavior without `--execute`, confirmation, and clear rollback notes.
 - Do not delete useful content unless it is clearly duplicate, broken, or misleading.
@@ -4,6 +4,17 @@
 ### Added
 - Repository-level Codex guidance:
  - `AGENTS.md`
  - `docs/codex/README.md`
  - `docs/codex/review-checklist.md`
  - `docs/codex/task-template.md`
  - `docs/codex/plans-template.md`
 - Lightweight validation helpers:
  - `scripts/validate-repo.sh`
  - `scripts/check-bash.sh`
  - `scripts/check-ansible.sh`
  - `scripts/check-docs.sh`
 - Cross-repository operational documentation structure:
  - `infra-run/docs/operations-cheatsheet.md`
  - `platform-projects/docs/platform-cheatsheet.md`
@@ -19,6 +30,7 @@
 ### Changed
 - Updated root, `infra-run`, Bash, Ansible, platform, and lab README guidance for safety-first usage, validation, and future Codex-driven work.
 - Updated repository and `infra-run` README files to surface the new documentation structure and operational cheatsheets.
 - Updated repository, `infra-run`, and Ansible README files to describe the new hardening automation instead of placeholder-only Ansible structure.
@@ -1,19 +1,41 @@
-# Portfolio
+# Linux/Unix Infrastructure Engineering Portfolio
-This repository contains sanitized infrastructure automation examples based on Linux operations and enterprise infrastructure workflows. The focus is on precheck, dry-run, controlled execution, postcheck, troubleshooting, and clear operational reporting.
+This repository contains sanitized infrastructure automation examples based on Linux/Unix operations and infrastructure workflows. The focus is on incident response, troubleshooting, pre-checks, dry-run behavior, controlled execution, post-checks, and readable operational evidence.
-It is a technical portfolio, not a production toolkit. The examples are meant to show how I structure operational work: understand the current state, make changes only with explicit controls, verify the result, and leave readable evidence for review.
+It is a technical portfolio, not a production toolkit. The examples show how operational work is structured: understand the current state, make changes only with explicit controls, verify the result, and leave enough evidence for review.
-## What Is Usable Now
+## What This Repo Is
- [infra-run](./infra-run/) - the main project in this repository.
+- Practical Linux/Unix operations examples.
 - Safe Bash and Ansible patterns for lab and review.
 - Runbook-driven examples for incident response, storage operations, hardening, and observability.
 - A place for platform and lab topics to grow without pretending unfinished areas are complete.
 ## What This Repo Is Not
 - It is not a compliance benchmark implementation.
 - It is not a drop-in change automation framework.
 - It is not proof that these exact scripts ran in any production environment.
 - It does not replace change review, peer review, backups, monitoring, or platform-specific runbooks.
 ## Repository Layout
 - [infra-run](./infra-run/) - core operational tooling and automation.
 - [platform-projects](./platform-projects/) - larger platform topics and case-study areas.
 - [labs](./labs/) - experimental/lab environments and notes.
 - [docs/codex](./docs/codex/) - guidance for future Codex-driven changes.
 - [scripts](./scripts/) - lightweight repository validation helpers.
 ## Usable Now
 - [infra-run](./infra-run/) - the main implemented project in this repository.
 - [Linux healthcheck scripts](./infra-run/scripts/bash/os-healthcheck/) - host, disk, service, network, and report helpers.
 - [Disk full workflow](./infra-run/scripts/bash/disk-full/) - triage scripts for usage, inode pressure, deleted open files, large files, log cleanup review, and postchecks.
 - [Veritas examples](./infra-run/scripts/bash/veritas/) - dry-run-first VxVM/VCS storage expansion workflow examples.
 - [GPFS examples](./infra-run/scripts/bash/gpfs/) - dry-run-first IBM Spectrum Scale expansion workflow examples.
 - [Ansible hardening examples](./infra-run/ansible/) - selected Linux and AIX baseline hardening tasks organized as lab-safe roles.
-## What Is Planned
+## Planned Areas
 The `labs` and `platform-projects` trees are intentionally thin. They are kept as planning areas for future lab notes and case studies, not as completed projects. Current planned topics are tracked in [ROADMAP.md](./ROADMAP.md).
@@ -31,28 +53,41 @@ The `labs` and `platform-projects` trees are intentionally thin. They are kept a
 - [labs/docs/lab-cheatsheet.md](./labs/docs/lab-cheatsheet.md) - quick-reference scratchpad for K3s, Proxmox, Terraform, Docker, networking, and short-lived lab troubleshooting work.
-## What This Repo Is Not
+### Codex and Review Guidance
- It is not a compliance benchmark implementation.
+- [AGENTS.md](./AGENTS.md) - repository rules for automated and assisted changes.
- It is not a drop-in change automation framework.
+- [docs/codex/README.md](./docs/codex/README.md) - Codex workflow and expected final response format.
- It is not proof that these exact scripts ran in any production environment.
+- [docs/codex/review-checklist.md](./docs/codex/review-checklist.md) - safety, Bash, Ansible, docs, and validation review checklist.
- It does not replace change review, peer review, backups, monitoring, or platform-specific runbooks.
+- [docs/codex/task-template.md](./docs/codex/task-template.md) - reusable scoped task templates.
 ## Safety-First Usage
 Read scripts and playbooks before running them. Operational examples are sanitized and may need adaptation for a real system.
 - Prefer read-only commands first.
 - Use dry-run/check mode before execution.
 - Treat `--execute` as a change-control boundary.
 - Confirm backups, monitoring, application impact, and rollback steps before live use.
 - Do not run platform-specific storage commands without a matching Veritas, GPFS, or AIX lab.
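One way to treat `--execute` as a change-control boundary is sketched here. It assumes a hypothetical `run_change` wrapper, a `MODE` variable, and a typed confirmation prompt. None of these names come from the scripts in this repository, and the prompt is only one possible design.

```bash
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# Default to dry-run; only an explicit --execute switches to real changes.
MODE="dry-run"

# Hypothetical argument parser: unknown options count as invalid input (return 2).
parse_args() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --execute) MODE="execute" ;;
      --dry-run) MODE="dry-run" ;;
      *) printf 'CRITICAL: unknown option %s\n' "$arg"; return 2 ;;
    esac
  done
}

# Hypothetical confirmation gate: the operator must type "yes" on stdin.
confirm() {
  local answer=""
  printf 'Type "yes" to apply changes: ' >&2
  read -r answer || true
  [[ "$answer" == "yes" ]]
}

# Print the command in dry-run mode; run it only in execute mode.
run_change() {
  if [[ "$MODE" == "execute" ]]; then
    "$@"
  else
    printf 'DRY-RUN: %s\n' "$*"
  fi
}
```

A caller would run `parse_args "$@" || exit 2` first, call `confirm` before any change when `MODE` is `execute`, and route every state-changing command through `run_change` so the dry-run output doubles as a review record.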
 ## Validation
 Basic local validation:
 ```bash
-find infra-run/scripts/bash -name '*.sh' -print0 | xargs -0 shellcheck -x -P infra-run/scripts/bash/disk-full -P infra-run/scripts/bash/gpfs -P infra-run/scripts/bash/veritas
+./scripts/validate-repo.sh
-yamllint .
+./scripts/check-bash.sh
-cd infra-run/ansible && ansible-lint playbooks roles
+./scripts/check-ansible.sh
 ./scripts/check-docs.sh
 ```
 The validation helpers run required lightweight checks and use optional tools such as `shellcheck`, `yamllint`, `ansible-playbook`, `ansible-lint`, and `markdownlint` when available. Set `STRICT=1` to fail when optional tools are missing.
 Some scripts depend on platform tools such as `vxdisk`, `hagrp`, `mmcrnsd`, and `mmlscluster`. Those commands are not expected to exist on a normal workstation, so functional testing against Veritas or GPFS requires a real lab environment.
 See [infra-run/TESTED.md](./infra-run/TESTED.md) and [infra-run/KNOWN_LIMITATIONS.md](./infra-run/KNOWN_LIMITATIONS.md) for the current validation status.
-## Skills Demonstrated
+## Operational Areas Demonstrated
 - Linux operations triage and reporting.
 - Disk pressure and deleted-file incident analysis.
@@ -0,0 +1,55 @@
 # Codex Workflow
 This directory keeps future Codex sessions consistent when working in this infrastructure portfolio.
 ## How To Start
 1. Read [AGENTS.md](../../AGENTS.md).
 2. Inspect the affected tree and nearby README files.
 3. Check `git status --short` so existing user work is preserved.
 4. Decide whether a plan is needed before editing.
 5. Make small, reviewable changes.
 6. Run focused validation plus `./scripts/validate-repo.sh` when practical.
 ## When To Plan First
 Plan before editing when a task touches more than one subsystem, changes operational behavior, adds or modifies destructive actions, changes Ansible targeting, or updates repository conventions.
 For small typo fixes, narrow README updates, or obvious syntax fixes, inspect first and then make the change directly.
 Use [plans-template.md](./plans-template.md) for larger changes.
 ## Scoped Tasks
 Good tasks name the operational goal, affected directories, constraints, validation commands, and what "done" means. Use [task-template.md](./task-template.md) for reusable prompts.
 Keep scope tied to real operations:
 - Bash tool: discovery, pre-check, dry-run, execute, post-check, report.
 - Ansible change: inventory target, role/playbook scope, check mode, idempotency, validation.
 - Runbook: incident signal, triage, decision points, rollback, evidence.
 - Lab/platform project: status, prerequisites, validation, limitations.
 ## Validation
 Prefer the repository helpers:
 ```bash
 ./scripts/check-bash.sh
 ./scripts/check-ansible.sh
 ./scripts/check-docs.sh
 ./scripts/validate-repo.sh
 ```
 If optional tools are missing, report that clearly and continue with available checks. Do not claim skipped checks passed.
 ## Final Response Format
 End with:
 1. Summary of what changed.
 2. Files created or modified.
 3. Validation commands run and results.
 4. Skipped checks and why.
 5. Risks or follow-ups.
 6. Whether the repo is ready for future Codex-driven work.
@@ -0,0 +1,35 @@
 # Implementation Plan Template
 Use this for changes that touch multiple files, alter operational behavior, or add new repository conventions.
 ## Goal
 State the operational or maintenance outcome.
 ## Current State
 Summarize the directories and conventions inspected.
 ## Scope
 List files or directories expected to change.
 ## Non-Goals
 Name what will not be redesigned, renamed, deleted, or claimed as complete.
 ## Plan
 1. Inspect relevant scripts, playbooks, docs, and examples.
 2. Make the smallest structural or documentation changes needed.
 3. Update validation or runbook guidance.
 4. Run focused checks.
 5. Summarize residual risk and follow-ups.
 ## Validation
 List commands to run, including fallback behavior for missing tools.
 ## Risks
 Call out destructive operations, platform assumptions, missing lab environments, or checks that require real systems.
@@ -0,0 +1,52 @@
 # Review Checklist
 Use this checklist for repository reviews and pull requests.
 ## Safety
 - Destructive actions default to dry-run or read-only.
 - Real changes require explicit `--execute` and operator confirmation.
 - Inputs are validated before use.
 - Paths, service names, disks, volumes, and inventory targets are constrained.
 - Rollback or recovery thinking is documented where the operation can change state.
 ## Bash
 - Uses `#!/usr/bin/env bash`.
 - Uses `set -o errexit`, `set -o nounset`, and `set -o pipefail`.
 - Missing commands return a clear warning or invalid-input/dependency exit.
 - Output uses `OK`, `WARNING`, and `CRITICAL` consistently.
 - Exit codes follow repo convention: `0` OK, `1` operational issue, `2` invalid input or missing dependency.
 - Help output exists for scripts that accept arguments.
 ## Ansible
 - Target hosts are explicit and appropriate for the role.
 - Modules are preferred over `shell` or `command`.
 - Check mode and diff mode are considered.
 - Tasks are idempotent; exceptions, such as inherently read-only or platform-specific checks, are clearly documented.
 - Handlers, tags, defaults, and validation tasks are used where useful.
 - Inventory, vars, and role defaults do not contain secrets or real environment data.
 ## Documentation
 - README files explain current state without overstating completeness.
 - Runbooks include scope, pre-checks, execution controls, post-checks, and evidence.
 - Docs avoid tutorial filler and fake enterprise complexity.
 - Important limitations are linked or documented.
 - `CHANGELOG.md` is updated for meaningful repo changes.
 ## Operational Realism
 - The change reflects RHEL/Oracle Linux, Debian/Ubuntu, AIX, Veritas, GPFS, Zabbix, ELK, Docker, Kubernetes/K3s, Terraform, VMware, or Proxmox operations accurately.
 - Examples remain sanitized.
 - Placeholder projects are identified as placeholders.
 - There is no unnecessary abstraction or invented complexity.
 ## Validation
 - Changed Bash scripts pass `bash -n`.
 - `shellcheck` was run if available, or its absence was reported.
 - Ansible syntax/lint checks were run if available and relevant.
 - YAML/Markdown sanity checks were run if available.
 - Failures and skipped checks are visible in the final summary.
@@ -0,0 +1,276 @@
 # Task Templates
 Copy the relevant section into a future Codex request and fill in the blanks.
 ## Operational Bash Tool
 ### Goal
 Build or improve a Bash tool for:
 ### Context
 Affected platform, incident, or operational workflow:
 ### Constraints
 - Default to dry-run/read-only.
 - Require `--execute` for changes.
 - Use `OK`, `WARNING`, and `CRITICAL`.
 - Exit `0` OK, `1` operational issue, `2` invalid input or missing dependency.
 ### Files/directories to inspect
 - `infra-run/scripts/bash/`
 - Relevant runbook or README:
 ### Implementation steps
 1. Inspect neighboring scripts and shared helpers.
 2. Add or adjust usage/help output.
 3. Add discovery, pre-check, guarded change, post-check, and reporting sections where useful.
 4. Update README or runbook notes.
 ### Validation commands
 ```bash
 bash -n <script>
 ./scripts/check-bash.sh
 ```
 ### Done when
 The tool is readable, safe by default, validates inputs, reports clearly, and has updated docs.
 ## Ansible Playbook/Role
 ### Goal
 Add or improve Ansible automation for:
 ### Context
 Target OS and inventory group:
 ### Constraints
 - Preserve check-mode friendliness.
 - Prefer modules over shell/command.
 - Keep playbooks short.
 - Keep role defaults sanitized.
 ### Files/directories to inspect
 - `infra-run/ansible/README.md`
 - `infra-run/ansible/inventory/`
 - `infra-run/ansible/playbooks/`
 - `infra-run/ansible/roles/`
 ### Implementation steps
 1. Inspect existing role/playbook patterns.
 2. Add defaults, tasks, handlers, and tags only where needed.
 3. Add validation or post-check tasks for operational evidence.
 4. Update role/playbook README.
 ### Validation commands
 ```bash
 ./scripts/check-ansible.sh
 cd infra-run/ansible && ansible-playbook --syntax-check -i inventory/hosts.yml playbooks/<playbook>.yml
 ```
 ### Done when
 The playbook targets the right hosts, is idempotent where practical, supports review with `--check --diff`, and docs explain limitations.
 ## Runbook
 ### Goal
 Create or improve a runbook for:
 ### Context
 Incident signal, platform, and affected service:
 ### Constraints
 - Include pre-checks, decision points, rollback, post-checks, and evidence.
 - Avoid pretending lab notes are production-certified.
 ### Files/directories to inspect
 - `infra-run/runbooks/`
 - `infra-run/docs/`
 - Related scripts/examples:
 ### Implementation steps
 1. Define scope and assumptions.
 2. Add triage steps and command examples.
 3. Add safe execution gates.
 4. Add validation and handoff notes.
 ### Validation commands
 ```bash
 ./scripts/check-docs.sh
 ```
 ### Done when
 An operator can follow the runbook without guessing the risk, inputs, or success criteria.
 ## Lab Scenario
 ### Goal
 Add or improve a lab scenario for:
 ### Context
 Technology and local environment:
 ### Constraints
 - Mark lab-only behavior clearly.
 - Keep prerequisites and cleanup explicit.
 ### Files/directories to inspect
 - `labs/`
 - `labs/docs/lab-cheatsheet.md`
 ### Implementation steps
 1. Document prerequisites and topology.
 2. Add setup, validation, failure injection if relevant, and cleanup.
 3. Link related scripts or runbooks.
 ### Validation commands
 ```bash
 ./scripts/check-docs.sh
 ```
 ### Done when
 The lab is reproducible enough to review and does not imply production readiness.
 ## Platform Project
 ### Goal
 Add or improve a platform project for:
 ### Context
 Monitoring, storage, clustering, virtualization, observability, or related topic:
 ### Constraints
 - Keep status honest: planned, partial, lab-tested, or complete.
 - Prefer operational notes over marketing language.
 ### Files/directories to inspect
 - `platform-projects/`
 - `platform-projects/docs/platform-cheatsheet.md`
 ### Implementation steps
 1. Identify scope and current maturity.
 2. Add design notes, operational workflows, and validation.
 3. Link runbooks, examples, and known limitations.
 ### Validation commands
 ```bash
 ./scripts/check-docs.sh
 ```
 ### Done when
 The project explains what exists, how to validate it, and what remains unproven.
 ## Documentation Cleanup
 ### Goal
 Clean up documentation for:
 ### Context
 Current confusion, duplication, or missing links:
 ### Constraints
 - Preserve useful operational detail.
 - Avoid tutorial-style filler.
 ### Files/directories to inspect
 - Root `README.md`
 - Section README files
 - Related docs/runbooks:
 ### Implementation steps
 1. Remove duplication where it hurts navigation.
 2. Add links to canonical docs.
 3. Make limitations explicit.
 4. Update changelog if meaningful.
 ### Validation commands
 ```bash
 ./scripts/check-docs.sh
 ```
 ### Done when
 Readers can find the right tool, runbook, or validation command quickly.
 ## Repository Review
 ### Goal
 Review repository quality for:
 ### Context
 Areas of concern:
 ### Constraints
 - Findings first, ordered by severity.
 - Include file/line references where possible.
 - Do not rewrite unrelated content.
 ### Files/directories to inspect
 - `AGENTS.md`
 - `README.md`
 - `infra-run/`
 - `platform-projects/`
 - `labs/`
 - `scripts/`
 ### Implementation steps
 1. Inspect structure and conventions.
 2. Review safety, validation, docs, and maintainability.
 3. Patch only low-risk issues if requested.
 4. Report risks and follow-ups.
 ### Validation commands
 ```bash
 ./scripts/validate-repo.sh
 git diff --stat
 ```
 ### Done when
 The review identifies practical risks and leaves a clear next action list.
@@ -38,6 +38,14 @@ The goal is to show operational judgment, not to ship a universal automation pro
 - Disk-full read-only scripts can be run against local paths for basic behavior checks.
 - Ansible YAML and role structure can be linted locally.
 ## Running Safely
 - Start with the relevant README or runbook before executing a script.
 - Prefer read-only discovery scripts before remediation scripts.
 - Use dry-run mode unless a script explicitly documents safe local behavior.
 - Only use `--execute` after reviewing inputs, affected systems, rollback options, and post-checks.
 - For Ansible, start with `--check --diff` against a lab inventory.
 ## Lab-Safe Examples
 - Veritas and GPFS scripts default to dry-run behavior where they plan destructive or platform-changing operations.
@@ -59,12 +67,10 @@ Short version:
 From the repository root:
 ```bash
-find infra-run/scripts/bash -name '*.sh' -print0 | xargs -0 shellcheck -x -P infra-run/scripts/bash/disk-full -P infra-run/scripts/bash/gpfs -P infra-run/scripts/bash/veritas
+./scripts/validate-repo.sh
 yamllint .
 cd infra-run/ansible && ansible-lint playbooks roles
 ```
-If `ansible-lint` reports collection-related issues, install the collections listed in [ansible/collections/requirements.yml](./ansible/collections/requirements.yml) and rerun it. Treat lint as a starting point; platform testing still requires actual target systems.
+Focused checks are available in `scripts/check-bash.sh`, `scripts/check-ansible.sh`, and `scripts/check-docs.sh`. If `ansible-lint` reports collection-related issues, install the collections listed in [ansible/collections/requirements.yml](./ansible/collections/requirements.yml) and rerun it. Treat lint as a starting point; platform testing still requires actual target systems.
 ## Supporting Notes
@@ -72,3 +78,4 @@ If `ansible-lint` reports collection-related issues, install the collections lis
 - [TESTED.md](./TESTED.md) lists what was checked locally and what was not.
 - [KNOWN_LIMITATIONS.md](./KNOWN_LIMITATIONS.md) documents technical limits and operational cautions.
 - [ROADMAP.md](./ROADMAP.md) tracks planned additions without presenting them as completed work.
 - [../AGENTS.md](../AGENTS.md) and [../docs/codex](../docs/codex/) document repository working rules and review expectations.
@@ -34,3 +34,5 @@ flowchart TD
 - Roles are selected baseline examples intended for portfolio and lab use, not a drop-in compliance certification.
 - Defaults are sanitized and configurable through inventory or `--extra-vars`.
 - Run platform-specific playbooks against appropriate test hosts before adapting them to managed environments.
 - Prefer `--check --diff` for review runs before applying changes.
 - Validate from the repository root with `./scripts/check-ansible.sh`.
@@ -21,3 +21,4 @@ flowchart TD
 - The repository currently emphasizes Bash because it maps directly to day-to-day Linux operations.
 - The structure leaves room for higher-level helpers without mixing concerns.
 - Bash tooling should remain safe by default, readable, and validated with `./scripts/check-bash.sh` from the repository root.
@@ -39,6 +39,14 @@ cd infra-run/scripts/bash/os-healthcheck
 ./network_troubleshoot.sh google.com
 ```
 ## Standards
 - Scripts use Bash and should keep `#!/usr/bin/env bash` plus strict mode.
 - Read-only checks should report missing tools without hiding the problem.
 - Change-capable scripts must default to dry-run behavior and require explicit `--execute`.
 - Output should use `OK`, `WARNING`, and `CRITICAL` where practical.
 - Validate changed scripts with `./scripts/check-bash.sh` from the repository root.
 ## Exit Codes
 `disk_check.sh`:
@@ -1,5 +1,15 @@
 # labs
-This directory is reserved for future lab work. The current focus of the repository is [infra-run](../infra-run/).
+This directory is reserved for experimental and lab-only infrastructure work. The current focus of the repository is [infra-run](../infra-run/).
-Planned lab topics are tracked in [ROADMAP.md](../ROADMAP.md). Subdirectories are placeholders only and should not be treated as completed projects.
+Current subdirectories are planning areas unless their own README documents a runnable scenario:
 - `kubernetes`
 - `terraform`
 - `networking`
 - `ci-cd`
 - `docker`
 Lab content should document prerequisites, topology, validation, cleanup, and what remains untested. Do not present lab behavior as production-ready.
 Planned lab topics are tracked in [ROADMAP.md](../ROADMAP.md). For Codex-driven changes, use [AGENTS.md](../AGENTS.md) and the templates under [docs/codex](../docs/codex/).
@@ -1,5 +1,15 @@
 # platform-projects
-This directory is reserved for future platform case studies. The current implemented project is [infra-run](../infra-run/).
+This directory is reserved for larger infrastructure platform topics and future case studies. The current implemented project is [infra-run](../infra-run/).
-Planned platform topics are tracked in [ROADMAP.md](../ROADMAP.md). Subdirectories are placeholders only and should not be treated as completed work.
+Current subdirectories are intentionally light and should be read as planning areas unless their own README says otherwise:
 - `monitoring-zabbix`
 - `elk-log-analysis`
 - `storage`
 - `clustering`
 - `virtualization`
 Planned platform topics are tracked in [ROADMAP.md](../ROADMAP.md). Keep future additions operational: scope, topology, validation, limitations, and runbook links should matter more than diagrams or buzzwords.
 For Codex-driven changes, use [AGENTS.md](../AGENTS.md) and the templates under [docs/codex](../docs/codex/).
@@ -0,0 +1,95 @@
 #!/usr/bin/env bash
 set -o errexit
 set -o nounset
 set -o pipefail
 STRICT="${STRICT:-0}"
 ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
 ANSIBLE_DIR="$ROOT_DIR/infra-run/ansible"
 ok_count=0
 warn_count=0
 fail_count=0
 ok() {
  printf 'OK: %s\n' "$*"
  ok_count=$((ok_count + 1))
 }
 warning() {
  printf 'WARNING: %s\n' "$*"
  warn_count=$((warn_count + 1))
 }
 critical() {
  printf 'CRITICAL: %s\n' "$*"
  fail_count=$((fail_count + 1))
 }
 if [[ ! -d "$ANSIBLE_DIR" ]]; then
  warning "No infra-run/ansible directory found"
  printf '\nAnsible summary: %d OK, %d WARNING, %d CRITICAL\n' "$ok_count" "$warn_count" "$fail_count"
  exit 0
 fi
 mapfile -t yaml_files < <(find "$ANSIBLE_DIR" -type f \( -name '*.yml' -o -name '*.yaml' \) -print | sort)
 if ((${#yaml_files[@]} == 0)); then
  warning "No Ansible YAML files found"
 else
  ok "Found ${#yaml_files[@]} Ansible YAML files"
 fi
 if command -v ansible-playbook >/dev/null 2>&1; then
  while IFS= read -r playbook; do
    [[ -n "$playbook" ]] || continue
    playbook_rel="${playbook#"$ANSIBLE_DIR"/}"
    if (cd "$ANSIBLE_DIR" && ansible-playbook --syntax-check -i inventory/hosts.yml "$playbook_rel"); then
      ok "ansible syntax $playbook_rel"
    else
      critical "ansible syntax failed $playbook_rel"
    fi
  done < <(find "$ANSIBLE_DIR/playbooks" -type f \( -name '*.yml' -o -name '*.yaml' \) -print | sort)
 else
  if [[ "$STRICT" == "1" ]]; then
    critical "ansible-playbook not installed"
  else
    warning "ansible-playbook not installed; skipped syntax checks"
  fi
 fi
 if command -v ansible-lint >/dev/null 2>&1; then
  if (cd "$ANSIBLE_DIR" && ansible-lint playbooks roles); then
    ok "ansible-lint"
  else
    critical "ansible-lint reported issues"
  fi
 else
  if [[ "$STRICT" == "1" ]]; then
    critical "ansible-lint not installed"
  else
    warning "ansible-lint not installed; skipped optional lint"
  fi
 fi
 if command -v yamllint >/dev/null 2>&1; then
  if yamllint "$ANSIBLE_DIR"; then
    ok "yamllint infra-run/ansible"
  else
    critical "yamllint reported issues in infra-run/ansible"
  fi
 else
  if [[ "$STRICT" == "1" ]]; then
    critical "yamllint not installed"
  else
    warning "yamllint not installed; skipped optional YAML lint"
  fi
 fi
 printf '\nAnsible summary: %d OK, %d WARNING, %d CRITICAL\n' "$ok_count" "$warn_count" "$fail_count"
 if ((fail_count > 0)); then
  exit 1
 fi
 exit 0
@@ -0,0 +1,76 @@
 #!/usr/bin/env bash
 set -o errexit
 set -o nounset
 set -o pipefail
 STRICT="${STRICT:-0}"
 ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
 ok_count=0
 warn_count=0
 fail_count=0
 ok() {
  printf 'OK: %s\n' "$*"
  ok_count=$((ok_count + 1))
 }
 warning() {
  printf 'WARNING: %s\n' "$*"
  warn_count=$((warn_count + 1))
 }
 critical() {
  printf 'CRITICAL: %s\n' "$*"
  fail_count=$((fail_count + 1))
 }
 mapfile -t bash_files < <(find "$ROOT_DIR" -path "$ROOT_DIR/.git" -prune -o -type f -name '*.sh' -print | sort)
 if ((${#bash_files[@]} == 0)); then
  warning "No Bash scripts found"
 else
  for file in "${bash_files[@]}"; do
    if bash -n "$file"; then
      ok "bash -n ${file#"$ROOT_DIR"/}"
    else
      critical "bash syntax failed: ${file#"$ROOT_DIR"/}"
    fi
    first_line="$(sed -n '1p' "$file")"
    if [[ "$first_line" != '#!/usr/bin/env bash' ]]; then
      warning "Non-standard shebang in ${file#"$ROOT_DIR"/}"
    fi
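    # Match errexit at the start of a command, including combined flags such as `set -Eeuo pipefail`.
    if ! grep -Eq '^[[:space:]]*set[[:space:]]+(-o[[:space:]]+errexit|-[A-Za-z]*e)' "$file"; then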
      warning "No errexit-style strict mode detected in ${file#"$ROOT_DIR"/}"
    fi
  done
 fi
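 if ((${#bash_files[@]} == 0)); then
  # shellcheck exits non-zero when called without files, so skip it explicitly.
  warning "shellcheck skipped; no Bash scripts to check"
 elif command -v shellcheck >/dev/null 2>&1; then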
  if shellcheck -x \
    -e SC1091 \
    -P "$ROOT_DIR/infra-run/scripts/bash/disk-full" \
    -P "$ROOT_DIR/infra-run/scripts/bash/gpfs" \
    -P "$ROOT_DIR/infra-run/scripts/bash/veritas" \
    "${bash_files[@]}"; then
    ok "shellcheck"
  else
    critical "shellcheck reported issues"
  fi
 else
  if [[ "$STRICT" == "1" ]]; then
    critical "shellcheck not installed"
  else
    warning "shellcheck not installed; skipped optional lint"
  fi
 fi
 printf '\nBash summary: %d OK, %d WARNING, %d CRITICAL\n' "$ok_count" "$warn_count" "$fail_count"
 if ((fail_count > 0)); then
  exit 1
 fi
 exit 0
@@ -0,0 +1,88 @@
 #!/usr/bin/env bash
 set -o errexit
 set -o nounset
 set -o pipefail
 STRICT="${STRICT:-0}"
 ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
 ok_count=0
 warn_count=0
 fail_count=0
 ok() {
  printf 'OK: %s\n' "$*"
  ok_count=$((ok_count + 1))
 }
 warning() {
  printf 'WARNING: %s\n' "$*"
  warn_count=$((warn_count + 1))
 }
 critical() {
  printf 'CRITICAL: %s\n' "$*"
  fail_count=$((fail_count + 1))
 }
 mapfile -t markdown_files < <(find "$ROOT_DIR" -path "$ROOT_DIR/.git" -prune -o -type f -name '*.md' -print | sort)
 if ((${#markdown_files[@]} == 0)); then
  warning "No Markdown files found"
 else
  ok "Found ${#markdown_files[@]} Markdown files"
 fi
 missing_links=0
 # Check each file separately so the file path never passes through sed or a ':' split.
 for file in "${markdown_files[@]}"; do
  base_dir="$(dirname "$file")"
  while IFS= read -r target; do
    [[ -n "$target" ]] || continue
    [[ "$target" == http://* || "$target" == https://* || "$target" == mailto:* || "$target" == \#* ]] && continue
    target="${target%%#*}"
    [[ -n "$target" ]] || continue
    if [[ ! -e "$base_dir/$target" ]]; then
      critical "Broken local Markdown link in ${file#"$ROOT_DIR"/}: $target"
      missing_links=$((missing_links + 1))
    fi
  done < <(
    grep -Eo '\[[^]]+\]\([^)]+\)' "$file" \
      | sed -E 's/.*\]\(([^)]+)\).*/\1/' || true
  )
 done
 if ((missing_links == 0)); then
  ok "No obvious broken local Markdown links"
 fi
 if command -v markdownlint >/dev/null 2>&1; then
  if markdownlint "${markdown_files[@]}"; then
    ok "markdownlint"
  else
    critical "markdownlint reported issues"
  fi
 elif command -v markdownlint-cli2 >/dev/null 2>&1; then
  if markdownlint-cli2 "${markdown_files[@]}"; then
    ok "markdownlint-cli2"
  else
    critical "markdownlint-cli2 reported issues"
  fi
 else
  if [[ "$STRICT" == "1" ]]; then
    critical "markdownlint not installed"
  else
    warning "markdownlint not installed; skipped optional Markdown lint"
  fi
 fi
 printf '\nDocs summary: %d OK, %d WARNING, %d CRITICAL\n' "$ok_count" "$warn_count" "$fail_count"
 if ((fail_count > 0)); then
  exit 1
 fi
 exit 0
@@ -0,0 +1,34 @@
 #!/usr/bin/env bash
 set -o errexit
 set -o nounset
 set -o pipefail
 ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
 status=0
 run_check() {
  local name="$1"
  shift
  printf '\n== %s ==\n' "$name"
  if "$@"; then
    printf 'OK: %s completed\n' "$name"
  else
    printf 'CRITICAL: %s failed\n' "$name"
    status=1
  fi
 }
 run_check "Bash" "$ROOT_DIR/scripts/check-bash.sh"
 run_check "Ansible" "$ROOT_DIR/scripts/check-ansible.sh"
 run_check "Docs" "$ROOT_DIR/scripts/check-docs.sh"
 printf '\n== Repository summary ==\n'
 if ((status == 0)); then
  printf 'OK: repository validation completed with no critical failures\n'
 else
  printf 'CRITICAL: one or more validation checks failed\n'
 fi
 exit "$status"