Add Codex repository guidance and validation
lint / shell-yaml-ansible (push) Failing after 17s

Mateusz Suski
2026-05-10 11:11:03 +00:00
parent 0d3905b8a1
commit a527022518
17 changed files with 935 additions and 23 deletions
+116
@@ -0,0 +1,116 @@
# AGENTS.md
Guidance for Codex and other automated agents working in this repository.
## Purpose
This repository is a Linux/Unix infrastructure engineering portfolio. It shows practical operational work: incident response, troubleshooting, safe Bash tooling, Ansible hardening examples, storage workflows, runbooks, and platform/lab notes.
Treat it like internal operations tooling maintained by an infrastructure engineer. Preserve operational realism and avoid generic tutorial or template filler.
## Layout
- `infra-run/` - core operational tooling, Ansible, Bash scripts, runbooks, examples, and operations docs.
- `platform-projects/` - larger platform topics such as monitoring, storage, clustering, virtualization, and observability.
- `labs/` - experimental/lab environments for Kubernetes, Terraform, networking, CI/CD, Docker, and related work.
- `docs/codex/` - Codex workflow guidance, task templates, review checklist, and planning template.
- `scripts/` - repository validation helpers.
## Inspect First
Before editing, inspect the affected tree and nearby README files. Prefer:
```bash
rg --files
git status --short
sed -n '1,220p' <file>
```
Check existing style before introducing new structure. Keep changes small and reviewable.
## Validation
Run the broad repo check when practical:
```bash
./scripts/validate-repo.sh
```
Focused checks:
```bash
./scripts/check-bash.sh
./scripts/check-ansible.sh
./scripts/check-docs.sh
```
Optional strict mode fails when optional tools are missing:
```bash
STRICT=1 ./scripts/validate-repo.sh
```
Also run targeted checks for changed files, such as `bash -n`, `ansible-playbook --syntax-check`, or link checks when relevant.
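A minimal sketch of a targeted check for changed Bash files, assuming the changed paths are passed as arguments. The function name `check_changed_scripts` is hypothetical and is not one of this repository's helpers.

```bash
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# Hypothetical helper (not part of this repository): syntax-check each
# Bash script given as an argument. Returns 0 when every file parses,
# 1 when at least one file fails `bash -n`.
check_changed_scripts() {
  local failed=0 file
  for file in "$@"; do
    if bash -n "$file" 2>/dev/null; then
      printf 'OK: bash -n %s\n' "$file"
    else
      printf 'CRITICAL: bash -n failed: %s\n' "$file"
      failed=1
    fi
  done
  return "$failed"
}
```

One way to build the argument list is `git diff --name-only HEAD -- '*.sh'`; deleted files would need to be filtered out before they are passed in.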
## Bash Standards
- Use `#!/usr/bin/env bash`.
- Use `set -o errexit`, `set -o nounset`, and `set -o pipefail`.
- Validate input before using it.
- Handle missing commands clearly.
- Default to read-only or dry-run behavior.
- Require explicit `--execute` plus confirmation for destructive operations.
- Use clear `OK`, `WARNING`, and `CRITICAL` output.
- Exit codes: `0` OK, `1` operational issue, `2` invalid input or missing dependency.
- Keep scripts readable; separate discovery, pre-check, change, post-check, and reporting when it helps.
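The standards above can be sketched as a small tool, assuming a made-up task (removing one stale file). The names `cleanup_tool` and `--path` are illustrative assumptions, not taken from this repository; only `--execute`, the output labels, and the exit codes come from the list above.

```bash
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# Hypothetical example tool: remove one stale file.
# Returns 0 OK, 1 operational issue, 2 invalid input.
cleanup_tool() {
  local execute=0 target=""
  while (($# > 0)); do
    case "$1" in
      --execute) execute=1 ;;
      --path)
        if (($# < 2)); then
          printf 'CRITICAL: --path requires a value\n' >&2
          return 2
        fi
        target="$2"
        shift
        ;;
      *)
        printf 'CRITICAL: unknown argument: %s\n' "$1" >&2
        return 2
        ;;
    esac
    shift
  done

  # Validate input before using it.
  if [[ -z "$target" ]]; then
    printf 'CRITICAL: --path is required\n' >&2
    return 2
  fi
  if [[ ! -f "$target" ]]; then
    printf 'WARNING: nothing to remove: %s\n' "$target"
    return 1
  fi

  # Default to dry-run: report the planned change and stop.
  if ((execute == 0)); then
    printf 'OK: dry-run, would remove %s\n' "$target"
    return 0
  fi

  # Destructive path: typed confirmation on top of --execute.
  local answer=""
  read -r -p "Type YES to remove $target: " answer || true
  if [[ "$answer" != "YES" ]]; then
    printf 'WARNING: confirmation not given, nothing changed\n'
    return 1
  fi
  rm -- "$target"
  printf 'OK: removed %s\n' "$target"
}
```

A real script would end with something like `cleanup_tool "$@"` and exit with its return code; the function form is used here only so the behavior can be exercised without running a destructive command by default.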
## Ansible Standards
- Keep playbooks short and roles simple.
- Prefer modules over `shell` or `command`.
- Use `shell` or `command` only when the module set cannot express the operation, and document why if risk is not obvious.
- Preserve check-mode and diff-mode friendliness where possible.
- Use handlers, tags, defaults, and validation tasks when they clarify operations.
- Keep inventory under `inventory/hosts.yml`, `group_vars/`, and `host_vars/`.
- Do not present selected hardening examples as complete compliance certification.
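As a rough illustration of check-mode and diff-mode review, the helper below only prints a review command and never runs it. The function name `ansible_review_cmd` and the playbook path in the usage note are hypothetical; `--check`, `--diff`, `-i`, and `--limit` are standard `ansible-playbook` options, and `inventory/hosts.yml` follows the layout named above.

```bash
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# Hypothetical helper: print (do not run) a review-mode command line.
# $1 = playbook path (required), $2 = optional --limit pattern.
# Returns 2 when the playbook argument is missing.
ansible_review_cmd() {
  if (($# < 1)); then
    printf 'CRITICAL: playbook path is required\n' >&2
    return 2
  fi
  local playbook="$1" limit="${2:-}"
  local -a cmd=(ansible-playbook --check --diff -i inventory/hosts.yml "$playbook")
  if [[ -n "$limit" ]]; then
    cmd+=(--limit "$limit")
  fi
  # Joined with spaces for display only; arguments are not shell-quoted.
  printf '%s\n' "${cmd[*]}"
}
```

For example, `ansible_review_cmd playbooks/example.yml lab` prints a command limited to a `lab` group, which an operator can inspect before running it from the Ansible directory.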
## Documentation Standards
- Explain what exists, what is planned, and what is intentionally not supported.
- Prefer runbook style: scope, pre-checks, execution guardrails, rollback thinking, post-checks, and evidence.
- Avoid marketing language, fake enterprise wording, and tutorial bloat.
- Update README files and `CHANGELOG.md` when adding meaningful behavior or structure.
## Safety Rules
- Do not run destructive commands.
- Do not rename large directories unless the benefit is clear and low-risk.
- Do not hide validation failures.
- Do not claim live production validation for sanitized examples.
- Do not add secrets, real hostnames, customer identifiers, or private infrastructure details.
- Do not turn placeholders into fake completed projects.
## PR and Review Expectations
- State the operational risk of the change.
- Include commands run and whether tools were missing.
- Review scripts for dry-run behavior, input validation, dependency handling, and rollback path.
- Review Ansible for idempotency, check-mode behavior, inventory targeting, tags, handlers, and module choice.
- Keep diffs focused.
## Definition of Done
- The change preserves the repository intent.
- Relevant docs are updated.
- Changed Bash scripts pass `bash -n`.
- Available validation helpers were run.
- Missing optional tools are reported.
- Any remaining risk or follow-up is documented.
## Do Not
- Do not add an "ultimate DevOps template" structure.
- Do not replace working simple Bash with unnecessary abstractions.
- Do not make examples appear production-certified.
- Do not add destructive behavior without `--execute`, confirmation, and clear rollback notes.
- Do not delete useful content unless it is clearly duplicate, broken, or misleading.
+12
@@ -4,6 +4,17 @@
 ### Added
+- Repository-level Codex guidance:
+  - `AGENTS.md`
+  - `docs/codex/README.md`
+  - `docs/codex/review-checklist.md`
+  - `docs/codex/task-template.md`
+  - `docs/codex/plans-template.md`
+- Lightweight validation helpers:
+  - `scripts/validate-repo.sh`
+  - `scripts/check-bash.sh`
+  - `scripts/check-ansible.sh`
+  - `scripts/check-docs.sh`
 - Cross-repository operational documentation structure:
   - `infra-run/docs/operations-cheatsheet.md`
   - `platform-projects/docs/platform-cheatsheet.md`
@@ -19,6 +30,7 @@
 ### Changed
+- Updated root, `infra-run`, Bash, Ansible, platform, and lab README guidance for safety-first usage, validation, and future Codex-driven work.
 - Updated repository and `infra-run` README files to surface the new documentation structure and operational cheatsheets.
 - Updated repository, `infra-run`, and Ansible README files to describe the new hardening automation instead of placeholder-only Ansible structure.
+50 -15
@@ -1,19 +1,41 @@
-# Portfolio
+# Linux/Unix Infrastructure Engineering Portfolio
-This repository contains sanitized infrastructure automation examples based on Linux operations and enterprise infrastructure workflows. The focus is on precheck, dry-run, controlled execution, postcheck, troubleshooting, and clear operational reporting.
+This repository contains sanitized infrastructure automation examples based on Linux/Unix operations and infrastructure workflows. The focus is on incident response, troubleshooting, pre-checks, dry-run behavior, controlled execution, post-checks, and readable operational evidence.
-It is a technical portfolio, not a production toolkit. The examples are meant to show how I structure operational work: understand the current state, make changes only with explicit controls, verify the result, and leave readable evidence for review.
+It is a technical portfolio, not a production toolkit. The examples show how operational work is structured: understand the current state, make changes only with explicit controls, verify the result, and leave enough evidence for review.
-## What Is Usable Now
+## What This Repo Is
-- [infra-run](./infra-run/) - the main project in this repository.
+- Practical Linux/Unix operations examples.
+- Safe Bash and Ansible patterns for lab and review.
+- Runbook-driven examples for incident response, storage operations, hardening, and observability.
+- A place for platform and lab topics to grow without pretending unfinished areas are complete.
+## What This Repo Is Not
+- It is not a compliance benchmark implementation.
+- It is not a drop-in change automation framework.
+- It is not proof that these exact scripts ran in any production environment.
+- It does not replace change review, peer review, backups, monitoring, or platform-specific runbooks.
+## Repository Layout
+- [infra-run](./infra-run/) - core operational tooling and automation.
+- [platform-projects](./platform-projects/) - larger platform topics and case-study areas.
+- [labs](./labs/) - experimental/lab environments and notes.
+- [docs/codex](./docs/codex/) - guidance for future Codex-driven changes.
+- [scripts](./scripts/) - lightweight repository validation helpers.
+## Usable Now
+- [infra-run](./infra-run/) - the main implemented project in this repository.
 - [Linux healthcheck scripts](./infra-run/scripts/bash/os-healthcheck/) - host, disk, service, network, and report helpers.
 - [Disk full workflow](./infra-run/scripts/bash/disk-full/) - triage scripts for usage, inode pressure, deleted open files, large files, log cleanup review, and postchecks.
 - [Veritas examples](./infra-run/scripts/bash/veritas/) - dry-run-first VxVM/VCS storage expansion workflow examples.
 - [GPFS examples](./infra-run/scripts/bash/gpfs/) - dry-run-first IBM Spectrum Scale expansion workflow examples.
 - [Ansible hardening examples](./infra-run/ansible/) - selected Linux and AIX baseline hardening tasks organized as lab-safe roles.
-## What Is Planned
+## Planned Areas
 The `labs` and `platform-projects` trees are intentionally thin. They are kept as planning areas for future lab notes and case studies, not as completed projects. Current planned topics are tracked in [ROADMAP.md](./ROADMAP.md).
@@ -31,28 +53,41 @@ The `labs` and `platform-projects` trees are intentionally thin. They are kept a
 - [labs/docs/lab-cheatsheet.md](./labs/docs/lab-cheatsheet.md) - quick-reference scratchpad for K3s, Proxmox, Terraform, Docker, networking, and short-lived lab troubleshooting work.
-## What This Repo Is Not
+### Codex and Review Guidance
-- It is not a compliance benchmark implementation.
+- [AGENTS.md](./AGENTS.md) - repository rules for automated and assisted changes.
-- It is not a drop-in change automation framework.
+- [docs/codex/README.md](./docs/codex/README.md) - Codex workflow and expected final response format.
-- It is not proof that these exact scripts ran in any production environment.
+- [docs/codex/review-checklist.md](./docs/codex/review-checklist.md) - safety, Bash, Ansible, docs, and validation review checklist.
-- It does not replace change review, peer review, backups, monitoring, or platform-specific runbooks.
+- [docs/codex/task-template.md](./docs/codex/task-template.md) - reusable scoped task templates.
+## Safety-First Usage
+Read scripts and playbooks before running them. Operational examples are sanitized and may need adaptation for a real system.
+- Prefer read-only commands first.
+- Use dry-run/check mode before execution.
+- Treat `--execute` as a change-control boundary.
+- Confirm backups, monitoring, application impact, and rollback steps before live use.
+- Do not run platform-specific storage commands without a matching Veritas, GPFS, or AIX lab.
 ## Validation
 Basic local validation:
 ```bash
-find infra-run/scripts/bash -name '*.sh' -print0 | xargs -0 shellcheck -x -P infra-run/scripts/bash/disk-full -P infra-run/scripts/bash/gpfs -P infra-run/scripts/bash/veritas
-yamllint .
-cd infra-run/ansible && ansible-lint playbooks roles
+./scripts/validate-repo.sh
+./scripts/check-bash.sh
+./scripts/check-ansible.sh
+./scripts/check-docs.sh
 ```
+The validation helpers run required lightweight checks and use optional tools such as `shellcheck`, `yamllint`, `ansible-playbook`, `ansible-lint`, and `markdownlint` when available. Set `STRICT=1` to fail when optional tools are missing.
 Some scripts depend on platform tools such as `vxdisk`, `hagrp`, `mmcrnsd`, and `mmlscluster`. Those commands are not expected to exist on a normal workstation, so functional testing against Veritas or GPFS requires a real lab environment.
 See [infra-run/TESTED.md](./infra-run/TESTED.md) and [infra-run/KNOWN_LIMITATIONS.md](./infra-run/KNOWN_LIMITATIONS.md) for the current validation status.
-## Skills Demonstrated
+## Operational Areas Demonstrated
 - Linux operations triage and reporting.
 - Disk pressure and deleted-file incident analysis.
+55
@@ -0,0 +1,55 @@
# Codex Workflow
This directory keeps future Codex sessions consistent when working in this infrastructure portfolio.
## How To Start
1. Read [AGENTS.md](../../AGENTS.md).
2. Inspect the affected tree and nearby README files.
3. Check `git status --short` so existing user work is preserved.
4. Decide whether a plan is needed before editing.
5. Make small, reviewable changes.
6. Run focused validation plus `./scripts/validate-repo.sh` when practical.
## When To Plan First
Plan before editing when a task touches more than one subsystem, changes operational behavior, adds or modifies destructive actions, changes Ansible targeting, or updates repository conventions.
For small typo fixes, narrow README updates, or obvious syntax fixes, inspect first and then make the change directly.
Use [plans-template.md](./plans-template.md) for larger changes.
## Scoped Tasks
Good tasks name the operational goal, affected directories, constraints, validation commands, and what "done" means. Use [task-template.md](./task-template.md) for reusable prompts.
Keep scope tied to real operations:
- Bash tool: discovery, pre-check, dry-run, execute, post-check, report.
- Ansible change: inventory target, role/playbook scope, check mode, idempotency, validation.
- Runbook: incident signal, triage, decision points, rollback, evidence.
- Lab/platform project: status, prerequisites, validation, limitations.
## Validation
Prefer the repository helpers:
```bash
./scripts/check-bash.sh
./scripts/check-ansible.sh
./scripts/check-docs.sh
./scripts/validate-repo.sh
```
If optional tools are missing, report that clearly and continue with available checks. Do not claim skipped checks passed.
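A minimal sketch of that reporting rule, assuming a wrapper that takes a tool name followed by its arguments. The name `run_optional` is hypothetical; the `STRICT` variable and the `OK`/`WARNING`/`CRITICAL` labels follow the conventions described in this repository.

```bash
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

STRICT="${STRICT:-0}"
warn_count=0
fail_count=0

# Hypothetical wrapper: run a check only when its tool is installed.
# A missing tool is reported as skipped (WARNING), or as CRITICAL when
# STRICT=1. A skipped check is never reported as OK.
run_optional() {
  local tool="$1"
  shift
  if ! command -v "$tool" >/dev/null 2>&1; then
    if [[ "$STRICT" == "1" ]]; then
      printf 'CRITICAL: %s not installed\n' "$tool"
      fail_count=$((fail_count + 1))
    else
      printf 'WARNING: %s not installed; check skipped, not passed\n' "$tool"
      warn_count=$((warn_count + 1))
    fi
    return 0
  fi
  if "$tool" "$@"; then
    printf 'OK: %s\n' "$tool"
  else
    printf 'CRITICAL: %s reported issues\n' "$tool"
    fail_count=$((fail_count + 1))
  fi
}
```

Because the counters are plain shell variables, the wrapper has to be called directly rather than inside `$(...)`, otherwise the counts are updated in a subshell and lost.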
## Final Response Format
End with:
1. Summary of what changed.
2. Files created or modified.
3. Validation commands run and results.
4. Skipped checks and why.
5. Risks or follow-ups.
6. Whether the repo is ready for future Codex-driven work.
+35
@@ -0,0 +1,35 @@
# Implementation Plan Template
Use this for changes that touch multiple files, alter operational behavior, or add new repository conventions.
## Goal
State the operational or maintenance outcome.
## Current State
Summarize the directories and conventions inspected.
## Scope
List files or directories expected to change.
## Non-Goals
Name what will not be redesigned, renamed, deleted, or claimed as complete.
## Plan
1. Inspect relevant scripts, playbooks, docs, and examples.
2. Make the smallest structural or documentation changes needed.
3. Update validation or runbook guidance.
4. Run focused checks.
5. Summarize residual risk and follow-ups.
## Validation
List commands to run, including fallback behavior for missing tools.
## Risks
Call out destructive operations, platform assumptions, missing lab environments, or checks that require real systems.
+52
@@ -0,0 +1,52 @@
# Review Checklist
Use this checklist for repository reviews and pull requests.
## Safety
- Destructive actions default to dry-run or read-only.
- Real changes require explicit `--execute` and operator confirmation.
- Inputs are validated before use.
- Paths, service names, disks, volumes, and inventory targets are constrained.
- Rollback or recovery thinking is documented where the operation can change state.
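The input-validation and path-constraint items can be illustrated with two small validators. This is a sketch under stated assumptions: the function names, the allowed character set for service names, and the idea of a single allowed base directory are all hypothetical, not conventions taken from this repository.

```bash
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# Hypothetical validator: accept only a conservative character set so the
# value can never carry shell metacharacters or whitespace.
valid_service_name() {
  local re='^[A-Za-z0-9_.@-]+$'
  [[ "$1" =~ $re ]]
}

# Hypothetical validator: accept a directory only when its resolved
# physical path is the allowed base or lies beneath it. Both arguments
# must be existing directories; a base of "/" is not handled.
path_under_base() {
  local base="$1" candidate="$2" resolved_base resolved
  resolved_base="$(cd "$base" 2>/dev/null && pwd -P)" || return 1
  resolved="$(cd "$candidate" 2>/dev/null && pwd -P)" || return 1
  [[ "$resolved" == "$resolved_base" || "$resolved" == "$resolved_base"/* ]]
}
```

Resolving with `pwd -P` before comparing means `..` segments and symlinks are evaluated first, so a path that merely starts with the base string but resolves elsewhere is rejected.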
## Bash
- Uses `#!/usr/bin/env bash`.
- Uses `set -o errexit`, `set -o nounset`, and `set -o pipefail`.
- Missing commands return a clear warning or invalid-input/dependency exit.
- Output uses `OK`, `WARNING`, and `CRITICAL` consistently.
- Exit codes follow repo convention: `0` OK, `1` operational issue, `2` invalid input or missing dependency.
- Help output exists for scripts that accept arguments.
## Ansible
- Target hosts are explicit and appropriate for the role.
- Modules are preferred over `shell` or `command`.
- Check mode and diff mode are considered.
- Tasks are idempotent or clearly documented when a check is inherently read-only or platform-specific.
- Handlers, tags, defaults, and validation tasks are used where useful.
- Inventory, vars, and role defaults do not contain secrets or real environment data.
## Documentation
- README files explain current state without overstating completeness.
- Runbooks include scope, pre-checks, execution controls, post-checks, and evidence.
- Docs avoid tutorial filler and fake enterprise complexity.
- Important limitations are linked or documented.
- `CHANGELOG.md` is updated for meaningful repo changes.
## Operational Realism
- The change reflects RHEL/Oracle Linux, Debian/Ubuntu, AIX, Veritas, GPFS, Zabbix, ELK, Docker, Kubernetes/K3s, Terraform, VMware, or Proxmox operations accurately.
- Examples remain sanitized.
- Placeholder projects are identified as placeholders.
- There is no unnecessary abstraction or invented complexity.
## Validation
- Changed Bash scripts pass `bash -n`.
- `shellcheck` was run if available, or its absence was reported.
- Ansible syntax/lint checks were run if available and relevant.
- YAML/Markdown sanity checks were run if available.
- Failures and skipped checks are visible in the final summary.
+276
@@ -0,0 +1,276 @@
# Task Templates
Copy the relevant section into a future Codex request and fill in the blanks.
## Operational Bash Tool
### Goal
Build or improve a Bash tool for:
### Context
Affected platform, incident, or operational workflow:
### Constraints
- Default to dry-run/read-only.
- Require `--execute` for changes.
- Use `OK`, `WARNING`, and `CRITICAL`.
- Exit `0` OK, `1` operational issue, `2` invalid input or missing dependency.
### Files/directories to inspect
- `infra-run/scripts/bash/`
- Relevant runbook or README:
### Implementation steps
1. Inspect neighboring scripts and shared helpers.
2. Add or adjust usage/help output.
3. Add discovery, pre-check, guarded change, post-check, and reporting sections where useful.
4. Update README or runbook notes.
### Validation commands
```bash
bash -n <script>
./scripts/check-bash.sh
```
### Done when
The tool is readable, safe by default, validates inputs, reports clearly, and has updated docs.
## Ansible Playbook/Role
### Goal
Add or improve Ansible automation for:
### Context
Target OS and inventory group:
### Constraints
- Preserve check-mode friendliness.
- Prefer modules over shell/command.
- Keep playbooks short.
- Keep role defaults sanitized.
### Files/directories to inspect
- `infra-run/ansible/README.md`
- `infra-run/ansible/inventory/`
- `infra-run/ansible/playbooks/`
- `infra-run/ansible/roles/`
### Implementation steps
1. Inspect existing role/playbook patterns.
2. Add defaults, tasks, handlers, and tags only where needed.
3. Add validation or post-check tasks for operational evidence.
4. Update role/playbook README.
### Validation commands
```bash
./scripts/check-ansible.sh
cd infra-run/ansible && ansible-playbook --syntax-check -i inventory/hosts.yml playbooks/<playbook>.yml
```
### Done when
The playbook targets the right hosts, is idempotent where practical, supports review with `--check --diff`, and docs explain limitations.
## Runbook
### Goal
Create or improve a runbook for:
### Context
Incident signal, platform, and affected service:
### Constraints
- Include pre-checks, decision points, rollback, post-checks, and evidence.
- Avoid pretending lab notes are production-certified.
### Files/directories to inspect
- `infra-run/runbooks/`
- `infra-run/docs/`
- Related scripts/examples:
### Implementation steps
1. Define scope and assumptions.
2. Add triage steps and command examples.
3. Add safe execution gates.
4. Add validation and handoff notes.
### Validation commands
```bash
./scripts/check-docs.sh
```
### Done when
An operator can follow the runbook without guessing the risk, inputs, or success criteria.
## Lab Scenario
### Goal
Add or improve a lab scenario for:
### Context
Technology and local environment:
### Constraints
- Mark lab-only behavior clearly.
- Keep prerequisites and cleanup explicit.
### Files/directories to inspect
- `labs/`
- `labs/docs/lab-cheatsheet.md`
### Implementation steps
1. Document prerequisites and topology.
2. Add setup, validation, failure injection if relevant, and cleanup.
3. Link related scripts or runbooks.
### Validation commands
```bash
./scripts/check-docs.sh
```
### Done when
The lab is reproducible enough to review and does not imply production readiness.
## Platform Project
### Goal
Add or improve a platform project for:
### Context
Monitoring, storage, clustering, virtualization, observability, or related topic:
### Constraints
- Keep status honest: planned, partial, lab-tested, or complete.
- Prefer operational notes over marketing language.
### Files/directories to inspect
- `platform-projects/`
- `platform-projects/docs/platform-cheatsheet.md`
### Implementation steps
1. Identify scope and current maturity.
2. Add design notes, operational workflows, and validation.
3. Link runbooks, examples, and known limitations.
### Validation commands
```bash
./scripts/check-docs.sh
```
### Done when
The project explains what exists, how to validate it, and what remains unproven.
## Documentation Cleanup
### Goal
Clean up documentation for:
### Context
Current confusion, duplication, or missing links:
### Constraints
- Preserve useful operational detail.
- Avoid tutorial-style filler.
### Files/directories to inspect
- Root `README.md`
- Section README files
- Related docs/runbooks:
### Implementation steps
1. Remove duplication where it hurts navigation.
2. Add links to canonical docs.
3. Make limitations explicit.
4. Update changelog if meaningful.
### Validation commands
```bash
./scripts/check-docs.sh
```
### Done when
Readers can find the right tool, runbook, or validation command quickly.
## Repository Review
### Goal
Review repository quality for:
### Context
Areas of concern:
### Constraints
- Findings first, ordered by severity.
- Include file/line references where possible.
- Do not rewrite unrelated content.
### Files/directories to inspect
- `AGENTS.md`
- `README.md`
- `infra-run/`
- `platform-projects/`
- `labs/`
- `scripts/`
### Implementation steps
1. Inspect structure and conventions.
2. Review safety, validation, docs, and maintainability.
3. Patch only low-risk issues if requested.
4. Report risks and follow-ups.
### Validation commands
```bash
./scripts/validate-repo.sh
git diff --stat
```
### Done when
The review identifies practical risks and leaves a clear next action list.
+11 -4
@@ -38,6 +38,14 @@ The goal is to show operational judgment, not to ship a universal automation pro
 - Disk-full read-only scripts can be run against local paths for basic behavior checks.
 - Ansible YAML and role structure can be linted locally.
+## Running Safely
+- Start with the relevant README or runbook before executing a script.
+- Prefer read-only discovery scripts before remediation scripts.
+- Use dry-run mode unless a script explicitly documents safe local behavior.
+- Only use `--execute` after reviewing inputs, affected systems, rollback options, and post-checks.
+- For Ansible, start with `--check --diff` against a lab inventory.
 ## Lab-Safe Examples
 - Veritas and GPFS scripts default to dry-run behavior where they plan destructive or platform-changing operations.
@@ -59,12 +67,10 @@ Short version:
 From the repository root:
 ```bash
-find infra-run/scripts/bash -name '*.sh' -print0 | xargs -0 shellcheck -x -P infra-run/scripts/bash/disk-full -P infra-run/scripts/bash/gpfs -P infra-run/scripts/bash/veritas
-yamllint .
-cd infra-run/ansible && ansible-lint playbooks roles
+./scripts/validate-repo.sh
 ```
-If `ansible-lint` reports collection-related issues, install the collections listed in [ansible/collections/requirements.yml](./ansible/collections/requirements.yml) and rerun it. Treat lint as a starting point; platform testing still requires actual target systems.
+Focused checks are available in `scripts/check-bash.sh`, `scripts/check-ansible.sh`, and `scripts/check-docs.sh`. If `ansible-lint` reports collection-related issues, install the collections listed in [ansible/collections/requirements.yml](./ansible/collections/requirements.yml) and rerun it. Treat lint as a starting point; platform testing still requires actual target systems.
 ## Supporting Notes
@@ -72,3 +78,4 @@ If `ansible-lint` reports collection-related issues, install the collections lis
 - [TESTED.md](./TESTED.md) lists what was checked locally and what was not.
 - [KNOWN_LIMITATIONS.md](./KNOWN_LIMITATIONS.md) documents technical limits and operational cautions.
 - [ROADMAP.md](./ROADMAP.md) tracks planned additions without presenting them as completed work.
+- [../AGENTS.md](../AGENTS.md) and [../docs/codex](../docs/codex/) document repository working rules and review expectations.
+2
@@ -34,3 +34,5 @@ flowchart TD
 - Roles are selected baseline examples intended for portfolio and lab use, not a drop-in compliance certification.
 - Defaults are sanitized and configurable through inventory or `--extra-vars`.
 - Run platform-specific playbooks against appropriate test hosts before adapting them to managed environments.
+- Prefer `--check --diff` for review runs before applying changes.
+- Validate from the repository root with `./scripts/check-ansible.sh`.
+1
@@ -21,3 +21,4 @@ flowchart TD
 - The repository currently emphasizes Bash because it maps directly to day-to-day Linux operations.
 - The structure leaves room for higher-level helpers without mixing concerns.
+- Bash tooling should remain safe by default, readable, and validated with `./scripts/check-bash.sh` from the repository root.
+8
@@ -39,6 +39,14 @@ cd infra-run/scripts/bash/os-healthcheck
 ./network_troubleshoot.sh google.com
 ```
+## Standards
+- Scripts use Bash and should keep `#!/usr/bin/env bash` plus strict mode.
+- Read-only checks should report missing tools without hiding the problem.
+- Change-capable scripts must default to dry-run behavior and require explicit `--execute`.
+- Output should use `OK`, `WARNING`, and `CRITICAL` where practical.
+- Validate changed scripts with `./scripts/check-bash.sh` from the repository root.
 ## Exit Codes
 `disk_check.sh`:
+12 -2
@@ -1,5 +1,15 @@
 # labs
-This directory is reserved for future lab work. The current focus of the repository is [infra-run](../infra-run/).
+This directory is reserved for experimental and lab-only infrastructure work. The current focus of the repository is [infra-run](../infra-run/).
-Planned lab topics are tracked in [ROADMAP.md](../ROADMAP.md). Subdirectories are placeholders only and should not be treated as completed projects.
+Current subdirectories are planning areas unless their own README documents a runnable scenario:
+- `kubernetes`
+- `terraform`
+- `networking`
+- `ci-cd`
+- `docker`
+Lab content should document prerequisites, topology, validation, cleanup, and what remains untested. Do not present lab behavior as production-ready.
+Planned lab topics are tracked in [ROADMAP.md](../ROADMAP.md). For Codex-driven changes, use [AGENTS.md](../AGENTS.md) and the templates under [docs/codex](../docs/codex/).
+12 -2
@@ -1,5 +1,15 @@
 # platform-projects
-This directory is reserved for future platform case studies. The current implemented project is [infra-run](../infra-run/).
+This directory is reserved for larger infrastructure platform topics and future case studies. The current implemented project is [infra-run](../infra-run/).
-Planned platform topics are tracked in [ROADMAP.md](../ROADMAP.md). Subdirectories are placeholders only and should not be treated as completed work.
+Current subdirectories are intentionally light and should be read as planning areas unless their own README says otherwise:
+- `monitoring-zabbix`
+- `elk-log-analysis`
+- `storage`
+- `clustering`
+- `virtualization`
+Planned platform topics are tracked in [ROADMAP.md](../ROADMAP.md). Keep future additions operational: scope, topology, validation, limitations, and runbook links should matter more than diagrams or buzzwords.
+For Codex-driven changes, use [AGENTS.md](../AGENTS.md) and the templates under [docs/codex](../docs/codex/).
+95
@@ -0,0 +1,95 @@
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail
STRICT="${STRICT:-0}"
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
ANSIBLE_DIR="$ROOT_DIR/infra-run/ansible"
ok_count=0
warn_count=0
fail_count=0
ok() {
printf 'OK: %s\n' "$*"
ok_count=$((ok_count + 1))
}
warning() {
printf 'WARNING: %s\n' "$*"
warn_count=$((warn_count + 1))
}
critical() {
printf 'CRITICAL: %s\n' "$*"
fail_count=$((fail_count + 1))
}
if [[ ! -d "$ANSIBLE_DIR" ]]; then
warning "No infra-run/ansible directory found"
printf '\nAnsible summary: %d OK, %d WARNING, %d CRITICAL\n' "$ok_count" "$warn_count" "$fail_count"
exit 0
fi
mapfile -t yaml_files < <(find "$ANSIBLE_DIR" -type f \( -name '*.yml' -o -name '*.yaml' \) -print | sort)
if ((${#yaml_files[@]} == 0)); then
warning "No Ansible YAML files found"
else
ok "Found ${#yaml_files[@]} Ansible YAML files"
fi
if command -v ansible-playbook >/dev/null 2>&1; then
  # Syntax-check each playbook from inside the Ansible tree so the relative
  # inventory path resolves.
  while IFS= read -r playbook; do
    [[ -n "$playbook" ]] || continue
    playbook_rel="${playbook#"$ANSIBLE_DIR"/}"
    # stdin is redirected so the command cannot consume the playbook list
    # that feeds this loop.
    if (cd "$ANSIBLE_DIR" && ansible-playbook --syntax-check -i inventory/hosts.yml "$playbook_rel" </dev/null); then
      ok "ansible syntax $playbook_rel"
    else
      critical "ansible syntax failed $playbook_rel"
    fi
  done < <(find "$ANSIBLE_DIR/playbooks" -type f \( -name '*.yml' -o -name '*.yaml' \) -print | sort)
else
  if [[ "$STRICT" == "1" ]]; then
    critical "ansible-playbook not installed"
  else
    warning "ansible-playbook not installed; skipped syntax checks"
  fi
fi
if command -v ansible-lint >/dev/null 2>&1; then
  if (cd "$ANSIBLE_DIR" && ansible-lint playbooks roles); then
    ok "ansible-lint"
  else
    critical "ansible-lint reported issues"
  fi
else
  if [[ "$STRICT" == "1" ]]; then
    critical "ansible-lint not installed"
  else
    warning "ansible-lint not installed; skipped optional lint"
  fi
fi

if command -v yamllint >/dev/null 2>&1; then
  if yamllint "$ANSIBLE_DIR"; then
    ok "yamllint infra-run/ansible"
  else
    critical "yamllint reported issues in infra-run/ansible"
  fi
else
  if [[ "$STRICT" == "1" ]]; then
    critical "yamllint not installed"
  else
    warning "yamllint not installed; skipped optional YAML lint"
  fi
fi
printf '\nAnsible summary: %d OK, %d WARNING, %d CRITICAL\n' "$ok_count" "$warn_count" "$fail_count"

if ((fail_count > 0)); then
  exit 1
fi

exit 0
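
The check scripts in this commit each repeat the same `STRICT` branch whenever an optional tool is missing. A minimal sketch of how that branch could be folded into one helper is shown here. `missing_tool` is a hypothetical name that does not exist in the repository, and the message wording is simplified for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: missing_tool is a hypothetical helper, not part of the commit.
set -o nounset

STRICT="${STRICT:-0}"
warn_count=0
fail_count=0

# Report a missing optional tool: a failure when STRICT=1, a warning otherwise.
missing_tool() {
  local tool="$1"
  if [ "$STRICT" = "1" ]; then
    printf 'CRITICAL: %s not installed\n' "$tool"
    fail_count=$((fail_count + 1))
  else
    printf 'WARNING: %s not installed; skipped\n' "$tool"
    warn_count=$((warn_count + 1))
  fi
}

missing_tool yamllint
```

Running the sketch with `STRICT=1` set in the environment counts the same call as a failure instead of a warning.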
+76
@@ -0,0 +1,76 @@
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail
STRICT="${STRICT:-0}"
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
ok_count=0
warn_count=0
fail_count=0
ok() {
  printf 'OK: %s\n' "$*"
  ok_count=$((ok_count + 1))
}

warning() {
  printf 'WARNING: %s\n' "$*"
  warn_count=$((warn_count + 1))
}

critical() {
  printf 'CRITICAL: %s\n' "$*"
  fail_count=$((fail_count + 1))
}
mapfile -t bash_files < <(find "$ROOT_DIR" -path "$ROOT_DIR/.git" -prune -o -type f -name '*.sh' -print | sort)

if ((${#bash_files[@]} == 0)); then
  warning "No Bash scripts found"
else
  for file in "${bash_files[@]}"; do
    if bash -n "$file"; then
      ok "bash -n ${file#"$ROOT_DIR"/}"
    else
      critical "bash syntax failed: ${file#"$ROOT_DIR"/}"
    fi

    first_line="$(sed -n '1p' "$file")"
    if [[ "$first_line" != '#!/usr/bin/env bash' ]]; then
      warning "Non-standard shebang in ${file#"$ROOT_DIR"/}"
    fi

    # 'set -e' also matches 'set -eu' and 'set -euo pipefail'.
    if ! grep -Eq 'set -o errexit|set -e' "$file"; then
      warning "No errexit-style strict mode detected in ${file#"$ROOT_DIR"/}"
    fi
  done
fi
if command -v shellcheck >/dev/null 2>&1; then
  if shellcheck -x \
    -e SC1091 \
    -P "$ROOT_DIR/infra-run/scripts/bash/disk-full" \
    -P "$ROOT_DIR/infra-run/scripts/bash/gpfs" \
    -P "$ROOT_DIR/infra-run/scripts/bash/veritas" \
    "${bash_files[@]}"; then
    ok "shellcheck"
  else
    critical "shellcheck reported issues"
  fi
else
  if [[ "$STRICT" == "1" ]]; then
    critical "shellcheck not installed"
  else
    warning "shellcheck not installed; skipped optional lint"
  fi
fi
printf '\nBash summary: %d OK, %d WARNING, %d CRITICAL\n' "$ok_count" "$warn_count" "$fail_count"

if ((fail_count > 0)); then
  exit 1
fi

exit 0
+88
@@ -0,0 +1,88 @@
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail
STRICT="${STRICT:-0}"
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
ok_count=0
warn_count=0
fail_count=0
ok() {
  printf 'OK: %s\n' "$*"
  ok_count=$((ok_count + 1))
}

warning() {
  printf 'WARNING: %s\n' "$*"
  warn_count=$((warn_count + 1))
}

critical() {
  printf 'CRITICAL: %s\n' "$*"
  fail_count=$((fail_count + 1))
}
mapfile -t markdown_files < <(find "$ROOT_DIR" -path "$ROOT_DIR/.git" -prune -o -type f -name '*.md' -print | sort)

if ((${#markdown_files[@]} == 0)); then
  warning "No Markdown files found"
else
  ok "Found ${#markdown_files[@]} Markdown files"
fi
missing_links=0

# Each input line is "<markdown file>:<link target>", produced by the
# process substitution at the end of the loop.
while IFS= read -r link; do
  [[ -n "$link" ]] || continue
  file="${link%%:*}"
  target="${link#*:}"
  # Skip external URLs, mail links, and same-page anchors.
  [[ "$target" == http://* || "$target" == https://* || "$target" == mailto:* || "$target" == \#* ]] && continue
  # Drop any "#anchor" suffix before testing the path.
  target="${target%%#*}"
  [[ -n "$target" ]] || continue
  base_dir="$(dirname "$file")"
  if [[ ! -e "$base_dir/$target" ]]; then
    critical "Broken local Markdown link in ${file#"$ROOT_DIR"/}: $target"
    missing_links=$((missing_links + 1))
  fi
done < <(
  for file in "${markdown_files[@]}"; do
    # grep exits non-zero for a file without links; "|| true" keeps pipefail quiet.
    grep -Eo '\[[^]]+\]\([^)]+\)' "$file" \
      | sed -E 's/.*\]\(([^)]+)\).*/'"${file//\//\\/}"':\1/' || true
  done
)

if ((missing_links == 0)); then
  ok "No obvious broken local Markdown links"
fi
if command -v markdownlint >/dev/null 2>&1; then
  if markdownlint "${markdown_files[@]}"; then
    ok "markdownlint"
  else
    critical "markdownlint reported issues"
  fi
elif command -v markdownlint-cli2 >/dev/null 2>&1; then
  if markdownlint-cli2 "${markdown_files[@]}"; then
    ok "markdownlint-cli2"
  else
    critical "markdownlint-cli2 reported issues"
  fi
else
  if [[ "$STRICT" == "1" ]]; then
    critical "markdownlint not installed"
  else
    warning "markdownlint not installed; skipped optional Markdown lint"
  fi
fi
printf '\nDocs summary: %d OK, %d WARNING, %d CRITICAL\n' "$ok_count" "$warn_count" "$fail_count"

if ((fail_count > 0)); then
  exit 1
fi

exit 0
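
The link check in the docs script above decides per target whether there is a local path to test. The sketch below isolates that decision so it can be tried on sample targets. `local_link_target` is a hypothetical helper introduced for illustration, not a function in the script, and the sample targets are made up.

```bash
#!/usr/bin/env bash
# Sketch only: local_link_target is a hypothetical helper, not part of the commit.
set -o nounset

# Print the filesystem part of a Markdown link target, or nothing when the
# target is external (http, https, mailto) or a same-page anchor.
local_link_target() {
  local target="$1"
  case "$target" in
    http://*|https://*|mailto:*|'#'*) return 0 ;;
  esac
  # Strip any "#anchor" suffix so only the path remains.
  printf '%s\n' "${target%%#*}"
}

local_link_target '../ROADMAP.md#storage'   # prints ../ROADMAP.md
local_link_target 'https://example.com/a'   # prints nothing
local_link_target '#validation'             # prints nothing
```

The script then resolves the remaining path against the directory of the Markdown file that contains the link, so relative targets such as `../ROADMAP.md` are tested from the right place.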
+34
@@ -0,0 +1,34 @@
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
status=0
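# Run one check script. The command runs as an "if" condition, so a failure
# is recorded in status instead of tripping errexit, and later checks still run.
run_check() {
  local name="$1"
  shift
  printf '\n== %s ==\n' "$name"
  if "$@"; then
    printf 'OK: %s completed\n' "$name"
  else
    printf 'CRITICAL: %s failed\n' "$name"
    status=1
  fi
}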
run_check "Bash" "$ROOT_DIR/scripts/check-bash.sh"
run_check "Ansible" "$ROOT_DIR/scripts/check-ansible.sh"
run_check "Docs" "$ROOT_DIR/scripts/check-docs.sh"
printf '\n== Repository summary ==\n'

if ((status == 0)); then
  printf 'OK: repository validation completed with no critical failures\n'
else
  printf 'CRITICAL: one or more validation checks failed\n'
fi

exit "$status"
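
The wrapper above relies on one Bash behavior: with `errexit` set, a command that fails as an `if` condition does not abort the script. The sketch below reproduces that pattern with two stand-in checks so the effect can be observed in isolation. `passing_check` and `failing_check` are hypothetical placeholders for the real check scripts.

```bash
#!/usr/bin/env bash
# Sketch only: the two check functions are hypothetical stand-ins.
set -o errexit
set -o nounset

status=0

# Same shape as the wrapper in the script: record failure, keep going.
run_check() {
  local name="$1"
  shift
  if "$@"; then
    printf 'OK: %s completed\n' "$name"
  else
    printf 'CRITICAL: %s failed\n' "$name"
    status=1
  fi
}

passing_check() { return 0; }
failing_check() { return 1; }

run_check "first" failing_check    # prints CRITICAL: first failed
run_check "second" passing_check   # still runs; prints OK: second completed
printf 'final status: %d\n' "$status"   # prints final status: 1
```

Calling `failing_check` directly, outside the `if`, would end the sketch at that line because of `errexit`, which is why the wrapper matters when every check should report.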