portfolio/infra-run/README.md

# infra-run

`infra-run` is a sanitized infrastructure operations project. It contains Bash, Ansible, Python, and documentation examples based on Linux administration, incident response, storage operations, hardening, prechecks, postchecks, and controlled change workflows.

The goal is to show operational judgment, not to ship a universal automation product.

## Current Contents

### Bash Operational Scripts

- [scripts/bash/os-healthcheck](./scripts/bash/os-healthcheck/) - general Linux health, service, disk, network, and report scripts.
- [scripts/bash/incident-checks](./scripts/bash/incident-checks/) - standalone read-only incident checks for CPU, memory/OOM, SSH failures, TLS expiry, DNS, NTP, filesystems, inodes, services, and JVM diagnostics.
- [scripts/bash/disk-full](./scripts/bash/disk-full/) - disk-full triage and cleanup review workflow.
- [scripts/bash/veritas](./scripts/bash/veritas/) - Veritas VxVM/VCS storage expansion workflow examples.
- [scripts/bash/gpfs](./scripts/bash/gpfs/) - GPFS / IBM Spectrum Scale expansion workflow examples.

### Python Log And Reporting Tools

- [scripts/python](./scripts/python/) - read-only Python operational helpers using the standard library only.
- [scripts/python/incident-log-summary](./scripts/python/incident-log-summary/) - read-only Python log summary helper for incident pattern review.
- [scripts/python/log-diff-checker](./scripts/python/log-diff-checker/) - read-only Python before/after log comparison helper for change review.
- [scripts/python/auth-log-audit](./scripts/python/auth-log-audit/) - read-only Python authentication log audit helper for SSH, sudo, su, and PAM review.
- [scripts/python/jvm-log-analyzer](./scripts/python/jvm-log-analyzer/) - read-only Python JVM and Java application log analyzer for exception, stack trace, HTTP 5xx, database, and TLS review.
- [scripts/python/journal-analyzer](./scripts/python/journal-analyzer/) - read-only Python exported journal analyzer for failed units, restart patterns, OOM events, and service warnings.
- [scripts/python/known-error-matcher](./scripts/python/known-error-matcher/) - read-only Python matcher for local logs and JSON known-error catalogs with runbook references.

### Ansible Automation

- [ansible](./ansible/) - selected baseline hardening examples for RHEL-like Linux, Debian/Ubuntu, and AIX.

### Runbooks And Documentation

- [examples](./examples/) - sanitized sample command outputs and incident notes.

## Documentation

- [docs/operations-cheatsheet.md](./docs/operations-cheatsheet.md) - production operations quick reference covering Linux/Unix triage, text processing, incident workflows, networking, storage, AIX, SSL/TLS, automation safety, Ansible execution, observability, and operational habits.

## What This Is

- A portfolio project for Linux and infrastructure operations roles.
- A set of readable examples showing precheck, dry-run, execution guardrails, postcheck, and reporting patterns.
- A place to demonstrate Bash, Ansible, storage workflow, and troubleshooting habits with sanitized inputs.

## What This Is Not

- Not intended for direct live use.
- Not a complete CIS benchmark implementation.
- Not a replacement for site-specific change procedures.
- Not tested against live Veritas, GPFS, or AIX systems in this repository.
- Not safe to run blindly on servers without review.

## Currently Usable

- Bash syntax can be checked locally.
- Shell scripts can be reviewed and partially exercised on a Linux workstation when platform commands are available or mocked.
- Disk-full read-only scripts can be run against local paths for basic behavior checks.
- Python log analysis examples can be run against sanitized sample logs under each tool directory.
- Ansible YAML and role structure can be linted locally.

## Running Safely

- Start with the relevant README or runbook before executing a script.
- Prefer read-only discovery scripts before remediation scripts.
- Use dry-run mode unless a script explicitly documents safe local behavior.
- Only use `--execute` after reviewing inputs, affected systems, rollback options, and post-checks.
- For Ansible, start with `--check --diff` against a lab inventory.

## Lab-Safe Examples

- Veritas and GPFS scripts default to dry-run behavior where they plan destructive or platform-changing operations.
- Ansible hardening roles are examples of selected controls and need adaptation before use.
- Sample outputs under [examples](./examples/) are fake and sanitized.

## Tested

See [TESTED.md](./TESTED.md) for current validation status.

Short version:

- Shell scripts were reviewed for dry-run behavior and obvious quoting issues.
- YAML and Ansible files are intended for local linting.
- Veritas, GPFS, and AIX behavior was not validated against real systems here.

## Basic Validation

From the repository root:

```bash
./scripts/validate-repo.sh
```

Focused checks are available in `scripts/check-bash.sh`, `scripts/check-ansible.sh`, `scripts/check-python.sh`, and `scripts/check-docs.sh`. If `ansible-lint` reports collection-related issues, install the collections listed in [ansible/collections/requirements.yml](./ansible/collections/requirements.yml) and rerun it. Treat lint as a starting point; platform testing still requires actual target systems.

## Supporting Notes

- [SOURCE.md](./SOURCE.md) explains why this project exists and what experience shaped it.
- [TESTED.md](./TESTED.md) lists what was checked locally and what was not.
- [KNOWN_LIMITATIONS.md](./KNOWN_LIMITATIONS.md) documents technical limits and operational cautions.
- [ROADMAP.md](./ROADMAP.md) tracks planned additions without presenting them as completed work.
- [../AGENTS.md](../AGENTS.md) and [../docs/codex](../docs/codex/) document repository working rules and review expectations.