@@ -7,13 +7,15 @@ Small, practical Bash scripts for Linux operations checks and incident triage. T
 ```mermaid
 flowchart TD
 A["bash"] --> B["os-healthcheck"]
-A --> C["disk-full"]
-A --> D["veritas"]
-A --> E["gpfs"]
+A --> C["incident-checks"]
+A --> D["disk-full"]
+A --> E["veritas"]
+A --> F["gpfs"]
 B --> B1["Host diagnostics"]
-C --> C1["Incident workflow"]
-D --> D1["VxVM and VCS change flow"]
-E --> E1["Spectrum Scale expansion flow"]
+C --> C1["Standalone triage checks"]
+D --> D1["Incident workflow"]
+E --> E1["VxVM and VCS change flow"]
+F --> F1["Spectrum Scale expansion flow"]
 ```
 
 ## Scripts
@@ -23,6 +25,7 @@ flowchart TD
 - `os-healthcheck/service_check.sh` - critical service status check.
 - `os-healthcheck/system_report.sh` - writes a timestamped system report to `/tmp`.
 - `os-healthcheck/network_troubleshoot.sh` - local and optional remote network diagnostics.
+- `incident-checks/` - standalone read-only incident checks for CPU, memory/OOM, services, SSH failures, TLS certificates, DNS, NTP, filesystems, inodes, and JVM diagnostics.
 
 ## Usage
 
@@ -37,6 +40,12 @@ cd infra-run/scripts/bash/os-healthcheck
 ./system_report.sh
 ./network_troubleshoot.sh
 ./network_troubleshoot.sh google.com
+
+cd ../incident-checks
+./check_high_cpu.sh
+./check_high_memory_oom.sh --since "24 hours ago"
+./check_service_restart_loop.sh --service sshd
+./check_certificate_expiry.sh --host example.com
 ```
 
 ## Standards