portfolio/infra-run/scripts/python/journal-analyzer/examples/sample-journal-report.md
2026-05-11 17:06:05 +00:00
# Journal Analyzer Report
- Overall status: `CRITICAL`
- Journal findings require review; logs alone do not prove root cause.
## Finding Groups
### [CRITICAL] backup-agent - tls_certificate
- Pattern: `certificate expired`
- Occurrences: `1`
- Unit: `UNKNOWN`
- Process: `backup-agent`
- PID: `777`
- First seen: `2026-05-11 10:18:10`
- Last seen: `2026-05-11 10:18:10`
- Samples:
  - `2026-05-11 10:18:10 web01 backup-agent[777]: TLS handshake failed for backup endpoint: certificate expired on peer connection`
### [CRITICAL] backup-agent - tls_certificate
- Pattern: `TLS handshake failed`
- Occurrences: `1`
- Unit: `UNKNOWN`
- Process: `backup-agent`
- PID: `777`
- First seen: `2026-05-11 10:18:10`
- Last seen: `2026-05-11 10:18:10`
- Samples:
  - `2026-05-11 10:18:10 web01 backup-agent[777]: TLS handshake failed for backup endpoint: certificate expired on peer connection`
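The `Process` and `PID` fields in each group appear to come from the `process[pid]:` prefix of the sample line. A minimal sketch of how such a line could be split into fields follows; `LINE_RE` and `parse_line` are hypothetical names chosen for illustration, not the analyzer's actual code, and the `UNKNOWN` fallback mirrors what the report prints when no PID is present.

```python
import re

# Hypothetical pattern for lines shaped like the samples in this report:
# "<timestamp> <host> <process>[<pid>]: <message>", where "[<pid>]" is optional.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
    r"|[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<process>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not fit the pattern."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Assumption: a missing PID is reported as the literal string UNKNOWN.
    fields["pid"] = fields["pid"] or "UNKNOWN"
    return fields
```

Kernel lines carry no `[pid]` suffix, which would explain why their groups show `PID: UNKNOWN`.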
### [CRITICAL] dockerd - disk_filesystem
- Pattern: `no space left on device`
- Occurrences: `1`
- Unit: `UNKNOWN`
- Process: `dockerd`
- PID: `1347`
- First seen: `2026-05-11 10:17:33`
- Last seen: `2026-05-11 10:17:33`
- Samples:
  - `2026-05-11 10:17:33 web01 dockerd[1347]: Error response from daemon: write /var/lib/docker/tmp/GetImageBlob123456: no space left on device`
### [CRITICAL] java - oom
- Pattern: `Out of memory`
- Occurrences: `1`
- Unit: `UNKNOWN`
- Process: `java`
- PID: `UNKNOWN`
- First seen: `2026-05-11 10:17:02`
- Last seen: `2026-05-11 10:17:02`
- Samples:
  - `2026-05-11 10:17:02 web01 kernel: Out of memory: Killed process 4421 (java) total-vm:2048000kB, anon-rss:1024000kB, file-rss:1024kB, shmem-rss:0kB`
### [CRITICAL] java - oom
- Pattern: `killed process`
- Occurrences: `1`
- Unit: `UNKNOWN`
- Process: `java`
- PID: `UNKNOWN`
- First seen: `2026-05-11 10:17:02`
- Last seen: `2026-05-11 10:17:02`
- Samples:
  - `2026-05-11 10:17:02 web01 kernel: Out of memory: Killed process 4421 (java) total-vm:2048000kB, anon-rss:1024000kB, file-rss:1024kB, shmem-rss:0kB`
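The two `java - oom` groups share one sample line, which suggests that a single journal line can match several patterns and produce one finding per pattern. The sketch below illustrates that behavior under stated assumptions: `RULES`, `VICTIM_RE`, and `match_findings` are hypothetical names, matching is assumed to be a case-insensitive substring test, and attributing the finding to the killed process rather than to `kernel` is inferred from the group headings, not confirmed by the tool's source.

```python
import re

# Hypothetical rule table of (category, pattern) pairs.
# Only OOM patterns that appear in this report are listed.
RULES = [
    ("oom", "Out of memory"),
    ("oom", "killed process"),
    ("oom", "invoked oom-killer"),
]

# Kernel OOM lines name the victim as "Killed process <pid> (<name>)".
VICTIM_RE = re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)")


def match_findings(message, default_service):
    """Return one (service, category, pattern) tuple per matching rule."""
    lowered = message.lower()
    victim = VICTIM_RE.search(message)
    # Assumption: the finding is attributed to the OOM victim when one is named.
    service = victim.group("name") if victim else default_service
    return [
        (service, category, pattern)
        for category, pattern in RULES
        if pattern.lower() in lowered
    ]
```

Because both findings come from the same event, the two groups probably describe one OOM kill rather than two.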
### [CRITICAL] kernel - disk_filesystem
- Pattern: `read-only file system`
- Occurrences: `1`
- Unit: `UNKNOWN`
- Process: `kernel`
- PID: `UNKNOWN`
- First seen: `2026-05-11 10:17:54`
- Last seen: `2026-05-11 10:17:54`
- Samples:
  - `2026-05-11 10:17:54 web01 kernel: EXT4-fs error (device sda2): Remounting read-only file system`
### [CRITICAL] kernel - oom
- Pattern: `invoked oom-killer`
- Occurrences: `1`
- Unit: `UNKNOWN`
- Process: `kernel`
- PID: `UNKNOWN`
- First seen: `2026-05-11 10:17:01`
- Last seen: `2026-05-11 10:17:01`
- Samples:
  - `2026-05-11 10:17:01 web01 kernel: invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0`
### [CRITICAL] nginx.service - dependency_failure
- Pattern: `dependency failed`
- Occurrences: `1`
- Unit: `nginx.service`
- Process: `systemd`
- PID: `1`
- First seen: `May 11 10:16:08`
- Last seen: `May 11 10:16:08`
- Samples:
  - `May 11 10:16:08 web01 systemd[1]: Dependency failed for nginx.service.`
### [CRITICAL] nginx.service - failed_unit
- Pattern: `failed to start`
- Occurrences: `1`
- Unit: `nginx.service`
- Process: `systemd`
- PID: `1`
- First seen: `May 11 10:16:11`
- Last seen: `May 11 10:16:11`
- Samples:
  - `May 11 10:16:11 web01 systemd[1]: Failed to start nginx.service - A high performance web server and a reverse proxy server.`
### [CRITICAL] nginx.service - failed_unit
- Pattern: `entered failed state`
- Occurrences: `1`
- Unit: `nginx.service`
- Process: `systemd`
- PID: `1`
- First seen: `May 11 10:16:12`
- Last seen: `May 11 10:16:12`
- Samples:
  - `May 11 10:16:12 web01 systemd[1]: nginx.service: Unit entered failed state.`
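The groups above mix two timestamp styles: ISO-like (`2026-05-11 10:17:02`) and classic syslog (`May 11 10:16:08`), which carries no year. To compare `First seen` and `Last seen` values across groups, both styles would need to be normalized. A minimal sketch follows; `parse_timestamp` and its `assumed_year` parameter are hypothetical, and the year must be supplied by the caller because the syslog style does not record one.

```python
from datetime import datetime


def parse_timestamp(text, assumed_year):
    """Parse either timestamp style seen in this report into a datetime."""
    try:
        # ISO-like style, for example "2026-05-11 10:17:02".
        return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        pass
    # Syslog style, for example "May 11 10:16:08". The year is an assumption
    # passed in by the caller; %b expects English month names in the C locale.
    return datetime.strptime(f"{assumed_year} {text}", "%Y %b %d %H:%M:%S")
```

With `assumed_year=2026`, the `nginx.service` findings (10:16:08 to 10:16:12) sort before the first OOM finding at 10:17:01, so the ordering of events in this sample can be read directly from the normalized values.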
## Operational Summary
- Overall status: `CRITICAL`
- Total lines scanned: `17`
- Total findings: `18`
- Critical finding groups: `11`
- Warning finding groups: `7`
- Affected services/units count: `9`
- Top affected services/units: nginx.service (5), sshd.service (3), kernel (2), java (2), backup-agent (2), sshd (1), dockerd (1), NetworkManager (1), systemd (1)
- Top finding categories: restart (3), oom (3), failed_unit (2), disk_filesystem (2), tls_certificate (2), authentication (1), timeout (1), dependency_failure (1), generic_failure (1), network (1)
- Failed unit findings: nginx.service (3)
- Restart findings: `3`
- OOM findings: `3`
- Filesystem/disk findings: `2`
- Timestamp coverage: parsed=`17`, unknown=`0`
- Filters used: service=`None`, severity=`None`, since=`None`, until=`None`
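The `Top affected services/units` and `Top finding categories` lines read like tallies sorted by count, and the overall status matches the most severe finding present. A sketch of how such summary lines could be produced is shown below; `format_top`, `overall_status`, and the `SEVERITY_RANK` table (including the `OK` level) are assumptions for illustration, not the analyzer's actual implementation.

```python
from collections import Counter

# Hypothetical severity ordering; this report only shows CRITICAL and warning groups.
SEVERITY_RANK = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def format_top(counter):
    """Render a Counter as "name (count), ..." sorted by descending count."""
    return ", ".join(f"{name} ({count})" for name, count in counter.most_common())


def overall_status(severities):
    """Return the most severe level present, or OK when there are no findings."""
    return max(severities, key=SEVERITY_RANK.__getitem__, default="OK")


# Usage with a subset of the category counts listed in the summary above.
categories = Counter(["restart"] * 3 + ["oom"] * 3 + ["failed_unit"] * 2)
summary_line = format_top(categories)
```

`Counter.most_common()` keeps first-seen order for equal counts, so ties such as `restart (3), oom (3)` stay in the order the categories were first encountered.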