# journal-analyzer

`journal-analyzer` is a read-only Python CLI for reviewing exported `journalctl` text logs. It summarizes systemd, service, and system-level journal findings that require operator review during Linux incident response, post-patching validation, restart troubleshooting, and change evidence collection.

The tool analyzes exported journal text only. It does not call `journalctl` directly, does not modify host state, and does not claim root cause.

## Purpose

- Summarize which units failed and which services appear repeatedly affected.
- Surface dependency failures, restart loops, timeout patterns, OOM symptoms, disk/filesystem errors, TLS/certificate issues, authentication events, and network-related warnings.
- Produce predictable text, Markdown, or JSON output that can be attached to an incident or change ticket.

## When To Use

- After exporting a scoped `journalctl` window during incident response.
- After package patching or service restarts when failed units or degraded services need review.
- During Linux service troubleshooting when repeated restart or dependency messages need a quick grouped summary.
- Before attaching journal evidence to an incident, problem, or change record.

## What It Does Not Do

- It does not call `journalctl` directly in v1.
- It does not modify the input log, systemd state, service state, or host configuration.
- It does not read remote systems or live journal streams.
- It does not query SIEM, ELK, Zabbix, APM, or ticketing systems.
- It does not prove root cause or a service defect.
- It does not classify every vendor-specific journal message.

## Supported Input Type

- One exported local `journalctl` text file supplied with `--file`.
- UTF-8 input is expected. Invalid byte sequences are replaced during read so review can continue.
- Empty, missing, unreadable, or non-file paths are rejected with exit code `2`.
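
The input rules above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function name `load_journal_text` and the `InputError` exception are hypothetical, and treating a whitespace-only file as empty is an assumption.

```python
from pathlib import Path


class InputError(Exception):
    """Hypothetical error type; a CLI wrapper could map it to exit code 2."""


def load_journal_text(path_str: str) -> list[str]:
    """Read an exported journal file, replacing invalid UTF-8 bytes."""
    path = Path(path_str)
    # Missing paths and non-file paths (for example directories) are rejected.
    if not path.is_file():
        raise InputError(f"not a readable file: {path}")
    try:
        # errors="replace" turns invalid byte sequences into U+FFFD
        # so the rest of the export can still be reviewed.
        text = path.read_text(encoding="utf-8", errors="replace")
    except OSError as exc:
        raise InputError(f"cannot read {path}: {exc}") from exc
    # Assumption: a file containing only whitespace counts as empty.
    if not text.strip():
        raise InputError(f"empty input: {path}")
    return text.splitlines()
```

A caller would catch `InputError`, print the message to stderr, and exit with code `2`, which matches the documented behavior for rejected input.
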

Example export commands:

```bash
journalctl --since "1 hour ago" > journal.log
journalctl -u nginx --since today > nginx-journal.log
journalctl -p warning..alert --since "24 hours ago" > warnings.log
journalctl --no-pager --since "2026-05-11 10:00:00" > journal.log
```

## Supported Event Categories

Critical-oriented categories:

- Failed unit or failed start findings.
- Dependency failures.
- Kernel panic and panic findings.
- OOM killer and killed process findings.
- Disk and filesystem issues such as `no space left on device`, read-only filesystem, filesystem errors, and I/O errors.
- Service or application crash patterns such as `segfault`.
- TLS and certificate failures.
- Emergency mode findings.

Warning-oriented categories:

- Restart and repeated start request findings.
- Timeout and timed out findings.
- Connection refused and connection reset findings.
- Permission denied and denied findings.
- Authentication failure findings.
- Availability, degraded, failed, and warning findings that still require review.

The matching is practical and pattern-based. Default matching is already case-tolerant for common operational wording, and `--ignore-case` is available for explicit filter runs and predictable operator intent. The tool is intended for first-pass operational review, not for proving causality.

## Timestamp Support

The analyzer attempts to parse common journal and syslog timestamp formats:

- `May 11 10:15:30`
- `2026-05-11 10:15:30`
- `2026-05-11T10:15:30`
- `2026-05-11 10:15:30.123456`
- `2026-05-11 10:15:30,123`

If a timestamp cannot be parsed:

- the line is still analyzed
- first seen / last seen remain `UNKNOWN` where needed
- time-window filters keep the line by default rather than silently discarding it

Syslog-style timestamps without a year use the current local year internally unless `--since` provides a year context.

## Service Filtering

Use `--service SERVICE_NAME` to keep findings for a specific service, unit, or process name.
Partial matches are allowed. Examples:

```bash
python3 journal_analyzer.py --file examples/sample-journal.log --service nginx
python3 journal_analyzer.py --file examples/sample-journal.log --service sshd
```

`--service nginx` matches practical variants such as `nginx`, `nginx.service`, and lines where the raw journal text includes `nginx`.

## Severity Filtering

Use `--severity warning` or `--severity critical` to limit the displayed findings. Examples:

```bash
python3 journal_analyzer.py --file examples/sample-journal.log --severity critical
python3 journal_analyzer.py --file examples/sample-journal.log --severity warning
```

## Severity Model

Overall status is conservative:

- `OK` - no journal findings detected.
- `WARNING` - warning-level findings exist but no critical findings exist.
- `CRITICAL` - one or more critical findings exist.

Critical status is driven by failed units, dependency failures, OOM events, kernel panic findings, disk full or read-only filesystem symptoms, emergency mode, TLS/certificate failures, and I/O or filesystem errors.

Warning status is driven by restart-related findings, timeout patterns, connection issues, permission denied events, authentication failures, degraded messages, and generic warning/failure entries that still require review.

The report summarizes exported journal findings that require review. It does not claim root cause.
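
The conservative status rule described above reduces to a short precedence check. The sketch below only illustrates that rule; the function name `overall_status` and the use of plain severity strings are assumptions, not the tool's internal representation.

```python
def overall_status(finding_severities):
    """Return OK, WARNING, or CRITICAL from an iterable of finding severities."""
    severities = {s.lower() for s in finding_severities}
    # Any critical finding wins, regardless of how many warnings exist.
    if "critical" in severities:
        return "CRITICAL"
    if "warning" in severities:
        return "WARNING"
    # No findings at all means the export looks clean.
    return "OK"


print(overall_status([]))                       # OK
print(overall_status(["warning", "warning"]))   # WARNING
print(overall_status(["warning", "critical"]))  # CRITICAL
```

Because the check stops at the first critical finding, a single failed unit is enough to mark the whole report `CRITICAL`, however many warnings accompany it.
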

## Usage

```bash
cd infra-run/scripts/python/journal-analyzer
python3 journal_analyzer.py --file examples/sample-journal.log
python3 journal_analyzer.py --file examples/sample-journal.log --format markdown
python3 journal_analyzer.py --file examples/sample-journal.log --format markdown --output journal-report.md
python3 journal_analyzer.py --file examples/sample-journal.log --format json
python3 journal_analyzer.py --file examples/sample-journal.log --service sshd
python3 journal_analyzer.py --file examples/sample-journal.log --service nginx
python3 journal_analyzer.py --file examples/sample-journal.log --severity critical
python3 journal_analyzer.py --file examples/sample-journal.log --top 10
python3 journal_analyzer.py --file examples/sample-journal.log --since "2026-05-11 10:00:00"
python3 journal_analyzer.py --file examples/sample-journal.log --until "2026-05-11 12:00:00"
python3 journal_analyzer.py --file examples/sample-journal.log --ignore-case
```

## Output Formats

- `text` - default terminal-oriented report.
- `markdown` - incident or change ticket attachment format.
- `json` - structured output for local automation.

Use `--output` with a file path to write the report to a separate file. Without `--output`, the report is printed to stdout.

## Exit Codes

- `0` - OK, no journal findings.
- `1` - Journal findings detected.
- `2` - Invalid input, unreadable file, bad argument, output write failure, or runtime error.

## Example Text Output

```text
Journal Analyzer
================
Overall status: CRITICAL
Journal findings require review; logs alone do not prove root cause.

[CRITICAL] nginx.service - failed_unit
Pattern: failed to start
Occurrences: 1
Unit: nginx.service
Process: systemd
PID: 1
First seen: May 11 10:16:11
Last seen: May 11 10:16:11
Samples:
- May 11 10:16:11 web01 systemd[1]: Failed to start nginx.service - A high performance web server and a reverse proxy server.

Operational Summary
-------------------
Overall status: CRITICAL
Total lines scanned: 17
Total findings: 13
Critical finding groups: 7
Warning finding groups: 5
Affected services/units count: 9
```

## Markdown Workflow

Generate a Markdown report from an exported journal and attach it to the incident or change ticket as supporting evidence:

```bash
python3 journal_analyzer.py \
  --file examples/sample-journal.log \
  --format markdown \
  --output journal-report.md
```

Review the report before attaching it. Use it as a concise summary of exported journal findings, then correlate it with service status, monitoring, recent changes, package history, and runbook-specific post-checks.

## Operational Limitations

- Pattern matching is intentionally simple and predictable.
- A single line can match more than one finding when it contains more than one meaningful symptom, such as a TLS failure plus certificate expiry.
- Default matching is already case-tolerant for practical journal review; `--ignore-case` remains available when you want to force case-insensitive operator searches.
- Unit, process, and PID extraction are best-effort and may return `UNKNOWN`.
- Time filtering is best-effort because lines without parseable timestamps are retained.
- Large log files are read into memory; use scoped journal exports for very large review windows.
- The tool does not inspect structured journal fields because v1 works on exported text logs.

## Safety Notes

- The tool only reads the input journal export and optionally writes a separate report.
- It does not require root privileges unless the chosen log path requires them.
- Do not include secrets, private hostnames, customer identifiers, or unsanitized production details in portfolio examples.
- Treat the output as triage evidence that requires operator review, not an automated remediation decision.
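
For local automation around the analyzer, the documented exit codes can be turned into a short label for a ticket note. The sketch below is a hypothetical wrapper, not part of the tool: `describe_exit` and `run_analyzer` are illustrative names, and it assumes it runs from the directory containing `journal_analyzer.py`. It only reports the result and takes no remediation action.

```python
import subprocess
import sys

# Meanings taken from the documented exit codes (0, 1, 2).
EXIT_MEANINGS = {
    0: "OK: no journal findings",
    1: "REVIEW: journal findings detected",
    2: "ERROR: invalid input, write failure, or runtime error",
}


def describe_exit(code: int) -> str:
    """Translate an analyzer exit code into a short label."""
    return EXIT_MEANINGS.get(code, f"UNEXPECTED: exit code {code}")


def run_analyzer(journal_path: str, report_path: str) -> str:
    """Run the CLI with documented flags and describe the result."""
    cmd = [
        sys.executable, "journal_analyzer.py",
        "--file", journal_path,
        "--format", "markdown",
        "--output", report_path,
    ]
    # check=False: exit code 1 means findings exist, not a crash.
    result = subprocess.run(cmd, check=False)
    return describe_exit(result.returncode)
```

Treating exit code `1` as "needs review" rather than as a failure keeps the wrapper in line with the tool's intent: findings are evidence for an operator, not a verdict.
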