Mateusz Suski 8a7b7c5abc
Clean up Python log analysis documentation
2026-05-11 17:10:10 +00:00

log-diff-checker

log-diff-checker is a read-only Python CLI for comparing configured operational log patterns before and after a change. It is intended to help an infrastructure engineer decide whether a patch, deployment, configuration change, or service restart introduced new log risk or reduced existing noise.

The tool compares local pre-change and post-change log extracts. It does not modify input logs or system state.

When To Use

  • After a planned change when pre-check and post-check log extracts are available.
  • During change validation when the question is whether errors increased, disappeared, or stayed flat.
  • Before attaching log evidence to a change, incident, or problem ticket.
  • When predictable text, Markdown, or JSON output is useful for local review.

What It Does

  • Reads two local text log files supplied with --before and --after.
  • Scans both files for configured critical and warning patterns.
  • Compares before and after counts for each detected pattern.
  • Classifies patterns as NEW, INCREASED, DECREASED, RESOLVED, or UNCHANGED.
  • Sets an overall status of OK, WARNING, or CRITICAL.
  • Includes sample log lines from whichever file, before or after, best explains the change, and labels that file as the sample source.
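
The comparison and classification steps above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function names classify and overall_status are hypothetical, and the rules are assumptions inferred from the category names and the exit code table (a pattern absent from both files is not reported, and only NEW or INCREASED findings raise the overall status).

```python
from typing import Iterable, Optional, Tuple


def classify(before: int, after: int) -> Optional[str]:
    """Return the change category for one pattern, or None if it never appeared."""
    if before == 0 and after == 0:
        return None  # assumption: patterns absent from both files are not reported
    if before == 0:
        return "NEW"
    if after == 0:
        return "RESOLVED"
    if after > before:
        return "INCREASED"
    if after < before:
        return "DECREASED"
    return "UNCHANGED"


def overall_status(findings: Iterable[Tuple[str, str]]) -> str:
    """Derive OK, WARNING, or CRITICAL from (severity, category) pairs.

    Assumption: only NEW or INCREASED findings raise the status, and a
    critical-severity finding outranks a warning-severity one.
    """
    status = "OK"
    for severity, category in findings:
        if category not in ("NEW", "INCREASED"):
            continue
        if severity == "critical":
            return "CRITICAL"
        status = "WARNING"
    return status
```

Under these assumptions, a pattern counted 0 times before and 1 time after is NEW, which agrees with the example text output later in this document.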

What It Does Not Do

  • It does not read remote systems.
  • It does not modify logs, services, or host state.
  • It does not query ELK, Zabbix, SIEM, journald, or application APIs.
  • It does not prove root cause or change safety.
  • It does not replace service-specific post-change checks.
  • It does not classify every possible vendor or application error.

Supported Input

  • Two local text log files:
    • --before for the pre-change log extract.
    • --after for the post-change log extract.
  • UTF-8 input is expected. Invalid byte sequences are replaced during read so review can continue.
  • Empty, missing, unreadable, or non-file paths are rejected with exit code 2.
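
The input rules above can be sketched with the standard library. The helper name read_log_lines is hypothetical, and treating "empty" as a zero-length file is an assumption. The errors="replace" handler is standard Python behavior: each undecodable byte becomes the U+FFFD replacement character instead of raising an exception.

```python
from pathlib import Path
from typing import List


def read_log_lines(path_str: str) -> List[str]:
    """Read a local log file as UTF-8, replacing invalid byte sequences.

    Raises ValueError for missing, non-file, or empty inputs; a caller
    would map that (and any OSError from an unreadable file) to exit code 2.
    """
    path = Path(path_str)
    if not path.is_file():
        raise ValueError(f"not a regular file: {path_str!r}")
    # errors="replace" substitutes U+FFFD for bytes that are not valid UTF-8.
    text = path.read_text(encoding="utf-8", errors="replace")
    if not text:
        # Assumption: "empty" in the list above means a zero-length file.
        raise ValueError(f"empty log file: {path_str!r}")
    return text.splitlines()
```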

Supported Patterns

Critical patterns:

  • CRITICAL
  • FATAL
  • panic
  • kernel panic
  • no space left on device
  • out of memory
  • killed process
  • read-only file system
  • segmentation fault
  • segfault
  • certificate expired
  • TLS handshake failed
  • SSLHandshakeException
  • database unavailable
  • HTTP 500
  • HTTP 502
  • HTTP 503
  • HTTP 504

Warning patterns:

  • ERROR
  • failed
  • failure
  • timeout
  • connection refused
  • connection reset
  • permission denied
  • authentication failed
  • denied
  • unavailable
  • service restart
  • retrying

By default, matching is case-sensitive, so the pattern ERROR does not match a line that contains only "error". Use --ignore-case to match all configured patterns case-insensitively.
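
The effect of the case setting can be illustrated with a small sketch. It assumes plain substring matching and counts each line at most once per pattern; the source does not state whether the tool uses substrings or regular expressions, and the name count_patterns is hypothetical.

```python
from typing import Dict, Iterable, Sequence


def count_patterns(
    lines: Iterable[str],
    patterns: Sequence[str],
    ignore_case: bool = False,
) -> Dict[str, int]:
    """Count, per pattern, how many lines contain it as a substring."""
    counts = {pattern: 0 for pattern in patterns}
    for line in lines:
        haystack = line.lower() if ignore_case else line
        for pattern in patterns:
            needle = pattern.lower() if ignore_case else pattern
            if needle in haystack:
                counts[pattern] += 1
    return counts


sample = [
    "app01 api: CRITICAL database unavailable",
    "app01 api: critical retrying",
]
patterns = ["CRITICAL", "database unavailable", "unavailable", "retrying"]

print(count_patterns(sample, patterns))
print(count_patterns(sample, patterns, ignore_case=True))
```

In this sketch the lowercase "critical" line is counted under CRITICAL only when ignore_case is set, and the first sample line is counted under three patterns at once.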

Usage

cd infra-run/scripts/python/log-diff-checker

python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --format markdown
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --format markdown --output change-log-diff.md
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --format json
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --ignore-case
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --top 20
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --max-samples 5

Output Formats

  • text - default terminal-oriented report.
  • markdown - change or incident ticket attachment format.
  • json - structured output for local automation.

Use --output <path> to write the rendered report to a separate file. Without --output, the report is printed to stdout. The tool rejects an output path that resolves to either input log file.
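
One way to implement the output path guard is to compare fully resolved paths, so that relative paths and symlinks pointing at an input file are also caught. This is a sketch under that assumption; the function name output_collides_with_input is hypothetical and the tool's actual check may differ.

```python
from pathlib import Path


def output_collides_with_input(output: str, before: str, after: str) -> bool:
    """Return True if the output path resolves to either input log file."""
    # Path.resolve() makes the path absolute and follows symlinks; it does
    # not require the output file to exist yet.
    out = Path(output).resolve()
    return out in (Path(before).resolve(), Path(after).resolve())
```

A caller would reject the run with exit code 2 when this returns True, before any file is opened for writing.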

Exit Codes

  • 0 - OK, no new or increased findings.
  • 1 - New or increased findings detected.
  • 2 - Invalid input, unreadable file, bad argument, output write failure, or runtime error.
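
The exit code table can be expressed as a small mapping. The names below are hypothetical, and compare stands in for whatever function produces the list of finding categories; the sketch only mirrors the three documented codes.

```python
import sys
from typing import Callable, Iterable

EXIT_OK = 0        # no new or increased findings
EXIT_FINDINGS = 1  # at least one NEW or INCREASED finding
EXIT_ERROR = 2     # invalid input, write failure, or runtime error


def exit_code_for(categories: Iterable[str]) -> int:
    """Map finding categories to the documented exit code 0 or 1."""
    if any(category in ("NEW", "INCREASED") for category in categories):
        return EXIT_FINDINGS
    return EXIT_OK


def run(compare: Callable[[], Iterable[str]]) -> int:
    """Run a comparison callable and translate any failure into exit code 2."""
    try:
        return exit_code_for(compare())
    except Exception as exc:  # broad on purpose: every runtime error maps to 2
        print(f"error: {exc}", file=sys.stderr)
        return EXIT_ERROR
```

Because DECREASED, RESOLVED, and UNCHANGED findings all map to 0, a wrapper script can treat any non-zero exit as "needs review" and distinguish 1 from 2 only when it needs to tell findings apart from tool failures.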

Example Text Output

Log Diff Checker
================

[CRITICAL] CRITICAL - NEW
Before count: 0
After count: 1
Delta: +1
Sample source: after
Samples:
  - 2026-05-11 10:14:31 app01 inventory-api[2294]: CRITICAL database unavailable while opening checkout connection

Operational Summary
-------------------
Total lines scanned before: 7
Total lines scanned after: 8
Total unique patterns compared: 9
New findings count: 3
Increased findings count: 3
Decreased findings count: 0
Resolved findings count: 2
Unchanged findings count: 1
Overall status: CRITICAL

Markdown Workflow

Generate a Markdown report from collected pre-change and post-change logs, review it, and attach it to the change ticket as supporting evidence:

python3 log_diff_checker.py \
  --before examples/pre-change.log \
  --after examples/post-change.log \
  --format markdown \
  --output change-log-diff.md

Treat the report as the log-based view of the change, not a complete assessment. Review a CRITICAL or WARNING result alongside service health checks, monitoring, rollback criteria, and the relevant application owner.

Operational Limitations

  • Pattern matching is intentionally simple and predictable.
  • A single line can match multiple patterns, such as CRITICAL, database unavailable, and unavailable, so one logged event can raise several counts.
  • Case-sensitive default matching can miss lowercase variants unless --ignore-case is used.
  • The tool compares counts, not rates, time windows, or request volume.
  • Large log files are read into memory; collect scoped extracts for very large incidents.
  • --top limits displayed findings only. The operational summary still reflects all compared patterns.
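
The --top behavior in the last item can be sketched as follows: the summary is computed from every finding, and only the displayed list is truncated. The function name select_for_display is hypothetical, and the sketch assumes the findings arrive already ordered for display, since the source does not describe the ordering.

```python
from collections import Counter
from typing import List, Optional, Tuple

Finding = Tuple[str, str]  # (pattern, category)


def select_for_display(
    findings: List[Finding], top: Optional[int] = None
) -> Tuple[List[Finding], Counter]:
    """Return the findings to print and a summary over all findings."""
    # The summary counts every compared pattern, regardless of top.
    summary = Counter(category for _, category in findings)
    shown = findings if top is None else findings[:top]
    return shown, summary
```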

Safety Notes

  • The tool only reads the input logs and optionally writes a separate report.
  • The implementation uses the Python standard library only and does not require package installation.
  • It does not require elevated privileges unless the chosen log path requires them.
  • Do not include secrets, customer data, private hostnames, or unsanitized production details in portfolio examples.
  • Treat operational findings as prompts that require review; the tool does not determine root cause automatically.