Add log diff checker tool
# log-diff-checker

`log-diff-checker` is a read-only Python CLI for comparing configured operational log patterns before and after a change. It is intended to help an infrastructure engineer decide whether a patch, deployment, configuration change, or service restart introduced new log risk or reduced existing noise.

The tool compares local pre-change and post-change log extracts. It does not modify input logs or system state.

## When To Use

- After a planned change when pre-check and post-check log extracts are available.
- During change validation when the question is whether errors increased, disappeared, or stayed flat.
- Before attaching log evidence to a change, incident, or problem ticket.
- When predictable text, Markdown, or JSON output is useful for local review.

## What It Does

- Reads two local text log files supplied with `--before` and `--after`.
- Scans both files for configured critical and warning patterns.
- Compares before and after counts for each detected pattern.
- Classifies patterns as `NEW`, `INCREASED`, `DECREASED`, `RESOLVED`, or `UNCHANGED`.
- Sets an overall status of `OK`, `WARNING`, or `CRITICAL`.
- Includes sample log lines from the side that best explains the change.

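The comparison and classification steps above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the `Finding` class and the `sample_source` helper are hypothetical names, the boundary rules (for example, treating a zero-to-nonzero count as `NEW` rather than `INCREASED`) are assumptions inferred from the labels, and the sample-side rule is only a guess at what "best explains the change" means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Before/after counts for one pattern (hypothetical structure)."""

    pattern: str
    before: int
    after: int

    @property
    def delta(self) -> int:
        return self.after - self.before

    @property
    def change(self) -> str:
        """Classify the count change using the labels listed above."""
        if self.before == 0 and self.after > 0:
            return "NEW"
        if self.before > 0 and self.after == 0:
            return "RESOLVED"
        if self.after > self.before:
            return "INCREASED"
        if self.after < self.before:
            return "DECREASED"
        return "UNCHANGED"


def sample_source(finding: Finding) -> str:
    """Assumed rule: show 'before' lines only when the pattern shrank or vanished."""
    return "before" if finding.change in ("RESOLVED", "DECREASED") else "after"


finding = Finding("CRITICAL", before=0, after=1)
print(finding.change, f"{finding.delta:+d}", sample_source(finding))
```

For a pattern seen zero times before and once after, this prints `NEW +1 after`.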
## What It Does Not Do

- It does not read remote systems.
- It does not modify logs, services, or host state.
- It does not query ELK, Zabbix, SIEM, journald, or application APIs.
- It does not prove root cause or change safety.
- It does not replace service-specific post-change checks.
- It does not classify every possible vendor or application error.

## Supported Input

- Two local text log files:
  - `--before` for the pre-change log extract.
  - `--after` for the post-change log extract.
- UTF-8 input is expected. Invalid byte sequences are replaced during read so review can continue.
- Empty, missing, unreadable, or non-file paths are rejected with exit code `2`.

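The input rules above could be implemented along these lines. This is a sketch under stated assumptions, not the tool's code: `read_log_lines` and `main` are hypothetical names, and "empty" is interpreted here as an empty path argument. Whether zero-byte files are also rejected is not specified, so the sketch does not check for them.

```python
import sys
from pathlib import Path

EXIT_INVALID_INPUT = 2  # exit code documented for invalid input


def read_log_lines(path_text: str) -> list[str]:
    """Return the lines of a local log file, or raise ValueError if unusable."""
    if not path_text:
        raise ValueError("empty path")
    path = Path(path_text)
    if not path.is_file():
        # Covers both missing paths and non-file paths such as directories.
        raise ValueError(f"missing or not a regular file: {path}")
    try:
        # errors="replace" swaps undecodable bytes for U+FFFD instead of failing.
        text = path.read_text(encoding="utf-8", errors="replace")
    except OSError as exc:
        raise ValueError(f"cannot read {path}: {exc}") from exc
    return text.splitlines()


def main(argv: list[str]) -> int:
    """Read one log path from argv and return a process exit code."""
    try:
        lines = read_log_lines(argv[0] if argv else "")
    except ValueError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return EXIT_INVALID_INPUT
    print(f"{len(lines)} lines read")
    return 0
```

Decoding with `errors="replace"` keeps a damaged line reviewable: only the invalid bytes are substituted, so the rest of the line can still match patterns.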
## Supported Patterns

Critical patterns:

- `CRITICAL`
- `FATAL`
- `panic`
- `kernel panic`
- `no space left on device`
- `out of memory`
- `killed process`
- `read-only file system`
- `segmentation fault`
- `segfault`
- `certificate expired`
- `TLS handshake failed`
- `SSLHandshakeException`
- `database unavailable`
- `HTTP 500`
- `HTTP 502`
- `HTTP 503`
- `HTTP 504`

Warning patterns:

- `ERROR`
- `failed`
- `failure`
- `timeout`
- `connection refused`
- `connection reset`
- `permission denied`
- `authentication failed`
- `denied`
- `unavailable`
- `service restart`
- `retrying`

By default matching is case-sensitive. Use `--ignore-case` for case-insensitive matching across all configured patterns.

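To illustrate how case sensitivity changes the counts, here is a minimal sketch. It assumes patterns are matched as plain substrings and that each pattern is counted once per matching line. Neither detail is confirmed by this README, and `count_matching_lines` is a hypothetical helper. Only a subset of the patterns is used, and the second sample line is invented for the example.

```python
CRITICAL_PATTERNS = ["CRITICAL", "database unavailable", "HTTP 503"]  # subset
WARNING_PATTERNS = ["ERROR", "timeout", "unavailable"]  # subset


def count_matching_lines(lines, patterns, ignore_case=False):
    """Count, per pattern, how many lines contain it as a plain substring."""
    counts = {pattern: 0 for pattern in patterns}
    for line in lines:
        haystack = line.casefold() if ignore_case else line
        for pattern in patterns:
            needle = pattern.casefold() if ignore_case else pattern
            if needle in haystack:
                counts[pattern] += 1
    return counts


lines = [
    "app01 inventory-api[2294]: CRITICAL database unavailable while opening checkout connection",
    "app01 inventory-api[2294]: error: upstream Timeout after 30s",  # invented sample
]
patterns = CRITICAL_PATTERNS + WARNING_PATTERNS
print(count_matching_lines(lines, patterns))
print(count_matching_lines(lines, patterns, ignore_case=True))
```

With the default case-sensitive matching, the first line counts toward `CRITICAL`, `database unavailable`, and `unavailable`, while the second line matches nothing. With `ignore_case=True`, the second line also counts toward `ERROR` and `timeout`.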
## Usage

```bash
cd infra-run/scripts/python/log-diff-checker

python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --format markdown
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --format markdown --output change-log-diff.md
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --format json
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --ignore-case
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --top 20
python3 log_diff_checker.py --before examples/pre-change.log --after examples/post-change.log --max-samples 5
```

## Output Formats

- `text` - default terminal-oriented report.
- `markdown` - report formatted for attaching to a change or incident ticket.
- `json` - structured output for local automation.

Use `--output <path>` to write the rendered report to a separate file. Without `--output`, the report is printed to stdout. The tool rejects an output path that resolves to either input log file.

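The output-path guard described above can be sketched by comparing resolved paths. This is one plausible approach rather than the tool's confirmed implementation, and `output_collides_with_input` is a hypothetical name.

```python
from pathlib import Path


def output_collides_with_input(output: str, inputs: list[str]) -> bool:
    """Return True if the output path resolves to any of the input paths."""
    # resolve() makes the path absolute and follows symlinks, so "./pre.log"
    # and "pre.log" compare equal even though the strings differ.
    target = Path(output).resolve()
    return any(target == Path(item).resolve() for item in inputs)


print(output_collides_with_input("./pre.log", ["pre.log", "post.log"]))
print(output_collides_with_input("report.md", ["pre.log", "post.log"]))
```

Comparing resolved paths rather than raw strings is what keeps the read-only guarantee intact when the same file is reachable under a different spelling or through a symlink.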
## Exit Codes

- `0` - OK, no new or increased findings.
- `1` - New or increased findings detected.
- `2` - Invalid input, unreadable file, bad argument, output write failure, or runtime error.

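The relationship between findings, overall status, and exit code might look like the sketch below. The exit codes come from the list above. The rule for deriving the overall status (a `NEW` or `INCREASED` critical pattern gives `CRITICAL`, a `NEW` or `INCREASED` warning pattern gives `WARNING`) is an assumption, and the function names and the `(severity, change)` tuple shape are hypothetical.

```python
EXIT_OK, EXIT_FINDINGS, EXIT_ERROR = 0, 1, 2
REGRESSIONS = {"NEW", "INCREASED"}


def overall_status(findings):
    """findings: iterable of (severity, change) pairs, e.g. ("critical", "NEW")."""
    regressed = {severity for severity, change in findings if change in REGRESSIONS}
    if "critical" in regressed:
        return "CRITICAL"
    if "warning" in regressed:
        return "WARNING"
    return "OK"


def exit_code(findings):
    """Map findings to the documented exit codes 0 and 1."""
    return EXIT_OK if overall_status(findings) == "OK" else EXIT_FINDINGS


findings = [("critical", "NEW"), ("warning", "RESOLVED"), ("warning", "UNCHANGED")]
print(overall_status(findings), exit_code(findings))
```

Under these assumptions, exit code `2` signals that the tool itself failed and is distinct from a `CRITICAL` status, which would exit with `1`. A wrapper script should treat the two cases differently.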
## Example Text Output

```text
Log Diff Checker
================

[CRITICAL] CRITICAL - NEW
Before count: 0
After count: 1
Delta: +1
Sample source: after
Samples:
- 2026-05-11 10:14:31 app01 inventory-api[2294]: CRITICAL database unavailable while opening checkout connection

Operational Summary
-------------------
Total lines scanned before: 7
Total lines scanned after: 8
Total unique patterns compared: 9
New findings count: 3
Increased findings count: 3
Decreased findings count: 0
Resolved findings count: 2
Unchanged findings count: 1
Overall status: CRITICAL
```

## Markdown Workflow

Generate a Markdown report from collected pre-change and post-change logs, review it, and attach it to the change ticket as supporting evidence:

```bash
python3 log_diff_checker.py \
  --before examples/pre-change.log \
  --after examples/post-change.log \
  --format markdown \
  --output change-log-diff.md
```

Treat the report as the log-based view of the change. A `CRITICAL` or `WARNING` result should be reviewed alongside service health checks, monitoring, and rollback criteria, together with the relevant application owner.

## Operational Limitations

- Pattern matching is intentionally simple and predictable.
- A single line can match multiple patterns, such as `CRITICAL`, `database unavailable`, and `unavailable`.
- Case-sensitive default matching can miss lowercase variants unless `--ignore-case` is used.
- The tool compares counts, not rates, time windows, or request volume.
- Input files are read fully into memory; for very large logs, collect scoped extracts first.
- `--top` limits displayed findings only. The operational summary still reflects all compared patterns.

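One way to produce a scoped extract from a very large log is to stream it and keep only lines inside a time window. The helper below is a hypothetical companion script, not part of `log-diff-checker`. It assumes every relevant line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, as in the sample line in the example output, and it drops lines without one (including multi-line continuations such as stack traces).

```python
from datetime import datetime
from typing import Iterable, Iterator

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
TIMESTAMP_WIDTH = 19  # len("2026-05-11 10:14:31")


def lines_in_window(lines: Iterable[str], start: datetime, end: datetime) -> Iterator[str]:
    """Yield lines whose leading timestamp falls in [start, end)."""
    for line in lines:
        try:
            stamp = datetime.strptime(line[:TIMESTAMP_WIDTH], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # skip lines without a parseable leading timestamp
        if start <= stamp < end:
            yield line


def write_extract(source: str, target: str, start: datetime, end: datetime) -> int:
    """Stream source into target one line at a time; return lines written."""
    written = 0
    with open(source, encoding="utf-8", errors="replace") as src, \
            open(target, "w", encoding="utf-8") as dst:
        for line in lines_in_window(src, start, end):
            dst.write(line)
            written += 1
    return written
```

Because the file is iterated line by line, memory use stays flat regardless of log size. Using windows of equal length on both sides of the change also makes the before and after counts easier to compare, since the tool does not normalize by time or volume.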
## Safety Notes

- The tool only reads the input logs and optionally writes a separate report.
- It does not require elevated privileges unless the chosen log path requires them.
- Do not include secrets, customer data, private hostnames, or unsanitized production details in portfolio examples.
- Treat findings as prompts for operator review, not automated remediation instructions.