Add incident log summary tool
@@ -0,0 +1,158 @@
# incident-log-summary

`incident-log-summary` is a read-only Python CLI for quick incident log review. It scans a local Linux system log or application log and groups configured operational patterns by severity, count, timestamps, and sample lines.

The tool is meant for first-pass triage and incident notes. It does not replace full log search, alert correlation, service-specific runbooks, or review by an operator who understands the affected platform.

## When To Use

- During incident response when a collected log file needs a fast pattern summary.
- Before attaching evidence to an incident, problem, or change ticket.
- When comparing whether a log contains obvious storage, memory, service, TLS, HTTP, or connectivity failures.
- When JSON output is useful for later local automation.

## What It Does Not Do

- It does not read remote systems.
- It does not modify logs or system state.
- It does not query ELK, Zabbix, SIEM, journald, or application APIs.
- It does not prove root cause.
- It does not classify every possible vendor or application error.
- It does not treat sanitized examples as production validation.

## Supported Input

- One local text log file provided with `--file`.
- UTF-8 input is expected. Invalid byte sequences are replaced during read so review can continue.
- Empty, missing, unreadable, or non-file paths are rejected with exit code `2`.

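The replacement of invalid bytes can be illustrated with a minimal sketch. It assumes decoding with Python's `errors="replace"` handler, which substitutes U+FFFD for each undecodable byte; the sample bytes are hypothetical and not taken from a real log.

```python
# Hypothetical log line containing one byte (0xff) that is not valid UTF-8.
raw = b"May 11 10:16:07 ops-node-01 app: disk \xff failed\n"

# errors="replace" swaps each undecodable byte for U+FFFD instead of raising.
decoded = raw.decode("utf-8", errors="replace")

print(decoded.strip())
# The rest of the line is intact, so pattern matching still works on it.
print("failed" in decoded)
```

The trade-off is that the damaged byte is lost from the report, so the original file remains the authoritative evidence.
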
## Supported Patterns

Critical patterns:

- `CRITICAL`
- `FATAL`
- `panic`
- `kernel panic`
- `no space left on device`
- `out of memory`
- `killed process`
- `read-only file system`
- `segmentation fault`
- `segfault`
- `certificate expired`
- `TLS handshake failed`
- `SSLHandshakeException`
- `database unavailable`
- `HTTP 500`
- `HTTP 502`
- `HTTP 503`
- `HTTP 504`

Warning patterns:

- `ERROR`
- `failed`
- `failure`
- `timeout`
- `connection refused`
- `connection reset`
- `permission denied`
- `authentication failed`
- `denied`
- `unavailable`
- `service restart`
- `retrying`

By default, matching is case-sensitive. Use `--ignore-case` for case-insensitive matching across all configured patterns.

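The effect of the case-sensitive default can be sketched as follows. This is an illustration only: the `matches` helper and the sample line are hypothetical, and the sketch assumes each pattern is treated as a literal substring rather than a regular expression.

```python
import re


def matches(pattern: str, line: str, ignore_case: bool = False) -> bool:
    """Return True if the literal pattern occurs anywhere in the line."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape makes the pattern a literal substring, not a regular expression.
    return re.search(re.escape(pattern), line, flags) is not None


# Hypothetical kernel line that capitalizes "Out" and "Killed".
line = "May 11 10:21:45 ops-node-01 kernel: Out of memory: Killed process 2517 (java)"

print(matches("out of memory", line))                    # False
print(matches("out of memory", line, ignore_case=True))  # True
```

When a log source is known to vary its capitalization, running with `--ignore-case` is the safer first pass.
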
## Timestamp Handling

The scanner attempts to parse:

- `2026-05-11 10:15:30`
- `2026-05-11T10:15:30`
- `May 11 10:15:30`

Timestamp parsing is best-effort. Lines with unparseable timestamps are still analyzed, and date filtering keeps those lines by default so potentially important findings are not silently discarded.

Syslog-style timestamps do not include a year. For filtering, the tool uses the year from `--since` when present, otherwise the current local year.

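The year inference described above can be sketched like this. The `parse_syslog` helper is hypothetical and exists only for illustration; it also assumes an English or C locale, because `%b` parses month abbreviations according to the active locale.

```python
from datetime import datetime
from typing import Optional


def parse_syslog(stamp: str, since: Optional[datetime] = None) -> datetime:
    """Parse a year-less syslog timestamp such as 'May 11 10:15:30'."""
    # Rule from the text: take the year from --since, else the current local year.
    year = since.year if since is not None else datetime.now().year
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


since = datetime(2026, 5, 11, 10, 0, 0)
parsed = parse_syslog("May 11 10:15:30", since)

print(parsed)           # 2026-05-11 10:15:30
print(parsed >= since)  # True: the line falls inside the --since window
```

One consequence of this rule is that a December line read in January without `--since` is assigned the new year, so passing `--since` is advisable for logs that cross a year boundary.
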
## Usage

```bash
cd infra-run/scripts/python/incident-log-summary

python3 incident_log_summary.py --file examples/system-messages.log
python3 incident_log_summary.py --file examples/app-error.log --format markdown --output incident-report.md
python3 incident_log_summary.py --file examples/app-error.log --format json
python3 incident_log_summary.py --file examples/app-error.log --top 20
python3 incident_log_summary.py --file examples/app-error.log --ignore-case
python3 incident_log_summary.py --file examples/app-error.log --since "2026-05-11 10:00:00"
python3 incident_log_summary.py --file examples/app-error.log --until "2026-05-11 12:00:00"
```

## Output Formats

- `text` - default terminal-oriented report.
- `markdown` - incident or change ticket attachment format.
- `json` - structured output for local automation.

Use `--output <path>` to write the rendered report to a file. Without `--output`, the report is printed to stdout.

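As a rough sketch of local automation on the `json` format, the snippet below loads a report and lists the critical patterns. The README does not document the schema; the key names (`findings`, `summary`, `severity`, `pattern`, and so on) are taken from the script in this commit, and the embedded report is a hand-abbreviated example, not real tool output.

```python
import json

# Hand-abbreviated report in the shape the script emits (two findings only).
report_text = """
{
  "findings": [
    {"pattern": "HTTP 500", "severity": "CRITICAL", "occurrences": 1,
     "first_seen": "2026-05-11 10:01:03", "last_seen": "2026-05-11 10:01:03",
     "samples": []},
    {"pattern": "ERROR", "severity": "WARNING", "occurrences": 4,
     "first_seen": "2026-05-11 10:01:03", "last_seen": "2026-05-11 10:13:58",
     "samples": []}
  ],
  "summary": {"overall_status": "CRITICAL", "total_findings": 5}
}
"""

report = json.loads(report_text)

# Keep only the patterns that were classified as critical.
critical = [f["pattern"] for f in report["findings"] if f["severity"] == "CRITICAL"]

print(report["summary"]["overall_status"], critical)
```

In practice the text would come from a file written with `--format json --output <path>` rather than from an inline string.
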
## Exit Codes

- `0` - OK, no findings.
- `1` - Operational findings detected.
- `2` - Invalid input, unreadable file, bad argument, or runtime error.

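A wrapper script might interpret these exit codes roughly as follows. The `classify` and `run_summary` helpers and the label strings are hypothetical; the sketch also assumes it runs from the tool directory so that `incident_log_summary.py` resolves.

```python
import subprocess
import sys

# Exit-code meanings as listed above; the labels are illustrative.
EXIT_MEANINGS = {0: "ok", 1: "findings", 2: "invalid"}


def classify(returncode: int) -> str:
    """Map the tool's exit code to a short label; anything else is unexpected."""
    return EXIT_MEANINGS.get(returncode, "unexpected")


def run_summary(log_path: str) -> str:
    """Run the CLI on one file and return the label for its exit code."""
    result = subprocess.run(
        [sys.executable, "incident_log_summary.py", "--file", log_path],
        capture_output=True,
        text=True,
        check=False,  # exit code 1 means findings, not a crash
    )
    return classify(result.returncode)


print(classify(1))  # findings
```

Because `1` signals findings rather than failure, callers that use `check=True` or `set -e` would treat a normal triage result as an error.
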
## Example Text Output

The excerpt below is abbreviated to a single finding group; the summary counts cover the full report.

```text
Incident Log Summary
====================

[CRITICAL] no space left on device
Occurrences: 1
First seen: 2026-05-11 10:16:07
Last seen: 2026-05-11 10:16:07
Samples:
- May 11 10:16:07 ops-node-01 kernel: EXT4-fs warning: no space left on device while writing /var/log/messages

Operational Summary
-------------------
Total lines scanned: 7
Total findings: 7
Critical finding groups: 3
Warning finding groups: 4
Overall status: CRITICAL
```

## Markdown Workflow

Generate a markdown report from the collected log and attach it to the incident or change ticket as supporting evidence:

```bash
python3 incident_log_summary.py \
  --file examples/app-error.log \
  --format markdown \
  --output incident-report.md
```

Review the report before attaching it. The output is evidence for triage; it is not a final root cause statement.

## Operational Limitations

- Pattern matching is intentionally simple and predictable.
- A single line can match multiple patterns, such as `ERROR`, `HTTP 503`, and `unavailable`.
- Case-sensitive default matching can miss lowercase variants unless `--ignore-case` is used.
- Syslog timestamps without a year are normalized with an inferred year.
- Date filters are best-effort because lines without parseable timestamps are retained.
- `--top` truncates the finding groups before the summary is computed, so the summary's finding and group counts cover only the groups shown.
- Large log files are read into memory; collect a scoped file or time-windowed extract for very large incidents.

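The multi-match behavior can be sketched with one line from the sample application log. Plain substring checks stand in here for the tool's matching, which is a simplification for illustration.

```python
patterns = ["HTTP 503", "ERROR", "unavailable"]
line = (
    "2026-05-11 10:13:58 app01 api[4150]: ERROR request_id=8b44 "
    "HTTP 503 path=/checkout upstream unavailable"
)

# Each pattern that occurs in the line contributes one occurrence to its own group.
hits = [p for p in patterns if p in line]

print(hits)       # ['HTTP 503', 'ERROR', 'unavailable']
print(len(hits))  # 3 findings from a single line
```

This is why the sample Markdown report in this commit lists 15 findings for an 8-line log: the total counts pattern hits, not distinct lines.
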
## Safety Notes

- The tool only reads the input log and optionally writes a separate report.
- It does not require elevated privileges unless the chosen log path requires them.
- Do not include secrets, customer data, private hostnames, or unsanitized production details in portfolio examples.
- Treat findings as prompts for operator review, not automated remediation instructions.
@@ -0,0 +1,8 @@
2026-05-11 09:48:12 app01 api[4150]: INFO request_id=7f3a status=200 path=/health
2026-05-11 10:01:03 app01 api[4150]: ERROR request_id=8b21 HTTP 500 path=/checkout duration_ms=942
2026-05-11 10:03:19 app01 api[4150]: WARNING request_id=8b22 database unavailable for payments cluster
2026-05-11 10:05:44 app01 api[4150]: ERROR request_id=8b25 timeout waiting for inventory service
2026-05-11 10:07:02 app01 api[4150]: ERROR request_id=8b29 connection refused connecting to redis-cache:6379
2026-05-11T10:11:33 app01 api[4150]: CRITICAL request_id=8b31 TLS handshake failed: certificate expired
2026-05-11 10:13:58 app01 api[4150]: ERROR request_id=8b44 HTTP 503 path=/checkout upstream unavailable
2026-05-11 12:10:01 app01 api[4150]: INFO request_id=9001 status=200 path=/health
@@ -0,0 +1,144 @@
# Incident Log Summary

## CRITICAL: certificate expired

- Occurrences: 1
- First seen: 2026-05-11 10:11:33
- Last seen: 2026-05-11 10:11:33

Sample log lines:

```text
2026-05-11T10:11:33 app01 api[4150]: CRITICAL request_id=8b31 TLS handshake failed: certificate expired
```

## CRITICAL: CRITICAL

- Occurrences: 1
- First seen: 2026-05-11 10:11:33
- Last seen: 2026-05-11 10:11:33

Sample log lines:

```text
2026-05-11T10:11:33 app01 api[4150]: CRITICAL request_id=8b31 TLS handshake failed: certificate expired
```

## CRITICAL: database unavailable

- Occurrences: 1
- First seen: 2026-05-11 10:03:19
- Last seen: 2026-05-11 10:03:19

Sample log lines:

```text
2026-05-11 10:03:19 app01 api[4150]: WARNING request_id=8b22 database unavailable for payments cluster
```

## CRITICAL: HTTP 500

- Occurrences: 1
- First seen: 2026-05-11 10:01:03
- Last seen: 2026-05-11 10:01:03

Sample log lines:

```text
2026-05-11 10:01:03 app01 api[4150]: ERROR request_id=8b21 HTTP 500 path=/checkout duration_ms=942
```

## CRITICAL: HTTP 503

- Occurrences: 1
- First seen: 2026-05-11 10:13:58
- Last seen: 2026-05-11 10:13:58

Sample log lines:

```text
2026-05-11 10:13:58 app01 api[4150]: ERROR request_id=8b44 HTTP 503 path=/checkout upstream unavailable
```

## CRITICAL: TLS handshake failed

- Occurrences: 1
- First seen: 2026-05-11 10:11:33
- Last seen: 2026-05-11 10:11:33

Sample log lines:

```text
2026-05-11T10:11:33 app01 api[4150]: CRITICAL request_id=8b31 TLS handshake failed: certificate expired
```

## WARNING: ERROR

- Occurrences: 4
- First seen: 2026-05-11 10:01:03
- Last seen: 2026-05-11 10:13:58

Sample log lines:

```text
2026-05-11 10:01:03 app01 api[4150]: ERROR request_id=8b21 HTTP 500 path=/checkout duration_ms=942
2026-05-11 10:05:44 app01 api[4150]: ERROR request_id=8b25 timeout waiting for inventory service
2026-05-11 10:07:02 app01 api[4150]: ERROR request_id=8b29 connection refused connecting to redis-cache:6379
```

## WARNING: unavailable

- Occurrences: 2
- First seen: 2026-05-11 10:03:19
- Last seen: 2026-05-11 10:13:58

Sample log lines:

```text
2026-05-11 10:03:19 app01 api[4150]: WARNING request_id=8b22 database unavailable for payments cluster
2026-05-11 10:13:58 app01 api[4150]: ERROR request_id=8b44 HTTP 503 path=/checkout upstream unavailable
```

## WARNING: connection refused

- Occurrences: 1
- First seen: 2026-05-11 10:07:02
- Last seen: 2026-05-11 10:07:02

Sample log lines:

```text
2026-05-11 10:07:02 app01 api[4150]: ERROR request_id=8b29 connection refused connecting to redis-cache:6379
```

## WARNING: failed

- Occurrences: 1
- First seen: 2026-05-11 10:11:33
- Last seen: 2026-05-11 10:11:33

Sample log lines:

```text
2026-05-11T10:11:33 app01 api[4150]: CRITICAL request_id=8b31 TLS handshake failed: certificate expired
```

## WARNING: timeout

- Occurrences: 1
- First seen: 2026-05-11 10:05:44
- Last seen: 2026-05-11 10:05:44

Sample log lines:

```text
2026-05-11 10:05:44 app01 api[4150]: ERROR request_id=8b25 timeout waiting for inventory service
```

## Operational Summary

- Total lines scanned: 8
- Total findings: 15
- Critical finding groups: 6
- Warning finding groups: 5
- Overall status: CRITICAL
@@ -0,0 +1,7 @@
May 11 09:57:01 ops-node-01 systemd[1]: Started Session 443 of user svc_backup.
May 11 10:02:14 ops-node-01 systemd[1]: failed to start nightly-report.service: Unit entered failed state.
May 11 10:04:22 ops-node-01 sudo[18442]: svc_backup : command not allowed ; permission denied
May 11 10:16:07 ops-node-01 kernel: EXT4-fs warning: no space left on device while writing /var/log/messages
May 11 10:21:45 ops-node-01 kernel: out of memory: killed process 2517 (java) total-vm:2048000kB
May 11 10:22:03 ops-node-01 systemd[1]: service restart scheduled for app-worker.service
May 11 10:30:31 ops-node-01 sshd[19210]: Accepted publickey for admin from 192.0.2.15 port 52210 ssh2
@@ -0,0 +1,448 @@
#!/usr/bin/env python3
"""Summarize incident-oriented patterns in local log files."""

from __future__ import annotations

import argparse
import json
import re
import sys
from datetime import datetime
from pathlib import Path
from typing import Any


EXIT_OK = 0
EXIT_FINDINGS = 1
EXIT_INVALID = 2

UNKNOWN = "UNKNOWN"
SEVERITY_ORDER = {"CRITICAL": 0, "WARNING": 1}

CRITICAL_PATTERNS = [
    "CRITICAL",
    "FATAL",
    "panic",
    "kernel panic",
    "no space left on device",
    "out of memory",
    "killed process",
    "read-only file system",
    "segmentation fault",
    "segfault",
    "certificate expired",
    "TLS handshake failed",
    "SSLHandshakeException",
    "database unavailable",
    "HTTP 500",
    "HTTP 502",
    "HTTP 503",
    "HTTP 504",
]

WARNING_PATTERNS = [
    "ERROR",
    "failed",
    "failure",
    "timeout",
    "connection refused",
    "connection reset",
    "permission denied",
    "authentication failed",
    "denied",
    "unavailable",
    "service restart",
    "retrying",
]

ISO_TIMESTAMP_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})[ T](\d{2}:\d{2}:\d{2})\b")
SYSLOG_TIMESTAMP_RE = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\b")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Summarize suspicious and critical patterns in a local log file."
    )
    parser.add_argument("--file", required=True, help="Local log file to analyze.")
    parser.add_argument(
        "--format",
        choices=("text", "markdown", "json"),
        default="text",
        help="Report format. Default: text.",
    )
    parser.add_argument("--output", help="Write report to this path instead of stdout.")
    parser.add_argument(
        "--top",
        type=positive_int,
        help="Limit finding groups after severity and count sorting.",
    )
    parser.add_argument(
        "--ignore-case",
        action="store_true",
        help="Match all configured patterns case-insensitively.",
    )
    parser.add_argument(
        "--since",
        type=parse_filter_timestamp,
        help='Include lines at or after "YYYY-MM-DD HH:MM:SS".',
    )
    parser.add_argument(
        "--until",
        type=parse_filter_timestamp,
        help='Include lines at or before "YYYY-MM-DD HH:MM:SS".',
    )
    parser.add_argument(
        "--max-samples",
        type=non_negative_int,
        default=3,
        help="Maximum sample lines per finding group. Default: 3.",
    )
    return parser


def positive_int(value: str) -> int:
    try:
        number = int(value)
    except ValueError as exc:
        raise argparse.ArgumentTypeError("must be a positive integer") from exc
    if number <= 0:
        raise argparse.ArgumentTypeError("must be a positive integer")
    return number


def non_negative_int(value: str) -> int:
    try:
        number = int(value)
    except ValueError as exc:
        raise argparse.ArgumentTypeError("must be zero or a positive integer") from exc
    if number < 0:
        raise argparse.ArgumentTypeError("must be zero or a positive integer")
    return number


def parse_filter_timestamp(value: str) -> datetime:
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise argparse.ArgumentTypeError(
        'expected timestamp format "YYYY-MM-DD HH:MM:SS"'
    )


def compile_patterns(ignore_case: bool) -> list[dict[str, Any]]:
    flags = re.IGNORECASE if ignore_case else 0
    pattern_defs: list[dict[str, str]] = []
    pattern_defs.extend(
        {"pattern": pattern, "severity": "CRITICAL"} for pattern in CRITICAL_PATTERNS
    )
    pattern_defs.extend(
        {"pattern": pattern, "severity": "WARNING"} for pattern in WARNING_PATTERNS
    )

    compiled = []
    for item in pattern_defs:
        compiled.append(
            {
                "pattern": item["pattern"],
                "severity": item["severity"],
                "regex": re.compile(re.escape(item["pattern"]), flags),
            }
        )
    return compiled


def parse_line_timestamp(line: str, syslog_year: int) -> tuple[datetime | None, str | None]:
    iso_match = ISO_TIMESTAMP_RE.search(line)
    if iso_match:
        raw = f"{iso_match.group(1)} {iso_match.group(2)}"
        try:
            return datetime.strptime(raw, "%Y-%m-%d %H:%M:%S"), raw
        except ValueError:
            return None, None

    syslog_match = SYSLOG_TIMESTAMP_RE.search(line)
    if syslog_match:
        raw = syslog_match.group(1)
        normalized = f"{syslog_year} {raw}"
        try:
            parsed = datetime.strptime(normalized, "%Y %b %d %H:%M:%S")
        except ValueError:
            return None, None
        return parsed, parsed.strftime("%Y-%m-%d %H:%M:%S")

    return None, None


def line_in_time_window(
    parsed_at: datetime | None, since: datetime | None, until: datetime | None
) -> bool:
    if parsed_at is None:
        return True
    if since is not None and parsed_at < since:
        return False
    if until is not None and parsed_at > until:
        return False
    return True


def read_log_file(path: Path) -> list[str]:
    if not path.exists():
        raise OSError(f"file does not exist: {path}")
    if not path.is_file():
        raise OSError(f"path is not a regular file: {path}")
    try:
        text = path.read_text(encoding="utf-8", errors="replace")
    except PermissionError as exc:
        raise OSError(f"file is not readable: {path}") from exc
    except OSError as exc:
        raise OSError(f"unable to read file {path}: {exc}") from exc
    if text == "":
        raise ValueError(f"file is empty: {path}")
    return text.splitlines()


def analyze_log(
    lines: list[str],
    patterns: list[dict[str, Any]],
    since: datetime | None,
    until: datetime | None,
    max_samples: int,
) -> dict[str, Any]:
    syslog_year = since.year if since is not None else datetime.now().year
    groups: dict[str, dict[str, Any]] = {}

    for line in lines:
        parsed_at, rendered_at = parse_line_timestamp(line, syslog_year)
        if not line_in_time_window(parsed_at, since, until):
            continue

        for item in patterns:
            if not item["regex"].search(line):
                continue

            key = f"{item['severity']}::{item['pattern']}"
            group = groups.setdefault(
                key,
                {
                    "pattern": item["pattern"],
                    "severity": item["severity"],
                    "occurrences": 0,
                    "first_seen": None,
                    "last_seen": None,
                    "samples": [],
                },
            )
            group["occurrences"] += 1

            if parsed_at is not None:
                if group["first_seen"] is None or parsed_at < group["first_seen"][0]:
                    group["first_seen"] = (parsed_at, rendered_at)
                if group["last_seen"] is None or parsed_at > group["last_seen"][0]:
                    group["last_seen"] = (parsed_at, rendered_at)

            if len(group["samples"]) < max_samples:
                group["samples"].append(line)

    findings = sorted(
        groups.values(),
        key=lambda item: (
            SEVERITY_ORDER[item["severity"]],
            -item["occurrences"],
            item["pattern"].lower(),
        ),
    )

    rendered_findings = []
    for group in findings:
        rendered_findings.append(
            {
                "pattern": group["pattern"],
                "severity": group["severity"],
                "occurrences": group["occurrences"],
                "first_seen": render_seen(group["first_seen"]),
                "last_seen": render_seen(group["last_seen"]),
                "samples": group["samples"],
            }
        )

    return {
        "total_lines_scanned": len(lines),
        "findings": rendered_findings,
    }


def render_seen(value: tuple[datetime, str | None] | None) -> str:
    if value is None:
        return UNKNOWN
    return value[1] or value[0].strftime("%Y-%m-%d %H:%M:%S")


def apply_top_limit(report: dict[str, Any], top: int | None) -> dict[str, Any]:
    if top is None:
        return report
    limited = dict(report)
    limited["findings"] = report["findings"][:top]
    return limited


def add_summary(report: dict[str, Any]) -> dict[str, Any]:
    findings = report["findings"]
    critical_groups = sum(1 for item in findings if item["severity"] == "CRITICAL")
    warning_groups = sum(1 for item in findings if item["severity"] == "WARNING")
    total_findings = sum(item["occurrences"] for item in findings)

    if critical_groups > 0:
        status = "CRITICAL"
    elif warning_groups > 0:
        status = "WARNING"
    else:
        status = "OK"

    enriched = dict(report)
    enriched["summary"] = {
        "total_lines_scanned": report["total_lines_scanned"],
        "total_findings": total_findings,
        "critical_finding_groups": critical_groups,
        "warning_finding_groups": warning_groups,
        "overall_status": status,
    }
    return enriched


def render_text(report: dict[str, Any]) -> str:
    lines = ["Incident Log Summary", "====================", ""]
    if not report["findings"]:
        lines.append("No configured incident patterns were detected.")
    else:
        for finding in report["findings"]:
            lines.extend(
                [
                    f"[{finding['severity']}] {finding['pattern']}",
                    f"Occurrences: {finding['occurrences']}",
                    f"First seen: {finding['first_seen']}",
                    f"Last seen: {finding['last_seen']}",
                    "Samples:",
                ]
            )
            if finding["samples"]:
                lines.extend(f" - {sample}" for sample in finding["samples"])
            else:
                lines.append(" - No samples retained")
            lines.append("")

    lines.extend(render_text_summary(report["summary"]))
    return "\n".join(lines) + "\n"


def render_text_summary(summary: dict[str, Any]) -> list[str]:
    return [
        "Operational Summary",
        "-------------------",
        f"Total lines scanned: {summary['total_lines_scanned']}",
        f"Total findings: {summary['total_findings']}",
        f"Critical finding groups: {summary['critical_finding_groups']}",
        f"Warning finding groups: {summary['warning_finding_groups']}",
        f"Overall status: {summary['overall_status']}",
    ]


def render_markdown(report: dict[str, Any]) -> str:
    lines = ["# Incident Log Summary", ""]
    if not report["findings"]:
        lines.extend(["No configured incident patterns were detected.", ""])
    else:
        for finding in report["findings"]:
            lines.extend(
                [
                    f"## {finding['severity']}: {finding['pattern']}",
                    "",
                    f"- Occurrences: {finding['occurrences']}",
                    f"- First seen: {finding['first_seen']}",
                    f"- Last seen: {finding['last_seen']}",
                    "",
                    "Sample log lines:",
                    "",
                ]
            )
            if finding["samples"]:
                lines.append("```text")
                lines.extend(finding["samples"])
                lines.append("```")
            else:
                lines.append("_No samples retained._")
            lines.append("")

    summary = report["summary"]
    lines.extend(
        [
            "## Operational Summary",
            "",
            f"- Total lines scanned: {summary['total_lines_scanned']}",
            f"- Total findings: {summary['total_findings']}",
            f"- Critical finding groups: {summary['critical_finding_groups']}",
            f"- Warning finding groups: {summary['warning_finding_groups']}",
            f"- Overall status: {summary['overall_status']}",
            "",
        ]
    )
    return "\n".join(lines)


def render_json(report: dict[str, Any]) -> str:
    return json.dumps(report, indent=2, sort_keys=True) + "\n"


def write_report(output_path: str | None, content: str) -> None:
    if output_path is None:
        sys.stdout.write(content)
        return

    path = Path(output_path)
    try:
        path.write_text(content, encoding="utf-8")
    except OSError as exc:
        raise OSError(f"unable to write output {path}: {exc}") from exc


def main() -> int:
    parser = build_parser()
    args = parser.parse_args()

    if args.since is not None and args.until is not None and args.since > args.until:
        parser.error("--since must be earlier than or equal to --until")

    try:
        lines = read_log_file(Path(args.file))
        report = analyze_log(
            lines=lines,
            patterns=compile_patterns(args.ignore_case),
            since=args.since,
            until=args.until,
            max_samples=args.max_samples,
        )
        report = add_summary(apply_top_limit(report, args.top))

        if args.format == "text":
            content = render_text(report)
        elif args.format == "markdown":
            content = render_markdown(report)
        else:
            content = render_json(report)

        write_report(args.output, content)
    except (OSError, ValueError) as exc:
        print(f"CRITICAL: {exc}", file=sys.stderr)
        return EXIT_INVALID
    except RuntimeError as exc:
        print(f"CRITICAL: runtime error: {exc}", file=sys.stderr)
        return EXIT_INVALID

    if report["summary"]["overall_status"] == "OK":
        return EXIT_OK
    return EXIT_FINDINGS


if __name__ == "__main__":
    sys.exit(main())