# jvm-log-analyzer

`jvm-log-analyzer` is a read-only Python CLI for reviewing local JVM and Java application logs. It summarizes common Java exceptions, stack trace fragments, JVM failure symptoms, database issues, network/TLS problems, HTTP 5xx entries, and repeated application warning/error patterns that require operator review.

The tool is intended for Linux infrastructure, SRE, and application support workflows where a collected log file needs a quick first-pass operational summary. It does not modify logs or system state.

## When To Use

- During incident response when a JVM application log needs a fast exception and symptom summary.
- During application support handoff when stack traces, HTTP 5xx entries, or database failures need to be attached as evidence.
- After a restart, deployment, certificate change, database incident, or capacity event when local log extracts are available.
- When predictable text, Markdown, or JSON output is useful for local review.

## What It Does

- Reads one local JVM or Java application log supplied with `--file`.
- Detects configured critical and warning JVM/application patterns.
- Extracts timestamps, log levels, thread names, logger/class names, exception types, raw samples, and short stack trace fragments where practical.
- Aggregates top finding groups, exception types, and operational symptoms.
- Produces text, Markdown, or JSON output.

## What It Does Not Do

- It does not read remote systems or live journal streams.
- It does not modify logs, services, application files, JVM flags, certificates, or database state.
- It does not query APM, ELK, SIEM, Zabbix, ticketing systems, or application APIs.
- It does not find root cause automatically.
- It does not prove an application defect.
- It does not classify every vendor-specific Java framework or application message.

## Supported Input Types

- Java / JVM application logs.
- Spring Boot style logs.
- Tomcat-style application logs.
- Generic application logs containing Java exceptions and stack traces.

UTF-8 text input is expected. Invalid byte sequences are replaced during read so review can continue. Empty, missing, unreadable, or non-file paths are rejected with exit code `2`.

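The input handling described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function name `read_log_lines` and the constant `EXIT_INVALID_INPUT` are hypothetical, and treating a whitespace-only file as empty is an assumption.

```python
import sys
from pathlib import Path

EXIT_INVALID_INPUT = 2  # exit code documented for rejected input


def read_log_lines(path_str):
    """Return the log's lines, or exit with code 2 if the path is unusable."""
    path = Path(path_str)
    if not path.is_file():
        # Covers missing paths and non-file paths such as directories.
        print(f"error: not a regular file: {path}", file=sys.stderr)
        raise SystemExit(EXIT_INVALID_INPUT)
    try:
        # errors="replace" swaps invalid UTF-8 byte sequences for U+FFFD
        # instead of raising, so the rest of the file can still be reviewed.
        text = path.read_text(encoding="utf-8", errors="replace")
    except OSError as exc:
        print(f"error: cannot read {path}: {exc}", file=sys.stderr)
        raise SystemExit(EXIT_INVALID_INPUT)
    if not text.strip():
        # Assumption: whitespace-only content counts as an empty log.
        print(f"error: empty log file: {path}", file=sys.stderr)
        raise SystemExit(EXIT_INVALID_INPUT)
    return text.splitlines()
```

Replacing rather than rejecting invalid bytes keeps a partially corrupted extract usable; the replacement character makes the damaged spots visible in samples.
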
## Supported JVM/Application Patterns

Critical patterns:

- `OutOfMemoryError`
- `Java heap space`
- `GC overhead limit exceeded`
- `StackOverflowError`
- `NoClassDefFoundError`
- `ClassNotFoundException`
- `ExceptionInInitializerError`
- `SSLHandshakeException`
- `CertificateExpiredException`
- `SQLException`
- `SQLRecoverableException`
- `CommunicationsException`
- `database unavailable`
- `connection pool exhausted`
- `HTTP 500`
- `HTTP 502`
- `HTTP 503`
- `HTTP 504`
- `FATAL`

HTTP 5xx entries are reported as warnings until their total reaches `--http-critical-threshold`; see the Severity Model section.

Warning patterns:

- `NullPointerException`
- `IllegalArgumentException`
- `IllegalStateException`
- `SocketTimeoutException`
- `ConnectException`
- `TimeoutException`
- `connection refused`
- `connection reset`
- `Broken pipe`
- `WARN`
- `ERROR`
- `retrying`
- `slow query`
- `deadlock detected`

By default matching is case-sensitive. Use `--ignore-case` for case-insensitive matching across configured patterns.

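To illustrate how case sensitivity affects matching, here is a minimal sketch. It assumes plain substring matching over a small subset of the documented patterns; the names `compile_patterns` and `match_line` are hypothetical, and the real tool may match differently (for example with word boundaries).

```python
import re

# Illustrative subset of the documented patterns, not the full lists.
CRITICAL_PATTERNS = ["OutOfMemoryError", "SSLHandshakeException", "HTTP 503"]
WARNING_PATTERNS = ["NullPointerException", "connection refused", "ERROR"]


def compile_patterns(patterns, ignore_case=False):
    """Compile each pattern as a literal; optionally ignore case."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape keeps every pattern a literal substring match.
    return [(p, re.compile(re.escape(p), flags)) for p in patterns]


def match_line(line, ignore_case=False):
    """Return (severity, pattern) pairs for every pattern found in the line."""
    hits = []
    groups = (("CRITICAL", CRITICAL_PATTERNS), ("WARNING", WARNING_PATTERNS))
    for severity, patterns in groups:
        for name, regex in compile_patterns(patterns, ignore_case):
            if regex.search(line):
                hits.append((severity, name))
    return hits


line = "2026-05-11 10:15:30 ERROR upstream returned HTTP 503: Connection refused"
print(match_line(line))                    # "Connection refused" is missed
print(match_line(line, ignore_case=True))  # the lowercase pattern now matches
```

In this sketch the same line yields several hits, and the capitalized `Connection refused` is only caught when case is ignored.
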
## Stack Trace Handling

The scanner detects multiline Java stack traces by looking for common opening lines, such as:

- Fully qualified Java exception lines, such as `java.lang.NullPointerException`.
- `Exception in thread "main"`.
- `Caused by:`.
- Application exceptions ending in `Exception` or `Error`.

Subsequent lines are grouped into the same trace when they look like Java stack frames:

- Lines starting with whitespace followed by `at `.
- Lines starting with `Caused by:`.
- Lines containing `... N more`.

Stack traces are associated with the detected exception type where possible. Text and Markdown output include only short sample lines by default. Use `--include-stacktraces` to include capped multiline stack trace fragments.

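The grouping rules above can be approximated with a few regular expressions. The following is a sketch under stated assumptions, not the tool's implementation: the helper names, the `max_frames` cap, and the requirement that the exception appears at the start of the line are all hypothetical choices for illustration.

```python
import re

# Optional prefixes that can precede the exception class name.
PREFIX_RE = re.compile(r'^(?:Exception in thread "[^"]*"\s+|Caused by:\s+)')
# A (possibly package-qualified) name ending in Exception or Error.
TYPE_RE = re.compile(r'^((?:[\w$]+\.)*[\w$]*(?:Exception|Error))\b')
# Continuation lines: "  at ...", "Caused by: ...", or "... N more".
FRAME_RE = re.compile(r'^\s+at\s|^\s*Caused by:|\.\.\. \d+ more')


def exception_type(line):
    """Return the exception class named at the start of the line, if any."""
    rest = PREFIX_RE.sub("", line.strip(), count=1)
    match = TYPE_RE.match(rest)
    return match.group(1) if match else None


def group_stack_traces(lines, max_frames=5):
    """Group header and frame lines into traces, capping stored frames."""
    traces, current = [], None
    for line in lines:
        if current is not None and FRAME_RE.search(line):
            if len(current["frames"]) < max_frames:
                current["frames"].append(line.strip())
            continue
        exc = exception_type(line)
        current = {"exception": exc, "frames": []} if exc else None
        if current is not None:
            traces.append(current)
    # A bare exception line without frames is not treated as a trace here.
    return [t for t in traces if t["frames"]]
```

Because this is a line-oriented heuristic, wrapped or interleaved traces from concurrent threads can be split or merged incorrectly.
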
## Timestamp Handling

The scanner attempts to parse:

- `2026-05-11 10:15:30`
- `2026-05-11T10:15:30`
- `2026-05-11 10:15:30,123`
- `2026-05-11 10:15:30.123`
- `May 11 10:15:30`

Timestamp parsing is best-effort. Lines with unparseable timestamps are still analyzed. When `--since` or `--until` is used, lines without parseable timestamps are retained by default so potentially important findings are not silently discarded.

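A best-effort parser for the formats listed above, together with a time filter that keeps unparseable lines, might look like the sketch below. The function names, the `default_year` used for syslog-style timestamps (which carry no year), and the inclusive window boundaries are assumptions made for illustration.

```python
import re
from datetime import datetime

TS_RE = re.compile(
    r'^(?P<iso>\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2})(?:[.,](?P<frac>\d{1,6}))?'
    r'|^(?P<syslog>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2})'
)


def parse_timestamp(line, default_year=2026):
    """Return a datetime for a leading timestamp, or None if none parses."""
    match = TS_RE.match(line)
    if not match:
        return None
    try:
        if match.group("iso"):
            text = match.group("iso").replace("T", " ")
            parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
            frac = match.group("frac")
            if frac:
                # Pad ",123" or ".123" to microseconds.
                parsed = parsed.replace(microsecond=int(frac.ljust(6, "0")))
            return parsed
        # Syslog style has no year; %b month names depend on the locale.
        text = f"{default_year} {match.group('syslog')}"
        return datetime.strptime(text, "%Y %b %d %H:%M:%S")
    except ValueError:
        return None  # looked like a timestamp but was not a valid date


def in_window(line, since=None, until=None):
    """Keep lines inside the window; lines without a timestamp are retained."""
    ts = parse_timestamp(line)
    if ts is None:
        return True
    if since is not None and ts < since:
        return False
    if until is not None and ts > until:
        return False
    return True
```

Retaining unparseable lines means stack frames and continuation lines survive filtering, at the cost of some out-of-window noise.
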
## Severity Model

Overall status is conservative:

- `OK` - no JVM/application findings.
- `WARNING` - warning-level findings exist but no critical findings exist.
- `CRITICAL` - one or more critical findings exist.

Critical status is driven by JVM memory failures, fatal JVM symptoms, selected class loading errors, TLS/certificate failures, database unavailable or pool exhaustion symptoms, and HTTP 5xx volume at or above the configured threshold.

Warning status is driven by non-fatal exceptions, `WARN`/`ERROR` entries, timeout/retry patterns, connection refused/reset symptoms, slow query findings, and deadlock patterns.

HTTP 5xx findings are warnings until their total reaches `--http-critical-threshold`, which defaults to `5`. The report summarizes findings that require review; it does not claim root cause.

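The status rules above reduce to a small decision function. The sketch below is an assumed simplification: `overall_status` and its parameters are hypothetical names, `critical_count` is taken to exclude HTTP 5xx findings, and the threshold is assumed to be at least 1.

```python
def overall_status(critical_count, warning_count, http_5xx_count,
                   http_critical_threshold=5):
    """Derive OK / WARNING / CRITICAL from finding counts.

    critical_count excludes HTTP 5xx findings, which are handled
    separately through the threshold.
    """
    if critical_count > 0 or http_5xx_count >= http_critical_threshold:
        return "CRITICAL"
    if warning_count > 0 or http_5xx_count > 0:
        # HTTP 5xx entries below the threshold still count as warnings.
        return "WARNING"
    return "OK"


print(overall_status(0, 0, 0))                             # OK
print(overall_status(0, 2, 3))                             # WARNING
print(overall_status(0, 0, 3, http_critical_threshold=2))  # CRITICAL
```

Lowering the threshold makes sense for low-traffic services where even a few 5xx entries are significant.
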
## Usage

```bash
cd infra-run/scripts/python/jvm-log-analyzer

python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --format markdown
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --format markdown --output jvm-report.md
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --format json
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --top 10
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --max-samples 5
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --include-stacktraces
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --since "2026-05-11 10:00:00"
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --until "2026-05-11 12:00:00"
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --http-critical-threshold 2
```

## Output Formats

- `text` - default terminal-oriented report.
- `markdown` - incident or application support ticket attachment format.
- `json` - structured output for local automation.

Use `--output <path>` to write the rendered report to a separate file. Without `--output`, the report is printed to stdout. The tool rejects an output path that resolves to the input log file.

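One way to implement the output-path guard is to compare resolved paths. This is a sketch assuming that "resolves to" means equality after symlink and `..` resolution; `validate_output_path` is a hypothetical name, and hard links to the same file would not be caught by this check.

```python
from pathlib import Path


def validate_output_path(input_path, output_path):
    """Return the resolved output path, refusing to overwrite the input log."""
    src = Path(input_path).resolve()
    dst = Path(output_path).resolve()  # works even if the file does not exist yet
    if src == dst:
        raise ValueError("output path resolves to the input log file")
    return dst
```

A caller would typically turn the `ValueError` into an error message and exit code `2`, matching the documented handling of bad arguments.
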
## Exit Codes

- `0` - OK, no JVM/application findings.
- `1` - JVM/application findings detected.
- `2` - Invalid input, unreadable file, bad argument, output write failure, or runtime error.

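For local automation, a wrapper can branch on these exit codes. The following sketch uses only the documented `--file` option; the helper names and the short labels in `EXIT_MEANINGS` are hypothetical, and the script path is assumed to be relative to the current directory.

```python
import subprocess
import sys

# Short labels for the documented exit codes (labels are illustrative).
EXIT_MEANINGS = {0: "ok", 1: "findings", 2: "error"}


def interpret_exit_code(code):
    """Map an exit code to a label; anything undocumented is 'unknown'."""
    return EXIT_MEANINGS.get(code, "unknown")


def run_analyzer(log_path, script="jvm_log_analyzer.py"):
    """Run the analyzer on one log and return (label, stdout)."""
    result = subprocess.run(
        [sys.executable, script, "--file", log_path],
        capture_output=True,
        text=True,
    )
    return interpret_exit_code(result.returncode), result.stdout
```

Note that exit code `1` signals findings rather than failure, so wrappers that treat any non-zero code as an error (for example `set -e` in shell) need to handle it explicitly.
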
## Example Text Output

```text
JVM Log Analyzer
================

Overall status: CRITICAL
Findings require review; logs alone do not prove root cause.

[CRITICAL] OutOfMemoryError
Occurrences: 1
Symptom: jvm_memory
First seen: UNKNOWN
Last seen: UNKNOWN
Stack traces linked: 1
Samples:
- Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Operational Summary
-------------------
Overall status: CRITICAL
Total lines scanned: 33
Total findings: 27
Total stack traces detected: 4
Critical finding groups: 11
Warning finding groups: 8
HTTP 5xx count: 3
Parsed timestamps count: 21
Unknown timestamps count: 12
```

## Markdown Workflow

Generate a Markdown report from a collected JVM application log and attach it to the incident or application support ticket as supporting evidence:

```bash
python3 jvm_log_analyzer.py \
  --file examples/sample-jvm-app.log \
  --format markdown \
  --include-stacktraces \
  --output jvm-report.md
```

Review the report before attaching it. A `WARNING` or `CRITICAL` result should be reviewed with application health checks, JVM memory telemetry, database status, certificate state, recent deployments, and the relevant application owner.

## Operational Limitations

- Pattern matching is intentionally simple and predictable.
- A single log line can match multiple findings, such as `ERROR`, `HTTP 503`, and a Java exception.
- Case-sensitive default matching can miss lowercase variants unless `--ignore-case` is used.
- Stack trace grouping is practical, not a complete Java parser.
- Timestamp parsing is best-effort; unparseable lines are retained during time filtering.
- HTTP 5xx counts are raw log counts, not request rates or customer impact.
- Large log files are read into memory; collect scoped extracts for very large incidents.

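For very large logs, a scoped extract can be produced by streaming the file before running the analyzer. The helper below is a hypothetical, standalone sketch and not part of the tool: it assumes entries start with a `YYYY-MM-DD` date and keeps undated continuation lines (such as stack frames) with the entry that precedes them.

```python
import re

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}")


def write_scoped_extract(src_path, dst_path, date_prefix):
    """Copy entries whose leading date starts with date_prefix; return lines kept."""
    kept = 0
    keep = False
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:  # streams line by line, so memory use stays small
            if DATE_RE.match(line):
                keep = line.startswith(date_prefix)
            # Undated lines inherit the decision made for the last dated line.
            if keep:
                dst.write(line)
                kept += 1
    return kept
```

Write the extract to a new file rather than over the original log, then pass the extract to `--file`.
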
## Safety Notes

- The tool only reads the input log and optionally writes a separate report.
- The implementation uses the Python standard library only and does not require package installation.
- It does not require elevated privileges unless the chosen log path requires them.
- Do not include secrets, customer data, private hostnames, tokens, or unsanitized production details in portfolio examples.
- Treat operational findings as prompts that require review; the tool does not determine root cause automatically.