portfolio/infra-run/scripts/python/jvm-log-analyzer
Mateusz Suski 8a7b7c5abc
Clean up Python log analysis documentation
2026-05-11 17:10:10 +00:00

jvm-log-analyzer

jvm-log-analyzer is a read-only Python CLI for reviewing local JVM and Java application logs. It summarizes common Java exceptions, stack trace fragments, JVM failure symptoms, database issues, network/TLS problems, HTTP 5xx entries, and repeated application warning/error patterns that require operator review.

The tool is intended for Linux infrastructure, SRE, and application support workflows where a collected log file needs a quick first-pass operational summary. It does not modify logs or system state.

When To Use

  • During incident response when a JVM application log needs a fast exception and symptom summary.
  • During application support handoff when stack traces, HTTP 5xx entries, or database failures need to be attached as evidence.
  • After a restart, deployment, certificate change, database incident, or capacity event when local log extracts are available.
  • When predictable text, Markdown, or JSON output is useful for local review.

What It Does

  • Reads one local JVM or Java application log supplied with --file.
  • Detects configured critical and warning JVM/application patterns.
  • Extracts timestamps, log levels, thread names, logger/class names, exception types, raw samples, and short stack trace fragments where practical.
  • Aggregates top finding groups, exception types, and operational symptoms.
  • Produces text, Markdown, or JSON output.
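As an illustration of the field extraction listed above, the following is a minimal sketch rather than the tool's actual parser. The regular expression and the parse_line helper are hypothetical and assume a Spring Boot or Logback style layout; lines that do not fit are simply left unparsed, in keeping with "where practical".

```python
import re

# Hypothetical layout: "<timestamp> <LEVEL> [<pid> ---] [<thread>] <logger> : <message>".
# The separator before the message may be ":" (Spring Boot) or "-" (common Logback layouts).
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?)\s+"
    r"(?P<level>TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\s+"
    r"(?:\d+\s+---\s+)?"
    r"\[(?P<thread>[^\]]*)\]\s+"
    r"(?P<logger>[\w.$]+)\s*[:-]\s+"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the extracted fields, or None when the line does not fit the layout."""
    found = LINE_RE.match(line)
    if found is None:
        return None
    fields = found.groupdict()
    # Spring Boot pads the thread name inside the brackets; drop the padding.
    fields["thread"] = fields["thread"].strip()
    return fields
```

Stack frames and continuation lines return None here, which is why they need separate handling.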

What It Does Not Do

  • It does not read remote systems or live journal streams.
  • It does not modify logs, services, application files, JVM flags, certificates, or database state.
  • It does not query APM, ELK, SIEM, Zabbix, ticketing systems, or application APIs.
  • It does not find root cause automatically.
  • It does not prove an application defect.
  • It does not classify every vendor-specific Java framework or application message.

Supported Input Types

  • Java / JVM application logs.
  • Spring Boot style logs.
  • Tomcat-style application logs.
  • Generic application logs containing Java exceptions and stack traces.

UTF-8 text input is expected. Invalid byte sequences are replaced during read so review can continue. Empty, missing, unreadable, or non-file paths are rejected with exit code 2.
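The input rules above can be sketched with the standard library as follows. This is an illustrative approximation, not the tool's source: read_log_lines and main are hypothetical names, and only the exit code 2 and the replace-on-decode behavior come from this README.

```python
import sys
from pathlib import Path

EXIT_INVALID = 2  # exit code this README documents for invalid input


def read_log_lines(path_text):
    """Return the lines of a log file, replacing undecodable bytes."""
    path = Path(path_text)
    if not path.is_file():
        # Covers missing paths, directories, and other non-file paths.
        raise ValueError(f"not a regular file: {path}")
    if path.stat().st_size == 0:
        raise ValueError(f"file is empty: {path}")
    # errors="replace" substitutes U+FFFD for invalid UTF-8 sequences,
    # so one bad byte does not abort the review.
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        return handle.read().splitlines()


def main(argv):
    """Hypothetical entry point: map input problems to exit code 2."""
    try:
        lines = read_log_lines(argv[1])
    except (IndexError, ValueError, OSError) as exc:
        # OSError covers permission problems on unreadable files.
        print(f"error: {exc}", file=sys.stderr)
        return EXIT_INVALID
    print(f"read {len(lines)} lines")
    return 0
```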

Supported JVM/Application Patterns

Critical patterns:

  • OutOfMemoryError
  • Java heap space
  • GC overhead limit exceeded
  • StackOverflowError
  • NoClassDefFoundError
  • ClassNotFoundException
  • ExceptionInInitializerError
  • SSLHandshakeException
  • CertificateExpiredException
  • SQLException
  • SQLRecoverableException
  • CommunicationsException
  • database unavailable
  • connection pool exhausted
  • HTTP 500
  • HTTP 502
  • HTTP 503
  • HTTP 504
  • FATAL

Warning patterns:

  • NullPointerException
  • IllegalArgumentException
  • IllegalStateException
  • SocketTimeoutException
  • ConnectException
  • TimeoutException
  • connection refused
  • connection reset
  • Broken pipe
  • WARN
  • ERROR
  • retrying
  • slow query
  • deadlock detected

By default matching is case-sensitive. Use --ignore-case for case-insensitive matching across configured patterns.
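A minimal sketch of how case-sensitive and case-insensitive matching might differ, assuming each configured pattern is treated as a literal substring. The pattern subsets and the helper names compile_patterns and match_line are illustrative, not taken from the tool.

```python
import re

# A few pattern names from the lists above; the subsets are illustrative only.
CRITICAL = ["OutOfMemoryError", "Java heap space", "connection pool exhausted"]
WARNING = ["NullPointerException", "connection refused", "Broken pipe"]


def compile_patterns(names, ignore_case=False):
    """Compile literal patterns; ignore_case mirrors the --ignore-case flag."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape keeps each name a literal match (an assumption of this sketch).
    return {name: re.compile(re.escape(name), flags) for name in names}


def match_line(line, compiled):
    """Return every pattern name found in the line, in configuration order."""
    return [name for name, rx in compiled.items() if rx.search(line)]
```

With these definitions, a line containing "Connection refused" is missed by the default matcher and caught only when ignore_case=True, which is the behavior described in Operational Limitations.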

Stack Trace Handling

The scanner detects practical multiline Java stack traces using common starts such as:

  • Fully qualified Java exception lines, such as java.lang.NullPointerException.
  • Exception in thread "main".
  • Caused by:.
  • Application exceptions ending in Exception or Error.

Following stack frames are grouped when they look like Java frames:

  • Lines starting with whitespace followed by "at" and a space.
  • Lines starting with "Caused by:".
  • Lines containing "... N more", where N is the number of omitted frames.

Stack traces are associated with the detected exception type where possible. Text and Markdown output include only short sample lines by default. Use --include-stacktraces to include capped multiline stack trace fragments.
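The grouping heuristic described in this section can be approximated as below. This is a sketch under stated assumptions: the regular expressions, the group_stack_traces helper, and the max_lines cap of 20 are hypothetical, and the tool's real rules may be stricter or looser.

```python
import re

# Exception type: optional package prefix, then a class name ending in
# Exception or Error (assumed shape, not the tool's actual expression).
EXC_TYPE_RE = re.compile(
    r"\b((?:[A-Za-z_$][\w$]*\.)*[A-Z][\w$]*(?:Exception|Error))\b"
)
# Frame lines: whitespace + "at ", or an elision marker such as "... 12 more".
FRAME_RE = re.compile(r"^\s+at \S|\.\.\. \d+ more")


def group_stack_traces(lines, max_lines=20):
    """Group exception lines with the frame lines that follow them."""
    traces = []
    current = None
    for line in lines:
        is_frame = bool(FRAME_RE.search(line))
        is_cause = line.startswith("Caused by:")
        if current is not None and (is_frame or is_cause):
            # Continuation of the open trace; keep only a capped fragment.
            if len(current["lines"]) < max_lines:
                current["lines"].append(line)
            continue
        current = None
        # Orphan frame lines never start a trace; exception lines do.
        found = None if is_frame else EXC_TYPE_RE.search(line)
        if found:
            current = {"exception": found.group(1), "lines": [line]}
            traces.append(current)
    return traces
```

Each trace is keyed by the first exception type seen, so a "Caused by:" line inside an open trace extends it rather than starting a new group.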

Timestamp Handling

The scanner attempts to parse:

  • 2026-05-11 10:15:30
  • 2026-05-11T10:15:30
  • 2026-05-11 10:15:30,123
  • 2026-05-11 10:15:30.123
  • May 11 10:15:30

Timestamp parsing is best-effort. Lines with unparseable timestamps are still analyzed. When --since or --until is used, lines without parseable timestamps are retained by default so potentially important findings are not silently discarded.
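The best-effort parsing and the retain-on-failure rule could be sketched like this. The helper names, the default_year parameter (needed because the "May 11 10:15:30" form carries no year), and the inclusive window bounds are assumptions of this example, not documented behavior.

```python
import re
from datetime import datetime

# Covers the four numeric forms listed above, with optional ",123" or ".123".
ISO_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2})[ T](\d{2}:\d{2}:\d{2})(?:[.,](\d{1,6}))?"
)
# Syslog-style form without a year, e.g. "May 11 10:15:30".
SYSLOG_RE = re.compile(r"^([A-Z][a-z]{2}) +(\d{1,2}) (\d{2}:\d{2}:\d{2})")


def parse_timestamp(line, default_year):
    """Return a datetime for the line, or None when nothing parseable is found."""
    try:
        found = ISO_RE.search(line)
        if found:
            date_part, time_part, fraction = found.groups()
            stamp = datetime.strptime(
                f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S"
            )
            if fraction:
                # Pad the fraction to microseconds: "123" -> 123000.
                stamp = stamp.replace(microsecond=int(fraction.ljust(6, "0")))
            return stamp
        found = SYSLOG_RE.match(line)
        if found:
            month, day, time_part = found.groups()
            return datetime.strptime(
                f"{default_year} {month} {day} {time_part}", "%Y %b %d %H:%M:%S"
            )
    except ValueError:
        pass  # looks like a timestamp but is not a valid date or time
    return None


def in_window(line, since=None, until=None, default_year=2026):
    """Keep lines inside the window, and keep lines with no parseable timestamp."""
    stamp = parse_timestamp(line, default_year)
    if stamp is None:
        return True  # retained so findings are not silently discarded
    if since is not None and stamp < since:
        return False
    if until is not None and stamp > until:
        return False
    return True
```

Retaining unparseable lines means stack frames, which rarely carry timestamps, survive a --since or --until filter even when their parent exception line is filtered out.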

Severity Model

Overall status is conservative:

  • OK - no JVM/application findings.
  • WARNING - warning-level findings exist but no critical findings exist.
  • CRITICAL - one or more critical findings exist.

Critical status is driven by JVM memory failures, fatal JVM symptoms, selected class loading errors, TLS/certificate failures, database unavailable or pool exhaustion symptoms, and HTTP 5xx volume at or above the configured threshold.

Warning status is driven by non-fatal exceptions, WARN/ERROR entries, timeout/retry patterns, connection refused/reset symptoms, slow query findings, and deadlock patterns.

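The HTTP 500, 502, 503, and 504 patterns appear in the critical list, but individual HTTP 5xx findings are reported as warnings until their combined total reaches --http-critical-threshold, which defaults to 5. The report summarizes findings that require review; it does not claim root cause.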
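One way to express the severity model is shown below. The overall_status function and its arguments are hypothetical; only the three status names, the treatment of HTTP 5xx totals, and the default threshold of 5 are taken from this README.

```python
from collections import Counter

HTTP_5XX = {"HTTP 500", "HTTP 502", "HTTP 503", "HTTP 504"}


def overall_status(findings, critical_names, http_critical_threshold=5):
    """Derive OK, WARNING, or CRITICAL from per-pattern occurrence counts.

    findings maps a pattern name to its count; critical_names is the set of
    patterns configured as critical (both are assumptions of this sketch).
    """
    http_total = sum(
        count for name, count in findings.items() if name in HTTP_5XX
    )
    # HTTP 5xx entries escalate only once their combined total hits the threshold.
    if http_total >= http_critical_threshold:
        return "CRITICAL"
    for name, count in findings.items():
        if count > 0 and name in critical_names and name not in HTTP_5XX:
            return "CRITICAL"
    if any(count > 0 for count in findings.values()):
        return "WARNING"
    return "OK"
```

For example, three HTTP 503 entries alone yield WARNING with the default threshold, and CRITICAL when the threshold is lowered to 2.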

Usage

cd infra-run/scripts/python/jvm-log-analyzer

python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --format markdown
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --format markdown --output jvm-report.md
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --format json
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --top 10
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --max-samples 5
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --include-stacktraces
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --since "2026-05-11 10:00:00"
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --until "2026-05-11 12:00:00"
python3 jvm_log_analyzer.py --file examples/sample-jvm-app.log --http-critical-threshold 2

Output Formats

  • text - default terminal-oriented report.
  • markdown - incident or application support ticket attachment format.
  • json - structured output for local automation.

Use --output <path> to write the rendered report to a separate file. Without --output, the report is printed to stdout. The tool rejects an output path that resolves to the input log file.
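The guard against overwriting the input log might look like the following sketch. check_output_path is a hypothetical helper; it compares fully resolved paths, which is one plausible reading of "resolves to the input log file".

```python
from pathlib import Path


def check_output_path(input_path, output_path):
    """Return the resolved output path, refusing to overwrite the input log."""
    source = Path(input_path).resolve()
    # resolve() normalizes "." and ".." segments and follows symlinks,
    # so differently spelled paths to the same file compare equal.
    target = Path(output_path).resolve()
    if source == target:
        raise ValueError("output path resolves to the input log file")
    return target
```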

Exit Codes

  • 0 - OK, no JVM/application findings.
  • 1 - JVM/application findings detected.
  • 2 - Invalid input, unreadable file, bad argument, output write failure, or runtime error.
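For local automation, the exit codes above can be branched on from a small wrapper. The sketch below is illustrative: run_analyzer and interpret are hypothetical helpers, and the wrapper assumes it runs from the directory that contains jvm_log_analyzer.py.

```python
import subprocess
import sys

# Meanings taken from the exit code list above.
EXIT_MEANINGS = {0: "ok", 1: "findings", 2: "error"}


def interpret(code):
    """Translate an analyzer exit code into a short label."""
    return EXIT_MEANINGS.get(code, "unexpected")


def run_analyzer(log_path, script="jvm_log_analyzer.py"):
    """Run the analyzer with JSON output and return (label, stdout)."""
    completed = subprocess.run(
        [sys.executable, script, "--file", log_path, "--format", "json"],
        capture_output=True,
        text=True,
        check=False,  # exit code 1 means findings, not a crash
    )
    return interpret(completed.returncode), completed.stdout
```

Passing check=False matters here because a nonzero exit code is an expected result whenever findings exist.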

Example Text Output

JVM Log Analyzer
================

Overall status: CRITICAL
Findings require review; logs alone do not prove root cause.

[CRITICAL] OutOfMemoryError
Occurrences: 1
Symptom: jvm_memory
First seen: UNKNOWN
Last seen: UNKNOWN
Stack traces linked: 1
Samples:
  - Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Operational Summary
-------------------
Overall status: CRITICAL
Total lines scanned: 33
Total findings: 27
Total stack traces detected: 4
Critical finding groups: 11
Warning finding groups: 8
HTTP 5xx count: 3
Parsed timestamps count: 21
Unknown timestamps count: 12

Markdown Workflow

Generate a Markdown report from a collected JVM application log and attach it to the incident or application support ticket as supporting evidence:

python3 jvm_log_analyzer.py \
  --file examples/sample-jvm-app.log \
  --format markdown \
  --include-stacktraces \
  --output jvm-report.md

Review the report before attaching it. Correlate a WARNING or CRITICAL result with application health checks, JVM memory telemetry, database status, certificate state, and recent deployments, and involve the relevant application owner.

Operational Limitations

  • Pattern matching is intentionally simple and predictable.
  • A single log line can match multiple findings, such as ERROR, HTTP 503, and a Java exception.
  • Case-sensitive default matching can miss lowercase variants unless --ignore-case is used.
  • Stack trace grouping is practical, not a complete Java parser.
  • Timestamp parsing is best-effort; unparseable lines are retained during time filtering.
  • HTTP 5xx counts are raw log counts, not request rates or customer impact.
  • The whole log file is read into memory; for very large logs, collect a scoped extract first.

Safety Notes

  • The tool only reads the input log and optionally writes a separate report.
  • The implementation uses the Python standard library only and does not require package installation.
  • It does not require elevated privileges unless the chosen log path requires them.
  • Do not include secrets, customer data, private hostnames, tokens, or unsanitized production details in portfolio examples.
  • Treat operational findings as prompts that require review; the tool does not determine root cause automatically.