# Log Observability ELK/Grafana

## Problem

Operations teams need searchable logs and reviewable incident evidence in addition to simple OS checks. Zabbix is useful for host and service health signals; ELK/Grafana is better suited for log ingestion, error analysis, dashboards, and environment-level observability.

## CV Relevance

This project supports the monitoring and troubleshooting part of the CV by showing how incident logs can be collected, parsed, searched, and reviewed. It is separate from the Zabbix project: Zabbix handles simple checks, while this project focuses on logs and observability evidence.

## What This Project Demonstrates

- A local Docker Compose scaffold for Elasticsearch, Logstash, Kibana, Grafana, and Filebeat.
- Minimal configs required for the stack to validate independently.
- Sample logs and alert intent that can be reviewed without starting the full stack.
- An incident simulation script for generating operational log evidence.

This is a local demo stack. The default credentials are for non-production use only.

## Architecture

```
Application/System Logs -> Filebeat -> Logstash -> Elasticsearch -> Kibana
                                                         |
                                                         v
                                                      Grafana

Incident Scenario -> Sample Logs -> Alert Rules -> Operator Review
```

Core components:

- `docker-compose.yml` defines the observability services.
- `alerting/alert_rules.yml` records alert intent and severity.
- `examples/` contains representative operational logs and alert output.
- `scenarios/incident_simulation.sh` emits incident activity.
- `grafana/`, `kibana/`, `logstash/`, `filebeat/`, and `elasticsearch/` contain minimal local configs.

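The incident simulation idea can be sketched in a few lines of Bash. This is a minimal illustration, not the contents of `scenarios/incident_simulation.sh`: the function names `emit_log` and `simulate_db_pool_incident` are hypothetical, and the messages are taken from the Example Output section of this README.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of an incident simulation; not the repository's script.

# Print one log line shaped like "[YYYY-MM-DD HH:MM:SS] LEVEL message".
emit_log() {
  local level="$1"
  shift
  printf '[%s] %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$*"
}

# Emit a database connection pool incident, from first warning to recovery.
# The optional argument is the pause in seconds between events (default 0).
simulate_db_pool_incident() {
  local pause="${1:-0}"
  emit_log WARN "Database connection pool nearing capacity"
  sleep "$pause"
  emit_log ERROR "Database connection pool exhausted"
  sleep "$pause"
  emit_log ERROR "Database query timeout occurred"
  sleep "$pause"
  emit_log INFO "Database connections restored"
}
```

Redirecting the output to a file that Filebeat watches, for example `simulate_db_pool_incident 5 >> logs/app.log` (a hypothetical path), would feed the events into the pipeline.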
## Quickstart

```bash
cd professional-infra/log-observability-elk-grafana
make test
make demo
```

Start the full local stack with Docker:

```bash
make test
make run
make down
```

When running locally:

- Kibana: `http://localhost:5601`
- Grafana: `http://localhost:3000`
- Elasticsearch: `http://localhost:9200`

Default demo credentials:

- Elasticsearch/Kibana: `elastic` / `elastic`
- Grafana: `admin` / `admin`

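Before opening the UIs it can help to wait until each service answers. The sketch below assumes `curl` is installed. `wait_for_url` is a hypothetical helper, not part of this repository, and the health paths in the example calls (`/api/health` for Grafana, `/api/status` for Kibana) are the usual endpoints of those products rather than anything this project configures.

```bash
# Poll a URL until curl gets a successful HTTP response.
# Arguments: URL, attempts (default 30), delay in seconds (default 2),
# then any extra curl options such as -u user:password.
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i
  # Drop the (up to) three positional arguments; the rest go to curl.
  shift $(( $# < 3 ? $# : 3 ))
  for ((i = 1; i <= attempts; i++)); do
    # -f makes curl fail on HTTP errors, -s hides the progress meter.
    if curl -fs -o /dev/null "$@" "$url"; then
      echo "ready: $url"
      return 0
    fi
    sleep "$delay"
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}

# Example calls against the local demo stack (assumed to be running):
#   wait_for_url http://localhost:9200 30 2 -u elastic:elastic
#   wait_for_url http://localhost:5601/api/status 30 2 -u elastic:elastic
#   wait_for_url http://localhost:3000/api/health
```

Passing the credentials as extra `curl` options keeps them out of the function body, which makes it easier to swap the demo defaults for real secrets later.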
## Validation

```bash
make test
docker compose config --quiet
```

`make test` also checks that all bind-mounted config files and directories exist.

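A check for bind-mounted paths could look roughly like the following. This is an assumption about the approach, not the actual `make test` recipe: `check_paths_exist` is a hypothetical helper, and the path list in the usage comment is taken from the Core components list rather than from `docker-compose.yml` itself.

```bash
# Report every path that does not exist on disk.
# Returns 0 when all paths exist and 1 when at least one is missing.
check_paths_exist() {
  local missing=0 path
  for path in "$@"; do
    # -e is true for both regular files and directories.
    if [ ! -e "$path" ]; then
      echo "missing: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call from the project directory (paths assumed from this README):
#   check_paths_exist alerting/alert_rules.yml grafana kibana logstash filebeat elasticsearch
```

Reporting every missing path before failing, instead of stopping at the first one, lets a reviewer fix all of them in one pass.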
## Example Output

```text
[2026-04-29 04:18:23] WARN Database connection pool nearing capacity
[2026-04-29 04:18:28] ERROR Database connection pool exhausted
[2026-04-29 04:18:33] ERROR Database query timeout occurred
[2026-04-29 04:18:44] INFO Database connections restored
```

Additional examples are available in [examples/alert-output.txt](examples/alert-output.txt) and [examples/sample-log.txt](examples/sample-log.txt).

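Logs in this shape can be reviewed without starting the stack. As a small sketch, the hypothetical helper `count_levels` below tallies lines per severity with `awk`. It assumes every relevant line follows the `[timestamp] LEVEL message` layout shown above, which may not hold for every file under `examples/`.

```bash
# Count log lines per level for lines shaped like "[date time] LEVEL message".
# Reads the files given as arguments, or standard input when there are none.
count_levels() {
  awk '
    # The timestamp contains one space, so the level is the third field.
    /^\[[0-9-]+ [0-9:]+\] / { counts[$3]++ }
    END { for (level in counts) print level, counts[level] }
  ' "$@" | sort
}

count_levels <<'EOF'
[2026-04-29 04:18:23] WARN Database connection pool nearing capacity
[2026-04-29 04:18:28] ERROR Database connection pool exhausted
[2026-04-29 04:18:33] ERROR Database query timeout occurred
[2026-04-29 04:18:44] INFO Database connections restored
EOF
# Prints:
# ERROR 2
# INFO 1
# WARN 1
```

The same function accepts file names, for example `count_levels examples/sample-log.txt`. Lines that do not match the pattern are ignored.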
## Interview Talking Points

- When to use Zabbix checks versus ELK log analysis.
- How Filebeat, Logstash, and Elasticsearch fit into a basic log pipeline.
- How incident simulations create evidence for troubleshooting discussions.
- Why local demo credentials and single-node Elasticsearch are not production architecture.

## Roadmap

- Add curated Grafana and Kibana dashboards.
- Add Prometheus metrics collection.
- Add distributed tracing with Jaeger or OpenTelemetry.
- Add synthetic monitoring checks.