Polish infrastructure portfolio projects

2026-04-29 23:30:30 +00:00
parent b0537b4bff
commit 8783892241
34 changed files with 762 additions and 1226 deletions
@@ -1,389 +1,56 @@
 # Migration Validation Framework

-A comprehensive Python CLI tool for validating system migrations through data collection, snapshot comparison, and automated reporting. Designed for enterprise migration workflows where system consistency and data integrity are critical.
+## Problem Statement

-## Overview
+Infrastructure migrations often fail in small, expensive ways: a mount option changes, a service is disabled, or disk usage moves past an operational threshold. Teams need structured evidence that the migrated host still matches the expected operating profile.

-The Migration Validation Framework provides a systematic approach to validating system migrations by:
+## Solution Overview

- Collecting comprehensive system data before and after migration
- Generating structured JSON snapshots for comparison
- Performing intelligent diff analysis between snapshots
- Generating detailed HTML reports with change visualization
- Providing CLI interface for integration into migration pipelines
+This project provides a Python CLI that collects system state into JSON snapshots and compares before/after files. The output is designed for change records, migration gates, and post-cutover validation.

-## Architecture
+## Architecture Overview

 ```
-┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
-│   CLI Interface │    │   Data          │    │   Validation    │
-│   (cli.py)      │◄──►│   Collectors    │◄──►│   Engine         │
-│                 │    │                 │    │                 │
-│ - Command       │    │ - mounts.py     │    │ - compare.py    │
-│   parsing       │    │ - services.py   │    │ - diff.py       │
-│ - Workflow      │    │ - disk_usage.py │    │ - validate.py   │
-│   orchestration │    │ - network.py    │    │                 │
-└─────────────────┘    └─────────────────┘    └─────────────────┘
-         │                       │                       │
-         ▼                       ▼                       ▼
-┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
-│   JSON          │    │   Comparison    │    │   HTML          │
-│   Snapshots     │    │   Results       │    │   Reports       │
-│                 │    │                 │    │                 │
-│ - Pre-migration │    │ - Differences   │    │ - Summary       │
-│ - Post-migration│    │ - Risk levels   │    │ - Details       │
-│ - Metadata      │    │ - Validation    │    │ - Charts        │
-└─────────────────┘    └─────────────────┘    └─────────────────┘
+Operator -> CLI -> Collectors -> JSON Snapshot -> Comparator -> Diff/Report
 ```

-## Quick Start
+Core components:

-### Prerequisites
+- `cli.py` provides collect, compare, snapshot, list, and report commands.
+- `collectors/` gathers mounts, services, and disk usage.
+- `validators/compare.py` identifies drift and validation failures.
+- `reports/` contains report generation helpers.
+- `examples/` contains realistic before/after evidence.

- Python 3.8+
- SSH access to target systems
- Appropriate permissions for data collection
-
-### Installation
+## How to Run

 ```bash
 cd migration-validation-framework
-pip install -r requirements.txt
+python3 cli.py collect --output before.json --systems web01,db01
+python3 cli.py collect --output after.json --systems web01,db01
+python3 cli.py compare before.json after.json --output diff.json
+python3 cli.py compare examples/before.json examples/after.json --output /tmp/migration-diff.json
 ```

-### Basic Usage
+Legacy snapshot IDs are still supported:

 ```bash
-# Create pre-migration snapshot
-python cli.py snapshot --env production --label pre-migration --systems web01,db01
-
-# Perform migration...
-
-# Create post-migration snapshot
-python cli.py snapshot --env production --label post-migration --systems web01,db01
-
-# Compare snapshots
-python cli.py compare pre-migration post-migration --output comparison_001
-
-# Generate HTML report
-python cli.py report --comparison comparison_001 --format html --output migration_report.html
+python3 cli.py snapshot --env prod --label pre --systems web01,db01
+python3 cli.py compare prod-pre-20260429_020000 prod-post-20260429_030000 --output change-0429
 ```

-## Project Structure
+## Example Output

-```
-migration-validation-framework/
-├── cli.py                 # Main CLI interface
-├── collectors/           # Data collection modules
-│   ├── mounts.py        # Filesystem mount collection
-│   ├── services.py      # System services collection
-│   ├── disk_usage.py    # Disk usage statistics
-│   ├── network.py       # Network configuration
-│   └── processes.py     # Running processes
-├── validators/          # Validation and comparison logic
-│   ├── compare.py       # Snapshot comparison engine
-│   ├── diff.py          # Difference calculation
-│   └── validate.py      # Validation rules
-├── reports/             # Report generation
-│   ├── html_report.py   # HTML report generator
-│   ├── json_report.py   # JSON report generator
-│   └── summary.py       # Summary calculations
-├── config/              # Configuration files
-│   ├── collectors.yaml  # Collector configurations
-│   └── validators.yaml  # Validation rules
-├── tests/               # Unit and integration tests
-├── logs/                # Application logs
-└── snapshots/           # Stored snapshots
+```text
+Comparison completed: diff.json (FAIL)
+Overall risk: high
+Total changes: 4
+Failed checks: critical_services_running
+Recommendation: restore sshd before production cutover
 ```

-## Data Collectors
+Sample inputs and output are available in [examples/before.json](examples/before.json), [examples/after.json](examples/after.json), and [examples/diff.json](examples/diff.json).

-### Mounts Collector (`collectors/mounts.py`)
-Collects filesystem mount information including:
- Mount points and devices
- Filesystem types
- Mount options
- Capacity and usage statistics
+## Real-World Use Case

-### Services Collector (`collectors/services.py`)
-Gathers system service status:
- Running services
- Service states (active, inactive, failed)
- Startup configuration
- Dependencies
-
-### Disk Usage Collector (`collectors/disk_usage.py`)
-Analyzes disk space utilization:
- Directory size statistics
- File system usage
- Inode usage
- Largest files and directories
-
-### Network Collector (`collectors/network.py`)
-Captures network configuration:
- Interface configurations
- Routing tables
- DNS settings
- Firewall rules
-
-### Processes Collector (`collectors/processes.py`)
-Documents running processes:
- Process lists with PIDs
- Memory and CPU usage
- Process owners
- Command lines
-
-## Validation Engine
-
-### Comparison Logic (`validators/compare.py`)
-Performs intelligent comparison of snapshots:
- Structural differences detection
- Semantic change analysis
- Risk level assessment
- Change categorization
-
-### Difference Calculator (`validators/diff.py`)
-Calculates detailed differences:
- Added/removed/modified items
- Quantitative changes
- Configuration drift detection
- Anomaly identification
-
-### Validation Rules (`validators/validate.py`)
-Applies validation rules:
- Critical change detection
- Compliance checking
- Threshold validation
- Custom rule engine
-
-## Reporting
-
-### HTML Reports (`reports/html_report.py`)
-Generates comprehensive HTML reports featuring:
- Executive summary dashboard
- Detailed change logs
- Risk assessment visualizations
- Interactive charts and graphs
- Export capabilities
-
-### JSON Reports (`reports/json_report.py`)
-Provides structured JSON output for:
- API integration
- Automated processing
- Audit trails
- Compliance reporting
-
-## CLI Interface
-
-### Commands
-
-```bash
-# Snapshot management
-python cli.py snapshot --env <env> --label <label> [--systems <hosts>]
-python cli.py list-snapshots [--env <env>]
-python cli.py delete-snapshot <snapshot-id>
-
-# Comparison operations
-python cli.py compare <snapshot1> <snapshot2> [--output <comparison-id>]
-python cli.py list-comparisons
-python cli.py show-comparison <comparison-id>
-
-# Reporting
-python cli.py report --comparison <comparison-id> --format <format> [--output <file>]
-python cli.py export --comparison <comparison-id> --format <format>
-
-# Configuration
-python cli.py config --show
-python cli.py config --set <key> <value>
-```
-
-### Options
-
- `--env`: Target environment (production, staging, development)
- `--systems`: Comma-separated list of target systems
- `--parallel`: Number of parallel collection threads
- `--timeout`: Collection timeout in seconds
- `--verbose`: Enable verbose output
- `--dry-run`: Preview actions without execution
-
-## Configuration
-
-### Collector Configuration (`config/collectors.yaml`)
-
-```yaml
-collectors:
-  mounts:
-    enabled: true
-    timeout: 30
-    exclude_patterns:
-      - "/proc/*"
-      - "/sys/*"
-
-  services:
-    enabled: true
-    include_disabled: false
-    service_manager: systemd
-
-  disk_usage:
-    enabled: true
-    max_depth: 3
-    exclude_paths:
-      - "/tmp"
-      - "/var/log"
-```
-
-### Validation Rules (`config/validators.yaml`)
-
-```yaml
-rules:
-  critical_services:
-    - sshd
-    - systemd
-    - network
-
-  filesystem_thresholds:
-    warning: 80
-    critical: 95
-
-  network_changes:
-    allow_new_interfaces: false
-    allow_route_changes: false
-```
-
-## Examples
-
-### Complete Migration Validation Workflow
-
-```bash
-# 1. Pre-migration snapshot
-python cli.py snapshot --env production --label "migration-pre-20241201" \
-    --systems web01,web02,db01,lb01 --parallel 4
-
-# 2. Execute migration process
-# ... migration steps ...
-
-# 3. Post-migration snapshot
-python cli.py snapshot --env production --label "migration-post-20241201" \
-    --systems web01,web02,db01,lb01 --parallel 4
-
-# 4. Compare snapshots
-python cli.py compare migration-pre-20241201 migration-post-20241201 \
-    --output migration-dec2024
-
-# 5. Generate reports
-python cli.py report --comparison migration-dec2024 --format html \
-    --output migration_validation_report.html
-
-python cli.py report --comparison migration-dec2024 --format json \
-    --output migration_validation_data.json
-```
-
-### Automated Validation in CI/CD
-
-```bash
-#!/bin/bash
-# CI/CD validation script
-
-ENVIRONMENT=$1
-SNAPSHOT_LABEL="ci-${BUILD_NUMBER}"
-
-# Create snapshot
-python cli.py snapshot --env $ENVIRONMENT --label $SNAPSHOT_LABEL
-
-# Compare with baseline
-python cli.py compare baseline-$ENVIRONMENT $SNAPSHOT_LABEL --output ci-$BUILD_NUMBER
-
-# Generate report
-python cli.py report --comparison ci-$BUILD_NUMBER --format html
-
-# Check for critical changes
-if python cli.py check-critical --comparison ci-$BUILD_NUMBER; then
-    echo "Migration validation passed"
-    exit 0
-else
-    echo "Critical changes detected - review required"
-    exit 1
-fi
-```
-
-## Security Considerations
-
- SSH key-based authentication only
- Encrypted snapshot storage
- Access control for sensitive data
- Audit logging of all operations
- Data sanitization and filtering
-
-## Performance Optimization
-
- Parallel data collection
- Incremental snapshots
- Compressed storage
- Memory-efficient processing
- Timeout handling
-
-## Monitoring and Logging
-
- Comprehensive logging to `logs/validation.log`
- Performance metrics collection
- Error tracking and alerting
- Audit trail generation
-
-## Troubleshooting
-
-### Common Issues
-
-**Connection Failures:**
-```bash
-# Check SSH connectivity
-ssh -i ~/.ssh/id_rsa user@target-host
-
-# Verify Python availability
-python cli.py --test-connection --systems target-host
-```
-
-**Collection Timeouts:**
-```bash
-# Increase timeout
-python cli.py snapshot --timeout 300 --systems slow-host
-
-# Check system load
-ssh user@target-host uptime
-```
-
-**Permission Errors:**
-```bash
-# Verify sudo access
-ssh user@target-host sudo -l
-
-# Check file permissions
-ssh user@target-host ls -la /etc/
-```
-
-## Development
-
-### Adding New Collectors
-
-1. Create collector module in `collectors/`
-2. Implement collection logic
-3. Add configuration schema
-4. Update CLI interface
-5. Add unit tests
-
-### Custom Validation Rules
-
-1. Define rules in `config/validators.yaml`
-2. Implement validation logic in `validators/`
-3. Update report generation
-4. Test with sample data
-
-## Contributing
-
-1. Follow existing code structure and naming conventions
-2. Add comprehensive tests for new functionality
-3. Update documentation for API changes
-4. Ensure backward compatibility
-
-## License
-
-Enterprise Internal Use Only
+During a data center migration, a platform team can collect baseline state before cutover, collect the same evidence after DNS or workload migration, and attach the diff to the change ticket. The framework gives reviewers a compact signal on whether the host is ready for production traffic.
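
Reviewer note: the new README describes comparing before/after JSON snapshots and reporting failed checks, but does not show what that comparison looks like. Below is a minimal sketch of the idea, not the repository's actual `validators/compare.py`. The snapshot schema (flat `mounts`, `services`, and `disk_usage` dictionaries), the 90% disk threshold, the `disk_usage_below_threshold` check name, and the function names are all assumptions made for illustration. Only the `critical_services_running` check name and the `sshd` example come from the README's sample output.

```python
import json
from typing import Any, Dict, List

# Hypothetical values for illustration; the real rules may differ.
CRITICAL_SERVICES = {"sshd"}
DISK_USAGE_LIMIT_PCT = 90
SECTIONS = ("mounts", "services", "disk_usage")


def diff_section(before: Dict[str, Any], after: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return added, removed, and modified entries between two flat dicts."""
    changes = []
    for key in sorted(set(before) | set(after)):
        if key not in after:
            changes.append({"key": key, "change": "removed", "before": before[key]})
        elif key not in before:
            changes.append({"key": key, "change": "added", "after": after[key]})
        elif before[key] != after[key]:
            changes.append(
                {"key": key, "change": "modified",
                 "before": before[key], "after": after[key]}
            )
    return changes


def compare_snapshots(before: Dict[str, Any], after: Dict[str, Any]) -> Dict[str, Any]:
    """Diff each section, then evaluate checks against the post-migration state."""
    changes = {
        section: diff_section(before.get(section, {}), after.get(section, {}))
        for section in SECTIONS
    }

    failed_checks = []
    services = after.get("services", {})
    # A critical service that is missing or not "active" fails the check.
    if any(services.get(name) != "active" for name in CRITICAL_SERVICES):
        failed_checks.append("critical_services_running")
    # Any filesystem above the assumed threshold fails the check.
    if any(pct > DISK_USAGE_LIMIT_PCT for pct in after.get("disk_usage", {}).values()):
        failed_checks.append("disk_usage_below_threshold")

    return {
        "status": "FAIL" if failed_checks else "PASS",
        "total_changes": sum(len(items) for items in changes.values()),
        "failed_checks": failed_checks,
        "changes": changes,
    }


if __name__ == "__main__":
    # Made-up snapshots; not the contents of examples/before.json or after.json.
    before = {
        "mounts": {"/data": "rw,noatime"},
        "services": {"sshd": "active", "nginx": "active"},
        "disk_usage": {"/": 61},
    }
    after = {
        "mounts": {"/data": "rw,relatime"},
        "services": {"sshd": "inactive", "nginx": "active"},
        "disk_usage": {"/": 93},
    }
    print(json.dumps(compare_snapshots(before, after), indent=2))
```

A migration gate could key off the `status` field and attach the full `changes` structure to the change ticket. Keeping the diff and the pass/fail checks separate means an expected change, such as a mount option update, is recorded without failing the gate on its own.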