Polish infrastructure portfolio projects

Mateusz Suski
2026-04-29 23:30:30 +00:00
parent b0537b4bff
commit 8783892241
34 changed files with 762 additions and 1226 deletions
+30 -363
@@ -1,389 +1,56 @@
# Migration Validation Framework
A comprehensive Python CLI tool for validating system migrations through data collection, snapshot comparison, and automated reporting. Designed for enterprise migration workflows where system consistency and data integrity are critical.
## Problem Statement
## Overview
Infrastructure migrations often fail in small, expensive ways: a mount option changes, a service is disabled, or disk usage moves past an operational threshold. Teams need structured evidence that the migrated host still matches the expected operating profile.
The Migration Validation Framework provides a systematic approach to validating system migrations by:
## Solution Overview
- Collecting comprehensive system data before and after migration
- Generating structured JSON snapshots for comparison
- Performing intelligent diff analysis between snapshots
- Generating detailed HTML reports with change visualization
- Providing a CLI interface for integration into migration pipelines
This project provides a Python CLI that collects system state into JSON snapshots and compares before/after files. The output is designed for change records, migration gates, and post-cutover validation.
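As a rough illustration of the snapshot step described above, the sketch below builds a snapshot as a plain dictionary and round-trips it through JSON. The field names (`metadata`, `collected_at`, `systems`) and the helpers `build_snapshot`, `save_snapshot`, and `load_snapshot` are assumptions made for this example, not the project's actual schema or API.

```python
import json
from datetime import datetime, timezone


def build_snapshot(label, systems):
    """Assemble a snapshot dict; `systems` maps hostname -> collected state."""
    return {
        "metadata": {
            "label": label,
            "collected_at": datetime.now(timezone.utc).isoformat(),
        },
        "systems": systems,
    }


def save_snapshot(snapshot, path):
    """Write sorted, indented JSON so text diffs of snapshots stay stable."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle, indent=2, sort_keys=True)


def load_snapshot(path):
    """Read a snapshot back into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Sorting keys on write is a design choice for this sketch: it keeps two snapshots of an unchanged host byte-identical apart from the timestamp.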
## Architecture
## Architecture Overview
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ CLI Interface   │    │ Data            │    │ Validation      │
│ (cli.py)        │◄──►│ Collectors      │◄──►│ Engine          │
│                 │    │                 │    │                 │
│ - Command       │    │ - mounts.py     │    │ - compare.py    │
│   parsing       │    │ - services.py   │    │ - diff.py       │
│ - Workflow      │    │ - disk_usage.py │    │ - validate.py   │
│   orchestration │    │ - network.py    │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ JSON            │    │ Comparison      │    │ HTML            │
│ Snapshots       │    │ Results         │    │ Reports         │
│                 │    │                 │    │                 │
│ - Pre-migration │    │ - Differences   │    │ - Summary       │
│ - Post-migration│    │ - Risk levels   │    │ - Details       │
│ - Metadata      │    │ - Validation    │    │ - Charts        │
└─────────────────┘    └─────────────────┘    └─────────────────┘
Operator -> CLI -> Collectors -> JSON Snapshot -> Comparator -> Diff/Report
```
## Quick Start
Core components:
### Prerequisites
- `cli.py` provides collect, compare, snapshot, list, and report commands.
- `collectors/` gathers mounts, services, and disk usage.
- `validators/compare.py` identifies drift and validation failures.
- `reports/` contains report generation helpers.
- `examples/` contains realistic before/after evidence.
- Python 3.8+
- SSH access to target systems
- Appropriate permissions for data collection
### Installation
## How to Run
```bash
cd migration-validation-framework
pip install -r requirements.txt
python3 cli.py collect --output before.json --systems web01,db01
python3 cli.py collect --output after.json --systems web01,db01
python3 cli.py compare before.json after.json --output diff.json
python3 cli.py compare examples/before.json examples/after.json --output /tmp/migration-diff.json
```
### Basic Usage
Legacy snapshot IDs are still supported:
```bash
# Create pre-migration snapshot
python cli.py snapshot --env production --label pre-migration --systems web01,db01
# Perform migration...
# Create post-migration snapshot
python cli.py snapshot --env production --label post-migration --systems web01,db01
# Compare snapshots
python cli.py compare pre-migration post-migration --output comparison_001
# Generate HTML report
python cli.py report --comparison comparison_001 --format html --output migration_report.html
python3 cli.py snapshot --env prod --label pre --systems web01,db01
python3 cli.py compare prod-pre-20260429_020000 prod-post-20260429_030000 --output change-0429
```
## Project Structure
## Example Output
```
migration-validation-framework/
├── cli.py # Main CLI interface
├── collectors/ # Data collection modules
│ ├── mounts.py # Filesystem mount collection
│ ├── services.py # System services collection
│ ├── disk_usage.py # Disk usage statistics
│ ├── network.py # Network configuration
│ └── processes.py # Running processes
├── validators/ # Validation and comparison logic
│ ├── compare.py # Snapshot comparison engine
│ ├── diff.py # Difference calculation
│ └── validate.py # Validation rules
├── reports/ # Report generation
│ ├── html_report.py # HTML report generator
│ ├── json_report.py # JSON report generator
│ └── summary.py # Summary calculations
├── config/ # Configuration files
│ ├── collectors.yaml # Collector configurations
│ └── validators.yaml # Validation rules
├── tests/ # Unit and integration tests
├── logs/ # Application logs
└── snapshots/ # Stored snapshots
```text
Comparison completed: diff.json (FAIL)
Overall risk: high
Total changes: 4
Failed checks: critical_services_running
Recommendation: restore sshd before production cutover
```
## Data Collectors
Sample inputs and output are available in [examples/before.json](examples/before.json), [examples/after.json](examples/after.json), and [examples/diff.json](examples/diff.json).
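For pipelines that gate on the comparison result, a check along these lines could turn a diff file into an exit code. This is only a sketch: the keys `overall_risk` and `failed_checks` are guessed from the console output shown above and may not match the real layout of `diff.json`, and `gate` is a hypothetical helper.

```python
import json


def gate(diff):
    """Return (exit_code, message) for a comparison result dict.

    The keys read here are hypothetical; adjust them to the real diff.json.
    """
    failed = diff.get("failed_checks", [])
    risk = diff.get("overall_risk", "unknown")
    if failed or risk in ("high", "critical"):
        names = ",".join(failed) if failed else "none"
        return 1, f"FAIL risk={risk} failed_checks={names}"
    return 0, f"PASS risk={risk}"


# Example input shaped like the console output above (assumed structure).
sample = json.loads(
    '{"overall_risk": "high", "failed_checks": ["critical_services_running"]}'
)
exit_code, message = gate(sample)
```

A wrapper script could pass `exit_code` to `sys.exit` so that a failed check stops the cutover job.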
### Mounts Collector (`collectors/mounts.py`)
Collects filesystem mount information including:
- Mount points and devices
- Filesystem types
- Mount options
- Capacity and usage statistics
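A minimal sketch of how mount points, devices, filesystem types, and options could be parsed, assuming the input is text in the `/proc/mounts` format (device, mount point, type, options, two numeric fields). The `parse_mounts` function and the sample data are illustrative and not taken from `collectors/mounts.py`; capacity statistics are not covered here.

```python
def parse_mounts(text):
    """Parse /proc/mounts-style text into a dict keyed by mount point."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        mounts[mount_point] = {
            "device": device,
            "fs_type": fs_type,
            # Sort options so ordering changes do not show up as drift.
            "options": sorted(options.split(",")),
        }
    return mounts


SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /data xfs rw,noatime 0 0
"""

mounts = parse_mounts(SAMPLE_MOUNTS)
```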
## Real-World Use Case
### Services Collector (`collectors/services.py`)
Gathers system service status:
- Running services
- Service states (active, inactive, failed)
- Startup configuration
- Dependencies
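To illustrate the service states listed above, the sketch below parses text shaped like the output of `systemctl list-units --type=service --all --no-legend --plain` (unit, load state, active state, sub state, description). Where the text comes from, the `parse_service_units` helper, and the sample lines are assumptions for this example; the real collector may work differently.

```python
def parse_service_units(text):
    """Map unit name -> load/active/sub state from systemctl-style lines."""
    services = {}
    for line in text.splitlines():
        # Split into at most five fields; the fifth is the free-text description.
        fields = line.split(None, 4)
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        unit, load, active, sub = fields[:4]
        services[unit] = {"load": load, "active": active, "sub": sub}
    return services


SAMPLE_UNITS = """\
sshd.service loaded active running OpenSSH server daemon
nginx.service loaded failed failed A high performance web server
"""

services = parse_service_units(SAMPLE_UNITS)
```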
### Disk Usage Collector (`collectors/disk_usage.py`)
Analyzes disk space utilization:
- Directory size statistics
- File system usage
- Inode usage
- Largest files and directories
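File system usage can be read with the standard library alone. The sketch below uses `shutil.disk_usage`; the `collect_usage` helper and its output keys are hypothetical. Note that the percentage here is `used / total`, which can differ slightly from what `df` reports because `df` excludes blocks reserved for root.

```python
import shutil


def collect_usage(path="/"):
    """Return byte counts and used percentage for the filesystem at `path`."""
    usage = shutil.disk_usage(path)
    # Guard against pseudo-filesystems that report a total of zero.
    percent = round(usage.used / usage.total * 100, 1) if usage.total else 0.0
    return {
        "path": path,
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "free_bytes": usage.free,
        "used_percent": percent,
    }
```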
### Network Collector (`collectors/network.py`)
Captures network configuration:
- Interface configurations
- Routing tables
- DNS settings
- Firewall rules
### Processes Collector (`collectors/processes.py`)
Documents running processes:
- Process lists with PIDs
- Memory and CPU usage
- Process owners
- Command lines
## Validation Engine
### Comparison Logic (`validators/compare.py`)
Performs intelligent comparison of snapshots:
- Structural differences detection
- Semantic change analysis
- Risk level assessment
- Change categorization
### Difference Calculator (`validators/diff.py`)
Calculates detailed differences:
- Added/removed/modified items
- Quantitative changes
- Configuration drift detection
- Anomaly identification
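The added/removed/modified classification can be sketched with set operations on dictionary keys. This is a generic illustration, not the logic in `validators/diff.py`; `diff_items` is a hypothetical name.

```python
def diff_items(before, after):
    """Classify keys of two flat dicts as added, removed, or modified."""
    added = {key: after[key] for key in after.keys() - before.keys()}
    removed = {key: before[key] for key in before.keys() - after.keys()}
    modified = {
        key: {"before": before[key], "after": after[key]}
        for key in before.keys() & after.keys()
        if before[key] != after[key]
    }
    return {"added": added, "removed": removed, "modified": modified}


before = {"sshd": "active", "nginx": "active", "cron": "active"}
after = {"sshd": "inactive", "nginx": "active", "chronyd": "active"}
changes = diff_items(before, after)
```

Here `changes["modified"]` contains only `sshd`, `cron` is reported as removed, and `chronyd` as added.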
### Validation Rules (`validators/validate.py`)
Applies validation rules:
- Critical change detection
- Compliance checking
- Threshold validation
- Custom rule engine
## Reporting
### HTML Reports (`reports/html_report.py`)
Generates comprehensive HTML reports featuring:
- Executive summary dashboard
- Detailed change logs
- Risk assessment visualizations
- Interactive charts and graphs
- Export capabilities
### JSON Reports (`reports/json_report.py`)
Provides structured JSON output for:
- API integration
- Automated processing
- Audit trails
- Compliance reporting
## CLI Interface
### Commands
```bash
# Snapshot management
python cli.py snapshot --env <env> --label <label> [--systems <hosts>]
python cli.py list-snapshots [--env <env>]
python cli.py delete-snapshot <snapshot-id>
# Comparison operations
python cli.py compare <snapshot1> <snapshot2> [--output <comparison-id>]
python cli.py list-comparisons
python cli.py show-comparison <comparison-id>
# Reporting
python cli.py report --comparison <comparison-id> --format <format> [--output <file>]
python cli.py export --comparison <comparison-id> --format <format>
# Configuration
python cli.py config --show
python cli.py config --set <key> <value>
```
### Options
- `--env`: Target environment (production, staging, development)
- `--systems`: Comma-separated list of target systems
- `--parallel`: Number of parallel collection threads
- `--timeout`: Collection timeout in seconds
- `--verbose`: Enable verbose output
- `--dry-run`: Preview actions without execution
## Configuration
### Collector Configuration (`config/collectors.yaml`)
```yaml
collectors:
  mounts:
    enabled: true
    timeout: 30
    exclude_patterns:
      - "/proc/*"
      - "/sys/*"
  services:
    enabled: true
    include_disabled: false
    service_manager: systemd
  disk_usage:
    enabled: true
    max_depth: 3
    exclude_paths:
      - "/tmp"
      - "/var/log"
```
### Validation Rules (`config/validators.yaml`)
```yaml
rules:
  critical_services:
    - sshd
    - systemd
    - network
  filesystem_thresholds:
    warning: 80
    critical: 95
  network_changes:
    allow_new_interfaces: false
    allow_route_changes: false
```
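A sketch of how the rule values above might be applied, with the rules mirrored as a Python dict to avoid a YAML dependency. The helper names, the `"active"` state string, and the choice of `>=` for threshold comparison are assumptions for this example rather than documented behavior of `validators/validate.py`.

```python
RULES = {
    "critical_services": ["sshd", "systemd", "network"],
    "filesystem_thresholds": {"warning": 80, "critical": 95},
}


def check_services(services, rules=RULES):
    """Return the critical services that are not reported as active."""
    return [
        name
        for name in rules["critical_services"]
        if services.get(name) != "active"
    ]


def classify_usage(percent, rules=RULES):
    """Map a used-space percentage to ok, warning, or critical."""
    thresholds = rules["filesystem_thresholds"]
    if percent >= thresholds["critical"]:
        return "critical"
    if percent >= thresholds["warning"]:
        return "warning"
    return "ok"
```

A service missing from the snapshot is treated the same as an inactive one, which errs on the side of flagging the host for review.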
## Examples
### Complete Migration Validation Workflow
```bash
# 1. Pre-migration snapshot
python cli.py snapshot --env production --label "migration-pre-20241201" \
--systems web01,web02,db01,lb01 --parallel 4
# 2. Execute migration process
# ... migration steps ...
# 3. Post-migration snapshot
python cli.py snapshot --env production --label "migration-post-20241201" \
--systems web01,web02,db01,lb01 --parallel 4
# 4. Compare snapshots
python cli.py compare migration-pre-20241201 migration-post-20241201 \
--output migration-dec2024
# 5. Generate reports
python cli.py report --comparison migration-dec2024 --format html \
--output migration_validation_report.html
python cli.py report --comparison migration-dec2024 --format json \
--output migration_validation_data.json
```
### Automated Validation in CI/CD
```bash
#!/bin/bash
# CI/CD validation script
ENVIRONMENT="$1"
SNAPSHOT_LABEL="ci-${BUILD_NUMBER}"
# Create snapshot
python cli.py snapshot --env "$ENVIRONMENT" --label "$SNAPSHOT_LABEL"
# Compare with baseline
python cli.py compare "baseline-$ENVIRONMENT" "$SNAPSHOT_LABEL" --output "ci-$BUILD_NUMBER"
# Generate report
python cli.py report --comparison "ci-$BUILD_NUMBER" --format html
# Check for critical changes
if python cli.py check-critical --comparison "ci-$BUILD_NUMBER"; then
    echo "Migration validation passed"
    exit 0
else
    echo "Critical changes detected - review required"
    exit 1
fi
```
## Security Considerations
- SSH key-based authentication only
- Encrypted snapshot storage
- Access control for sensitive data
- Audit logging of all operations
- Data sanitization and filtering
## Performance Optimization
- Parallel data collection
- Incremental snapshots
- Compressed storage
- Memory-efficient processing
- Timeout handling
## Monitoring and Logging
- Comprehensive logging to `logs/validation.log`
- Performance metrics collection
- Error tracking and alerting
- Audit trail generation
## Troubleshooting
### Common Issues
**Connection Failures:**
```bash
# Check SSH connectivity
ssh -i ~/.ssh/id_rsa user@target-host
# Verify Python availability
python cli.py --test-connection --systems target-host
```
**Collection Timeouts:**
```bash
# Increase timeout
python cli.py snapshot --timeout 300 --systems slow-host
# Check system load
ssh user@target-host uptime
```
**Permission Errors:**
```bash
# Verify sudo access
ssh user@target-host sudo -l
# Check file permissions
ssh user@target-host ls -la /etc/
```
## Development
### Adding New Collectors
1. Create collector module in `collectors/`
2. Implement collection logic
3. Add configuration schema
4. Update CLI interface
5. Add unit tests
### Custom Validation Rules
1. Define rules in `config/validators.yaml`
2. Implement validation logic in `validators/`
3. Update report generation
4. Test with sample data
## Contributing
1. Follow existing code structure and naming conventions
2. Add comprehensive tests for new functionality
3. Update documentation for API changes
4. Ensure backward compatibility
## License
Enterprise Internal Use Only
During a data center migration, a platform team can collect baseline state before cutover, collect the same evidence after DNS or workload migration, and attach the diff to the change ticket. The framework gives reviewers a compact signal on whether the host is ready for production traffic.