Polish infrastructure portfolio projects

Mateusz Suski
2026-04-29 23:30:30 +00:00
parent b0537b4bff
commit 8783892241
34 changed files with 762 additions and 1226 deletions
+44 -238
@@ -1,268 +1,74 @@
# Enterprise Infrastructure Simulator
A container-based simulation environment for enterprise Linux infrastructure operations. This project provides Ansible automation for provisioning, patching, hardening, and decommissioning of simulated Linux nodes, along with scripts for scaling and failure simulation.
## Problem Statement
## Overview
Infrastructure teams need a safe place to rehearse lifecycle operations before applying them to production fleets. Patch windows, hardening changes, scale events, and node failures all carry operational risk when they are tested only during real incidents.
The Enterprise Infrastructure Simulator provides an environment for testing and demonstrating infrastructure automation across multiple nodes. It uses Docker containers to simulate Linux nodes and provides Ansible playbooks for provisioning, patching, hardening, and decommissioning.
## Solution Overview
## Architecture
This project models common Linux infrastructure operations with Ansible playbooks and shell-based simulations. It keeps the automation readable and auditable while producing example evidence that resembles a real change record.
- **Container Simulation:** Docker-based Linux nodes with realistic configurations
- **Ansible Automation:** Modular playbooks for infrastructure lifecycle management
- **Dynamic Inventory:** Automated host discovery and grouping
- **Simulation Scripts:** Automated scaling and failure injection
- **Scenario Management:** Pre-defined operational scenarios
## Architecture Overview
## Quick Start
```
Operator -> Make/CLI -> Ansible Inventory -> Playbooks -> Linux Nodes
                |                                |
                v                                v
            Scenarios                      Reports/Logs
```
### Prerequisites
Core components:
- Docker and Docker Compose
- Ansible 2.9+
- Make
- `inventory/hosts.ini` defines managed node groups.
- `playbooks/` contains provisioning, patching, hardening, and decommissioning workflows.
- `scripts/` injects scaling and failure conditions.
- `scenarios/` documents operational exercises.
- `examples/` stores representative outputs for review.
### Setup
## How to Run
```bash
# Clone and navigate to project
cd enterprise-infra-simulator
# Start the infrastructure
make up
# Validate playbook syntax.
make test
# Verify deployment
ansible -i inventory/hosts.ini all -m ping
```
# Provision the simulated estate.
make run
## Available Operations
### Infrastructure Management
```bash
# Provision new nodes
ansible-playbook -i inventory/hosts.ini playbooks/provision.yml
# Apply security patches
# Apply security patches.
make patch
# Harden systems
ansible-playbook -i inventory/hosts.ini playbooks/harden.yml
# Apply host hardening.
make harden
# Decommission nodes
ansible-playbook -i inventory/hosts.ini playbooks/decommission.yml
# Destroy infrastructure
make destroy
# Run the failure and patch demo.
make demo
```
### Simulation Operations
Direct Ansible commands are also supported:
```bash
# Scale up infrastructure
./scripts/simulate_scaling.sh up 5
# Simulate network failure
./scripts/simulate_failure.sh --type network --duration 300
# Run operational scenario
ansible-playbook -i inventory/hosts.ini scenarios/scaling_event.yml
ansible-playbook -i inventory/hosts.ini playbooks/provision.yml
ansible-playbook -i inventory/hosts.ini playbooks/patch.yml
ansible-playbook -i inventory/hosts.ini playbooks/hardening.yml
```
## Project Structure
## Example Output
```
enterprise-infra-simulator/
├── inventory/                  # Ansible inventory files
│   └── hosts.ini               # Dynamic host inventory
├── playbooks/                  # Ansible automation playbooks
│   ├── provision.yml           # Node provisioning
│   ├── patch.yml               # Security patching
│   ├── harden.yml              # Security hardening
│   └── decommission.yml        # Node decommissioning
├── scripts/                    # Simulation and utility scripts
│   ├── simulate_scaling.sh     # Infrastructure scaling
│   └── simulate_failure.sh     # Failure injection
├── scenarios/                  # Operational scenarios
│   └── scaling_event.yml       # Scaling scenario
├── docker-compose.yml          # Container orchestration
├── Makefile                    # Build automation
└── README.md
```text
PLAY RECAP *********************************************************************
web01 : ok=21 changed=7 unreachable=0 failed=0 skipped=3 rescued=0 ignored=1
db01 : ok=18 changed=4 unreachable=0 failed=0 skipped=5 rescued=0 ignored=1
lb01 : ok=16 changed=3 unreachable=0 failed=0 skipped=6 rescued=0 ignored=0
Patch status: SUCCESS
Updates applied: 12
Reboot required: false
```
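A recap like the one above can be checked mechanically before a run is accepted as evidence. The following is a minimal sketch, not part of the project: `recap_failures` is a hypothetical helper name, and it assumes recap lines keep the `host : key=value ...` shape shown in the sample.

```bash
# Sketch only: recap_failures is a hypothetical helper, not a project script.
# Reads Ansible output on stdin and prints every host whose PLAY RECAP line
# reports a non-zero "failed" or "unreachable" count.
recap_failures() {
  awk '
    / : / && /failed=/ {
      host = $1
      bad = 0
      # Walk the key=value fields that follow the host name.
      for (i = 2; i <= NF; i++) {
        split($i, kv, "=")
        if ((kv[1] == "failed" || kv[1] == "unreachable") && kv[2] + 0 > 0) {
          bad = 1
        }
      }
      if (bad) print host
    }
  '
}
```

Piping a saved run log into `recap_failures` prints nothing when every host is clean, so an empty result can gate the next step of a change window.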
## Inventory Management
Additional sample evidence is available in [examples/patch-output.txt](examples/patch-output.txt) and [examples/failure-simulation.txt](examples/failure-simulation.txt).
The simulator uses dynamic inventory with the following groups:
## Real-World Use Case
- `webservers`: Web application servers
- `databases`: Database servers
- `loadbalancers`: Load balancing infrastructure
- `monitoring`: Monitoring and logging servers
## Playbooks
### Provision Playbook
- Creates Docker containers with base Linux configurations
- Installs required packages and services
- Configures basic networking and security
- Registers nodes in inventory
### Patch Playbook
- Updates system packages
- Applies security patches
- Restarts services as needed
- Generates patch reports
### Harden Playbook
- Implements CIS security benchmarks
- Configures firewall rules
- Hardens SSH configuration
- Disables unnecessary services
### Decommission Playbook
- Gracefully stops services
- Exports configuration and data
- Removes containers
- Cleans up inventory
## Simulation Scripts
### Scaling Simulation
```bash
./scripts/simulate_scaling.sh [up|down] [count] [type]
```
Parameters:
- `direction`: up/down
- `count`: Number of nodes to add/remove
- `type`: Node type (web/db/lb/monitor)
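The parameters above could be validated along the following lines. This is a sketch under assumptions rather than the contents of `simulate_scaling.sh`: the function name `parse_scaling_args`, the exit code `2`, and the default node type `web` are all hypothetical.

```bash
# Sketch only: hypothetical validation for
#   simulate_scaling.sh [up|down] [count] [type]
# Prints the normalized arguments on success, returns 2 on invalid input.
parse_scaling_args() {
  local direction="${1:-}"
  local count="${2:-}"
  local type="${3:-web}"   # assumed default; the source does not state one

  case "$direction" in
    up|down) ;;
    *) echo "error: direction must be 'up' or 'down'" >&2; return 2 ;;
  esac

  # Reject empty values and anything containing a non-digit.
  case "$count" in
    ''|*[!0-9]*) echo "error: count must be a positive integer" >&2; return 2 ;;
  esac
  [ "$count" -gt 0 ] || { echo "error: count must be greater than zero" >&2; return 2; }

  case "$type" in
    web|db|lb|monitor) ;;
    *) echo "error: type must be one of web, db, lb, monitor" >&2; return 2 ;;
  esac

  printf '%s %s %s\n' "$direction" "$count" "$type"
}
```

Validating before any container is touched keeps a mistyped count or node type from starting a partial scale event.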
### Failure Simulation
```bash
./scripts/simulate_failure.sh --type [failure_type] --duration [seconds]
```
Failure Types:
- `network`: Network connectivity issues
- `disk`: Disk space exhaustion
- `service`: Service failures
- `node`: Complete node outages
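One way the `--type` and `--duration` flags might be parsed is sketched below. It is illustrative only: `parse_failure_args` and the exit code `2` are hypothetical, and the actual script may handle its options differently.

```bash
# Sketch only: hypothetical flag parsing for
#   simulate_failure.sh --type [failure_type] --duration [seconds]
# Prints "<type> <duration>" on success, returns 2 on invalid input.
parse_failure_args() {
  local type=""
  local duration=""

  while [ "$#" -gt 0 ]; do
    case "$1" in
      --type|--duration)
        # Guard against a flag given without its value before shifting twice.
        [ "$#" -ge 2 ] || { echo "error: $1 needs a value" >&2; return 2; }
        if [ "$1" = "--type" ]; then type="$2"; else duration="$2"; fi
        shift 2
        ;;
      *)
        echo "error: unknown argument: $1" >&2
        return 2
        ;;
    esac
  done

  case "$type" in
    network|disk|service|node) ;;
    *) echo "error: --type must be network, disk, service, or node" >&2; return 2 ;;
  esac

  case "$duration" in
    ''|*[!0-9]*) echo "error: --duration must be a whole number of seconds" >&2; return 2 ;;
  esac

  printf '%s %s\n' "$type" "$duration"
}
```

Checking the argument count before `shift 2` matters because shifting past the end of the argument list fails, which would otherwise leave the loop stuck on the same flag.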
## Scenarios
Pre-defined operational scenarios for testing:
- **Scaling Event:** Automated scaling during traffic spikes
- **Disaster Recovery:** Node failure and recovery procedures
- **Maintenance Window:** Scheduled patching and updates
- **Security Incident:** Breach simulation and response
## Configuration
### Environment Variables
```bash
# Number of initial nodes
INFRA_NODE_COUNT=3
# Node types to deploy
INFRA_NODE_TYPES=web,db,lb
# Simulation parameters
SIMULATION_DURATION=3600
SIMULATION_INTENSITY=medium
```
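A script that consumes these variables could load them as shown in this sketch. Treating the listed values as fallback defaults is an assumption, and `load_simulation_config` and `list_node_types` are hypothetical helper names.

```bash
# Sketch only: assumes the values listed above act as defaults.
# Both function names are hypothetical.
load_simulation_config() {
  # ":=" assigns the default only when the variable is unset or empty.
  : "${INFRA_NODE_COUNT:=3}"
  : "${INFRA_NODE_TYPES:=web,db,lb}"
  : "${SIMULATION_DURATION:=3600}"
  : "${SIMULATION_INTENSITY:=medium}"

  case "$INFRA_NODE_COUNT" in
    ''|*[!0-9]*) echo "error: INFRA_NODE_COUNT must be an integer" >&2; return 2 ;;
  esac
  case "$SIMULATION_DURATION" in
    ''|*[!0-9]*) echo "error: SIMULATION_DURATION must be an integer" >&2; return 2 ;;
  esac
}

# Print one node type per line from the comma-separated list.
list_node_types() {
  printf '%s\n' "${INFRA_NODE_TYPES:-web,db,lb}" | tr ',' '\n'
}
```

Values exported by the operator take precedence, so a single run can be adjusted with `INFRA_NODE_COUNT=5` without editing any file.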
### Docker Configuration
Container resources and networking are configured in `docker-compose.yml`:
```yaml
services:
  infra-node:
    image: ubuntu:20.04
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
```
## Monitoring and Logging
- Ansible execution logs: `ansible.log`
- Container logs: `docker logs <container-name>`
- Simulation logs: `logs/simulation.log`
## Troubleshooting
### Common Issues
**Ansible Connection Failures:**
```bash
# Check container status
docker ps | grep infra-sim
# Verify SSH connectivity
ansible -i inventory/hosts.ini all -m ping
```
**Container Resource Issues:**
```bash
# Check Docker resources
docker system df
# Clean up containers
docker system prune
```
**Simulation Script Errors:**
```bash
# Check script permissions
chmod +x scripts/*.sh
# Verify dependencies
./scripts/simulate_failure.sh --help
```
## Development
### Adding New Playbooks
1. Create playbook in `playbooks/` directory
2. Follow Ansible best practices
3. Test with `--check` mode
4. Update documentation
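The `--check` step can be wrapped in a small helper so every new playbook goes through the same two passes. This is a sketch: `check_playbook` and the `ANSIBLE_PLAYBOOK` override variable are hypothetical, while `--syntax-check`, `--check`, and `--diff` are standard `ansible-playbook` options.

```bash
# Sketch only: check_playbook and ANSIBLE_PLAYBOOK are hypothetical names.
# Runs a syntax check, then a dry run that reports what would change.
check_playbook() {
  local playbook="$1"
  # Allow the runner to be swapped out, for example in tests.
  local runner="${ANSIBLE_PLAYBOOK:-ansible-playbook}"

  "$runner" -i inventory/hosts.ini --syntax-check "$playbook" &&
    "$runner" -i inventory/hosts.ini --check --diff "$playbook"
}
```

Calling `check_playbook` with the path of the new playbook stops at the first failing pass, so a syntax error never reaches the dry run.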
### Custom Scenarios
1. Define scenario in `scenarios/` directory
2. Include required variables
3. Test with dry-run
4. Document operational procedures
## Security Considerations
- Containers run with limited privileges
- SSH keys are generated per deployment
- Firewall rules are applied automatically
- Security scanning integrated in CI/CD
## Performance Optimization
- Container resource limits prevent resource exhaustion
- Ansible parallel execution for faster operations
- Efficient failure simulation without full outages
- Optimized Docker layer caching
## Contributing
1. Follow existing code structure and naming conventions
2. Add comprehensive documentation
3. Include tests for new functionality
4. Update runbooks for operational changes
## License
Enterprise Internal Use Only
A platform team can use this project to demonstrate how routine operating procedures are encoded, reviewed, and tested before production change windows. The same patterns apply to regulated Linux estates where patch evidence, hardening controls, and incident drills must be repeatable.