Enterprise Infrastructure Simulator

A container-based simulation environment for enterprise Linux infrastructure operations. This project provides Ansible automation for provisioning, patching, hardening, and decommissioning of simulated Linux nodes, along with scripts for scaling and failure simulation.

Overview

The Enterprise Infrastructure Simulator provides an environment for testing and demonstrating infrastructure automation across multiple nodes. It uses Docker containers to simulate Linux nodes and provides Ansible playbooks that cover the node lifecycle, from provisioning through patching and hardening to decommissioning.

Architecture

  • Container Simulation: Docker-based Linux nodes with realistic configurations
  • Ansible Automation: Modular playbooks for infrastructure lifecycle management
  • Dynamic Inventory: Automated host discovery and grouping
  • Simulation Scripts: Automated scaling and failure injection
  • Scenario Management: Pre-defined operational scenarios

Quick Start

Prerequisites

  • Docker and Docker Compose
  • Ansible 2.9+
  • Make

Setup

# Navigate to the project directory (after cloning the repository)
cd enterprise-infra-simulator

# Start the infrastructure
make up

# Verify deployment
ansible -i inventory/hosts.ini all -m ping

Available Operations

Infrastructure Management

# Provision new nodes
ansible-playbook -i inventory/hosts.ini playbooks/provision.yml

# Apply security patches
make patch

# Harden systems
ansible-playbook -i inventory/hosts.ini playbooks/harden.yml

# Decommission nodes
ansible-playbook -i inventory/hosts.ini playbooks/decommission.yml

# Destroy infrastructure
make destroy

Simulation Operations

# Scale up infrastructure
./scripts/simulate_scaling.sh up 5

# Simulate network failure
./scripts/simulate_failure.sh --type network --duration 300

# Run operational scenario
ansible-playbook -i inventory/hosts.ini scenarios/scaling_event.yml

Project Structure

enterprise-infra-simulator/
├── inventory/              # Ansible inventory files
│   └── hosts.ini          # Dynamic host inventory
├── playbooks/             # Ansible automation playbooks
│   ├── provision.yml      # Node provisioning
│   ├── patch.yml          # Security patching
│   ├── harden.yml         # Security hardening
│   └── decommission.yml   # Node decommissioning
├── scripts/               # Simulation and utility scripts
│   ├── simulate_scaling.sh    # Infrastructure scaling
│   └── simulate_failure.sh    # Failure injection
├── scenarios/             # Operational scenarios
│   └── scaling_event.yml  # Scaling scenario
├── docker-compose.yml     # Container orchestration
├── Makefile              # Build automation
└── README.md

Inventory Management

The simulator uses a dynamic inventory with the following groups:

  • webservers: Web application servers
  • databases: Database servers
  • loadbalancers: Load balancing infrastructure
  • monitoring: Monitoring and logging servers
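To illustrate how these groups might appear in inventory/hosts.ini, the sketch below renders "group host" pairs as INI sections. The function name render_inventory and the host names in the usage note are hypothetical. How the simulator actually builds its inventory is not described here, so treat this as one possible layout rather than the project's implementation.

```shell
# Sketch: read "group host" pairs (one per line) on stdin and print an
# INI-style Ansible inventory with one section per group.
render_inventory() {
  local current="" group host
  # Sort so that all hosts of the same group are adjacent.
  sort -k1,1 -k2,2 | while read -r group host; do
    [ -z "$group" ] && continue
    if [ "$group" != "$current" ]; then
      # Blank line between groups, but not before the first one.
      [ -n "$current" ] && echo
      echo "[$group]"
      current="$group"
    fi
    echo "$host"
  done
}
```

For example, piping the lines "webservers web-1" and "databases db-1" into render_inventory prints a [databases] section followed by a [webservers] section, each listing its hosts.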

Playbooks

Provision Playbook

  • Creates Docker containers with base Linux configurations
  • Installs required packages and services
  • Configures basic networking and security
  • Registers nodes in inventory

Patch Playbook

  • Updates system packages
  • Applies security patches
  • Restarts services as needed
  • Generates patch reports

Harden Playbook

  • Implements CIS security benchmarks
  • Configures firewall rules
  • Hardens SSH configuration
  • Disables unnecessary services

Decommission Playbook

  • Gracefully stops services
  • Exports configuration and data
  • Removes containers
  • Cleans up inventory

Simulation Scripts

Scaling Simulation

./scripts/simulate_scaling.sh <up|down> <count> [type]

Parameters:

  • direction: up or down
  • count: Number of nodes to add or remove
  • type: Node type: web, db, lb, or monitor (optional)
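The parameter rules above can be checked before any container is touched. The following is a minimal sketch, not the script's actual implementation: the function name validate_scaling_args, the exit code 2, and the fallback type web are assumptions made for illustration.

```shell
# Sketch: validate the documented scaling parameters.
# Prints "<direction> <count> <type>" on success; returns 2 on invalid input.
validate_scaling_args() {
  # Falling back to "web" when no type is given is an assumption.
  local direction="${1:-}" count="${2:-}" node_type="${3:-web}"
  case "$direction" in
    up|down) ;;
    *) echo "direction must be 'up' or 'down'" >&2; return 2 ;;
  esac
  # Reject empty values and anything containing a non-digit.
  case "$count" in
    ''|*[!0-9]*) echo "count must be a positive integer" >&2; return 2 ;;
  esac
  if [ "$count" -eq 0 ]; then
    echo "count must be a positive integer" >&2
    return 2
  fi
  case "$node_type" in
    web|db|lb|monitor) ;;
    *) echo "type must be one of: web, db, lb, monitor" >&2; return 2 ;;
  esac
  echo "$direction $count $node_type"
}
```

A wrapper script could call validate_scaling_args "$@" first and exit early when it fails, so that a typo never reaches Docker.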

Failure Simulation

./scripts/simulate_failure.sh --type <failure_type> --duration <seconds>

Failure Types:

  • network: Network connectivity issues
  • disk: Disk space exhaustion
  • service: Service failures
  • node: Complete node outages
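The option handling for the failure script could look roughly like the sketch below. It assumes both options are required and that the duration is a whole number of seconds. The function name parse_failure_args and the exit code are hypothetical, and the real script may parse its arguments differently.

```shell
# Sketch: parse and validate the documented --type and --duration options.
# Prints "<type> <duration>" on success; returns 2 on invalid input.
parse_failure_args() {
  local failure_type="" duration=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --type|--duration)
        # Each option needs a value after it.
        if [ "$#" -lt 2 ]; then
          echo "missing value for $1" >&2
          return 2
        fi
        if [ "$1" = "--type" ]; then failure_type="$2"; else duration="$2"; fi
        shift 2
        ;;
      *)
        echo "unknown option: $1" >&2
        return 2
        ;;
    esac
  done
  case "$failure_type" in
    network|disk|service|node) ;;
    *) echo "type must be one of: network, disk, service, node" >&2; return 2 ;;
  esac
  # Reject an empty duration and anything containing a non-digit.
  case "$duration" in
    ''|*[!0-9]*) echo "duration must be a whole number of seconds" >&2; return 2 ;;
  esac
  echo "$failure_type $duration"
}
```

The options may be given in either order, so parse_failure_args --duration 60 --type disk yields the same result as the documented ordering.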

Scenarios

Pre-defined operational scenarios for testing:

  • Scaling Event: Automated scaling during traffic spikes
  • Disaster Recovery: Node failure and recovery procedures
  • Maintenance Window: Scheduled patching and updates
  • Security Incident: Breach simulation and response

Configuration

Environment Variables

# Number of initial nodes
INFRA_NODE_COUNT=3

# Node types to deploy
INFRA_NODE_TYPES=web,db,lb

# Simulation parameters
SIMULATION_DURATION=3600
SIMULATION_INTENSITY=medium
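A script that reads these variables might fall back to the values shown above when a variable is unset. Whether the project treats them as defaults is an assumption, and the function names load_simulator_defaults and list_node_types are hypothetical.

```shell
# Sketch: use the documented values as fallbacks when a variable is unset or empty.
load_simulator_defaults() {
  : "${INFRA_NODE_COUNT:=3}"
  : "${INFRA_NODE_TYPES:=web,db,lb}"
  : "${SIMULATION_DURATION:=3600}"
  : "${SIMULATION_INTENSITY:=medium}"
  export INFRA_NODE_COUNT INFRA_NODE_TYPES SIMULATION_DURATION SIMULATION_INTENSITY
}

# Print one node type per line from the comma-separated INFRA_NODE_TYPES.
list_node_types() {
  printf '%s\n' "${INFRA_NODE_TYPES}" | tr ',' '\n'
}
```

Values already present in the environment take precedence, so INFRA_NODE_COUNT=7 make up would keep the count at 7 under this scheme.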

Docker Configuration

Container resources and networking are configured in docker-compose.yml:

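services:
  infra-node:
    image: ubuntu:20.04      # base image for each simulated node
    deploy:
      replicas: 3            # number of node containers to start
      resources:
        limits:
          memory: 512M       # memory cap per container
          cpus: '0.5'        # at most half of one CPU core per container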
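One way to change the number of infra-node containers at runtime is the --scale option of docker compose up, which overrides the replica count from the Compose file. The sketch below only computes the target count and prints the command. The function name scale_command is hypothetical, and whether simulate_scaling.sh or the Makefile use this mechanism is an assumption.

```shell
# Sketch: compute a new replica count and print the Compose command
# that would apply it. Nothing is executed.
scale_command() {
  local current="$1" direction="$2" count="$3" target
  if [ "$direction" = "up" ]; then
    target=$((current + count))
  else
    target=$((current - count))
  fi
  # Never scale below zero replicas.
  if [ "$target" -lt 0 ]; then
    target=0
  fi
  echo "docker compose up -d --scale infra-node=${target}"
}
```

Printing the command keeps the sketch safe to run anywhere; piping its output to a shell would apply the change.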

Monitoring and Logging

  • Ansible execution logs: ansible.log
  • Container logs: docker logs <container-name>
  • Simulation logs: logs/simulation.log
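As a quick health check on a run, the Ansible log can be scanned for failed or unreachable hosts, which Ansible reports on lines containing "fatal: [host]". The sketch assumes ansible.log is written through Ansible's log_path setting (or the ANSIBLE_LOG_PATH environment variable); the function name count_ansible_failures is hypothetical.

```shell
# Sketch: count failed or unreachable host results in an Ansible log file.
count_ansible_failures() {
  local log_file="${1:-ansible.log}"
  if [ ! -f "$log_file" ]; then
    echo "log file not found: $log_file" >&2
    return 1
  fi
  # grep -c prints 0 but exits non-zero when nothing matches, so ignore its status.
  grep -c 'fatal: \[' "$log_file" || true
}
```

Calling count_ansible_failures with no argument reads ansible.log in the current directory.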

Troubleshooting

Common Issues

Ansible Connection Failures:

# Check container status
docker ps | grep infra-sim

# Verify SSH connectivity
ansible -i inventory/hosts.ini all -m ping

Container Resource Issues:

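# Show Docker disk usage (images, containers, volumes, build cache)
docker system df

# Remove stopped containers, unused networks, dangling images, and build cache
# (this affects every Docker project on the host, not only the simulator)
docker system prune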

Simulation Script Errors:

# Make the scripts executable
chmod +x scripts/*.sh

# Confirm the script starts and prints its usage
./scripts/simulate_failure.sh --help

Development

Adding New Playbooks

  1. Create playbook in playbooks/ directory
  2. Follow Ansible best practices
  3. Test in check mode (ansible-playbook --check)
  4. Update documentation
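The testing step above could be wrapped in a small helper that runs a syntax check followed by a check-mode dry run. This is a sketch under assumptions: the function name precheck_playbook, the ANSIBLE_PLAYBOOK_BIN override, and the playbook name in the usage note are hypothetical, while --syntax-check, --check, and --diff are standard ansible-playbook options.

```shell
# Sketch: validate a new playbook before running it for real.
precheck_playbook() {
  local playbook="$1" inventory="${2:-inventory/hosts.ini}"
  # ANSIBLE_PLAYBOOK_BIN is a hypothetical override, useful for testing the helper.
  local runner="${ANSIBLE_PLAYBOOK_BIN:-ansible-playbook}"
  # 1. Parse the playbook and report syntax errors without running any task.
  "$runner" -i "$inventory" "$playbook" --syntax-check || return 1
  # 2. Dry run: report what would change, and show diffs, without applying it.
  "$runner" -i "$inventory" "$playbook" --check --diff
}
```

For example, precheck_playbook playbooks/backup.yml would check a hypothetical backup playbook against the default inventory.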

Custom Scenarios

  1. Define scenario in scenarios/ directory
  2. Include required variables
  3. Test with a dry run
  4. Document operational procedures

Security Considerations

  • Containers run with limited privileges
  • SSH keys are generated per deployment
  • Firewall rules are applied automatically
  • Security scanning is integrated into CI/CD

Performance Optimization

  • Container resource limits prevent resource exhaustion
  • Ansible parallel execution for faster operations
  • Efficient failure simulation without full outages
  • Optimized Docker layer caching

Contributing

  1. Follow existing code structure and naming conventions
  2. Add comprehensive documentation
  3. Include tests for new functionality
  4. Update runbooks for operational changes

License

Enterprise Internal Use Only