# Linux Fresh Setup Toolkit ## Executive summary The Linux Fresh Setup Toolkit is day-0 bootstrap automation for a clean Ubuntu lab server or workstation. It prepares a host for routine administration, Cockpit, Docker workloads, libvirt/KVM virtual machines, optional NVIDIA diagnostics, bounded logging, practical kernel tuning, and a conservative security baseline. The scripts are modular and safe to rerun. Optional components remain optional, UFW is not enabled without a specific flag, and an NVIDIA driver is never installed without an explicit version. This is a portfolio and homelab implementation, not a production-certified build standard. ## Scope and non-goals The toolkit supports Ubuntu 24.04 and newer and assumes a systemd-based host with APT package management. It is suitable for a host such as `ailab` that may run WebODM, Open WebUI, Homepage, NVIDIA workloads, or test virtual machines. It does not: - Deploy applications, containers, or virtual machines. - Configure GPU passthrough, VFIO bindings, bridges, or Windows guests. - Select an NVIDIA driver automatically. - Define a complete firewall policy or compliance baseline. - Replace backup, monitoring, patching, or ongoing maintenance processes. - Claim live validation against every future Ubuntu release. ## Why this is separate from ailab-maintenance This project establishes a fresh host. The sibling [AI Lab Maintenance Toolkit](../ailab-maintenance/) handles day-2 health checks, scheduled cleanup, configuration backup, disk monitoring, and VM inventory after a host is operating. Keeping bootstrap and maintenance separate makes the change boundary clear: this toolkit installs platform capabilities and baseline configuration, while the maintenance toolkit manages recurring operational tasks. ## Directory layout ```text setup/ ├── README.md ├── install.sh ├── scripts/ │ ├── 00-preflight.sh │ ├── 00-platform-guard.inc │ ├── 01-base-packages.sh │ ├── 02-shell-profile.sh │ ├── 03-cockpit.sh │ ├── 04-docker.sh │ ├── 05-libvirt.sh │ ├── 06-nvidia-tools.sh │ ├── 07-tuning.sh │ ├── 08-security-baseline.sh │ └── 99-postcheck.sh ├── files/ │ ├── bashrc.d/ailab.sh │ ├── docker/daemon.json │ ├── sysctl/99-ailab.conf │ └── systemd/journald-ailab-limits.conf └── docs/ ├── fresh-install-checklist.md ├── cockpit.md ├── docker.md ├── libvirt.md ├── nvidia.md └── bash-shell.md ``` `00-platform-guard.inc` is an internal sourced helper used by mutating component scripts; it is not an executable profile. ## Supported profiles and flags | Flag | Result | | --- | --- | | `--base` | Install operational CLI, diagnostic, storage, and network packages | | `--shell` | Install the root AI lab Bash profile | | `--cockpit` | Install and enable Cockpit | | `--docker` | Install Docker and bounded JSON-file logging | | `--libvirt` | Install and enable libvirt/KVM | | `--nvidia-tools` | Install NVIDIA and OpenCL diagnostics without a driver | | `--install-nvidia-driver VERSION` | Install diagnostics and the named Ubuntu driver package | | `--tuning` | Apply journald, sysctl, sensor, and sysstat settings | | `--security` | Install and enable fail2ban; install but do not enable UFW | | `--enable-ufw` | Run security setup and explicitly enable UFW | | `--all` | Run every standard profile without UFW enablement or driver installation | `--install-nvidia-driver` implies `--nvidia-tools`. `--enable-ufw` implies `--security`. With no flags, the installer prints help and makes no changes. ## Installation examples Review the scripts and current host access path before execution: ```bash cd labs/linux/setup ./install.sh sudo ./install.sh --base --shell sudo ./install.sh --cockpit --docker --libvirt sudo ./install.sh --all ``` Explicit high-impact options can be combined with `--all`: ```bash sudo ./install.sh --all --enable-ufw sudo ./install.sh --all --install-nvidia-driver 550 ``` The installer runs the read-only preflight once before selected profiles and a postcheck after all successful profile steps. ## Fresh host workflow 1. Patch the base Ubuntu installation and confirm console or out-of-band access. 2. Review [the fresh install checklist](docs/fresh-install-checklist.md). 3. Run `sudo ./install.sh --base --shell`. 4. Add only the platform profiles needed by the host. 5. Review service state, listening ports, storage, networking, and warnings in the postcheck. 6. Reboot if a driver or kernel-related package requires it. 7. Capture host-specific configuration and backup requirements separately. ## AI lab workflow A general AI lab host can start with: ```bash sudo ./install.sh --base --shell --cockpit --docker --nvidia-tools --tuning --security ``` This installs GPU diagnostics but leaves driver choice to the operator. Add libvirt only when the host will run VMs. Enable UFW only after confirming SSH, Cockpit, application, bridge, and VM networking requirements. ## Safety model - Mutating profiles require root and refuse non-Ubuntu systems or Ubuntu older than 24.04. - Component profiles install their own direct prerequisites. - Existing managed configuration is changed only when content differs. - Changed root shell, Docker, journald, and sysctl files receive timestamped backups. - Existing valid Docker JSON is merged so unrelated settings survive. - Invalid Docker JSON stops configuration rather than being overwritten. - UFW and NVIDIA driver installation require explicit flags. - Package and service failures are not hidden. - Postcheck warnings report optional or inactive components without masking a successfully completed diagnostic script. APT installation and service restarts are real system changes. Test first on a disposable host and maintain a console path when changing remote access policy. ## Bash shell profile The shell profile is installed as `/root/.bashrc.d/ailab.sh`, and one exact source line is maintained in `/root/.bashrc`. It adds concise helpers for systemd, journals, Docker, libvirt, NVIDIA, ports, archives, and disk usage. See [Bash shell profile](docs/bash-shell.md) for command details and cautions. ## Cockpit setup Cockpit provides browser-based host, storage, network, package, VM, metrics, and support-report views. The installer enables `cockpit.socket` and reports `https://HOSTNAME:9090`. `cockpit-files` is optional because it is not available in every enabled Ubuntu repository. See [Cockpit setup](docs/cockpit.md). ## Docker setup The Ubuntu `docker.io` package path is preferred. The Docker official repository is configured only when `docker.io` is unavailable. The daemon uses the `json-file` log driver with five 50 MB files per container. The toolkit configures log retention only. It does not prune data, deploy Compose applications, or configure an NVIDIA container runtime. See [Docker setup](docs/docker.md). ## libvirt/KVM setup The libvirt profile installs QEMU, OVMF, software TPM support, virt-install, virt-manager, bridge utilities, and libvirt clients and services. It enables `libvirtd` and prints existing guests and networks. See [libvirt/KVM setup](docs/libvirt.md). ## NVIDIA tooling The default NVIDIA profile installs `nvtop`, `clinfo`, and PCI diagnostics. It reports detected NVIDIA devices, `nvidia-smi`, and DKMS state when those commands exist. Driver installation requires a numeric version that maps to an available Ubuntu package, for example `nvidia-driver-550`. Secure Boot enrollment, driver suitability, CUDA, container runtime support, and passthrough remain operator decisions. See [NVIDIA tooling](docs/nvidia.md). ## Tuning The tuning profile bounds persistent journal use, raises inotify limits for development and container workloads, reduces swappiness, enables sysstat, and runs automatic sensor detection when available. Review these values against available memory, storage, monitoring retention, and workload behavior before deployment beyond a lab. ## Security baseline The security profile installs UFW and fail2ban and enables fail2ban. It leaves UFW disabled unless `--enable-ufw` is present. Explicit UFW enablement permits OpenSSH and TCP port 9090 before activation. This is a minimal access-preservation baseline, not a complete host firewall or hardening standard. Application and VM networking may require additional reviewed rules. ## Postcheck The final script reports: - Failed systemd units. - Cockpit, Docker, libvirt, and fail2ban status when installed. - Running Docker containers and defined virtual machines. - NVIDIA runtime state. - Filesystem usage and listening ports. Warnings require operator review but optional component absence does not cause the postcheck itself to fail. ## Troubleshooting Run individual read-only checks after correcting a failed profile: ```bash sudo ./scripts/00-preflight.sh sudo ./scripts/99-postcheck.sh systemctl --failed journalctl -u docker -u libvirtd -u cockpit.socket -u fail2ban ``` Common failure areas are unavailable APT repositories, unsupported package names on a future Ubuntu release, invalid pre-existing Docker JSON, Secure Boot module signing, disabled CPU virtualization, and remote firewall assumptions. To roll back a managed configuration, compare the current file with its timestamped `.bak` copy, restore the reviewed backup, and restart or reload the owning service. Package removal is intentionally not automated because it may affect workloads and dependencies. ## Interview talking points - Why day-0 bootstrap and day-2 maintenance have separate ownership. - How explicit flags protect firewall and GPU driver decisions. - Why Docker JSON is validated, backed up, and merged. - How idempotent content checks prevent backup and restart churn. - Why preflight and postcheck evidence surround mutating profiles. - Which virtualization, Secure Boot, IOMMU, and GPU decisions remain manual. ## Future improvements - Add automated tests using disposable Ubuntu VMs. - Add a documented NVIDIA Container Toolkit profile. - Add optional non-root administrative user and group membership management. - Add bridge and VFIO planning checks without applying passthrough changes. - Add package compatibility matrices after validating future Ubuntu releases. - Export postcheck results in a structured format for evidence collection.