Linux Fresh Setup Toolkit
Executive summary
The Linux Fresh Setup Toolkit is day-0 bootstrap automation for a clean Ubuntu lab server or workstation. It prepares a host for routine administration, Cockpit, Docker workloads, libvirt/KVM virtual machines, optional NVIDIA diagnostics, bounded logging, practical kernel tuning, and a conservative security baseline.
The scripts are modular and safe to rerun. Optional components remain optional, UFW is not enabled without a specific flag, and an NVIDIA driver is never installed without an explicit version. This is a portfolio and homelab implementation, not a production-certified build standard.
Scope and non-goals
The toolkit supports Ubuntu 24.04 and newer and assumes a systemd-based host
with APT package management. It is suitable for a host such as ailab that may
run WebODM, Open WebUI, Homepage, NVIDIA workloads, or test virtual machines.
It does not:
- Deploy applications, containers, or virtual machines.
- Configure GPU passthrough, VFIO bindings, bridges, or Windows guests.
- Select an NVIDIA driver automatically.
- Define a complete firewall policy or compliance baseline.
- Replace backup, monitoring, patching, or ongoing maintenance processes.
- Claim live validation against every future Ubuntu release.
Why this is separate from ailab-maintenance
This project establishes a fresh host. The sibling AI Lab Maintenance Toolkit handles day-2 health checks, scheduled cleanup, configuration backup, disk monitoring, and VM inventory after a host is operating.
Keeping bootstrap and maintenance separate makes the change boundary clear: this toolkit installs platform capabilities and baseline configuration, while the maintenance toolkit manages recurring operational tasks.
Directory layout
setup/
├── README.md
├── install.sh
├── scripts/
│   ├── 00-preflight.sh
│   ├── 00-platform-guard.inc
│   ├── 01-base-packages.sh
│   ├── 02-shell-profile.sh
│   ├── 03-cockpit.sh
│   ├── 04-docker.sh
│   ├── 05-libvirt.sh
│   ├── 06-nvidia-tools.sh
│   ├── 07-tuning.sh
│   ├── 08-security-baseline.sh
│   └── 99-postcheck.sh
├── files/
│   ├── bashrc.d/ailab.sh
│   ├── docker/daemon.json
│   ├── sysctl/99-ailab.conf
│   └── systemd/journald-ailab-limits.conf
└── docs/
    ├── fresh-install-checklist.md
    ├── cockpit.md
    ├── docker.md
    ├── libvirt.md
    ├── nvidia.md
    └── bash-shell.md
00-platform-guard.inc is an internal sourced helper used by mutating
component scripts; it is not an executable profile.
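A sourced guard of this kind could be sketched as follows. This is not the contents of 00-platform-guard.inc: the function names `is_supported_release` and `platform_guard` are hypothetical, and the sketch assumes the guard reads `ID` and `VERSION_ID` from /etc/os-release to enforce the Ubuntu 24.04 minimum and the root requirement described in this README.

```shell
# Hypothetical guard helpers; names and structure are assumptions.

# Succeeds only for Ubuntu 24.04 or newer, given ID and VERSION_ID values.
is_supported_release() {
  local os_id="$1" version="$2" major minor
  [ "$os_id" = "ubuntu" ] || return 1
  case "$version" in
    ''|*[!0-9.]*|*.*.*|.*|*.) return 1 ;;  # reject anything that is not NN.NN
    *.*) ;;
    *) return 1 ;;
  esac
  major="${version%%.*}"
  minor="${version#*.}"
  # Drop one leading zero so values such as 04 compare as plain decimals.
  major="${major#0}"; major="${major:-0}"
  minor="${minor#0}"; minor="${minor:-0}"
  [ "$major" -gt 24 ] || { [ "$major" -eq 24 ] && [ "$minor" -ge 4 ]; }
}

# Refuses unsupported platforms and non-root callers.
platform_guard() {
  local os_release="${1:-/etc/os-release}" os_id version
  [ -r "$os_release" ] || { echo "cannot read $os_release" >&2; return 1; }
  # Read the fields in subshells so the caller's variables are not overwritten.
  os_id="$(. "$os_release" && printf '%s' "${ID:-}")"
  version="$(. "$os_release" && printf '%s' "${VERSION_ID:-}")"
  if ! is_supported_release "$os_id" "$version"; then
    echo "unsupported platform: ${os_id:-unknown} ${version:-unknown}" >&2
    return 1
  fi
  [ "$(id -u)" -eq 0 ] || { echo "root privileges are required" >&2; return 1; }
}
```

A mutating script would source the helper and call `platform_guard || exit 1` before making any change.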
Supported profiles and flags
| Flag | Result |
|---|---|
| --base | Install operational CLI, diagnostic, storage, and network packages |
| --shell | Install the root AI lab Bash profile |
| --cockpit | Install and enable Cockpit |
| --docker | Install Docker and bounded JSON-file logging |
| --libvirt | Install and enable libvirt/KVM |
| --nvidia-tools | Install NVIDIA and OpenCL diagnostics without a driver |
| --install-nvidia-driver VERSION | Install diagnostics and the named Ubuntu driver package |
| --tuning | Apply journald, sysctl, sensor, and sysstat settings |
| --security | Install and enable fail2ban; install but do not enable UFW |
| --enable-ufw | Run security setup and explicitly enable UFW |
| --all | Run every standard profile without UFW enablement or driver installation |
--install-nvidia-driver implies --nvidia-tools. --enable-ufw implies
--security. With no flags, the installer prints help and makes no changes.
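The implication rules above can be sketched as a small argument parser. This is an illustration rather than the code in install.sh: the function name `parse_flags` and the `want_*` variable names are hypothetical, and only the flag names and their documented effects come from the table.

```shell
# Hypothetical parser; variable and function names are assumptions.
parse_flags() {
  want_base=0 want_shell=0 want_cockpit=0 want_docker=0 want_libvirt=0
  want_nvidia_tools=0 want_tuning=0 want_security=0
  enable_ufw=0 driver_version="" show_help=0

  # No flags: print help and change nothing.
  if [ "$#" -eq 0 ]; then
    show_help=1
    return 0
  fi

  while [ "$#" -gt 0 ]; do
    case "$1" in
      --base) want_base=1 ;;
      --shell) want_shell=1 ;;
      --cockpit) want_cockpit=1 ;;
      --docker) want_docker=1 ;;
      --libvirt) want_libvirt=1 ;;
      --nvidia-tools) want_nvidia_tools=1 ;;
      --tuning) want_tuning=1 ;;
      --security) want_security=1 ;;
      --enable-ufw)
        # Enabling UFW implies the security profile.
        want_security=1 enable_ufw=1 ;;
      --install-nvidia-driver)
        if [ "$#" -lt 2 ]; then
          echo "--install-nvidia-driver requires VERSION" >&2
          return 2
        fi
        # Installing a driver implies the diagnostics profile.
        want_nvidia_tools=1 driver_version="$2"
        shift ;;
      --all)
        # Every standard profile, but never UFW enablement or a driver.
        want_base=1 want_shell=1 want_cockpit=1 want_docker=1 want_libvirt=1
        want_nvidia_tools=1 want_tuning=1 want_security=1 ;;
      *)
        echo "unknown flag: $1" >&2
        return 2 ;;
    esac
    shift
  done
}
```

Because `--all` never touches `enable_ufw` or `driver_version`, the high-impact options stay off unless their own flags appear on the command line.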
Installation examples
Review the scripts and current host access path before execution:
cd labs/linux/setup
./install.sh
sudo ./install.sh --base --shell
sudo ./install.sh --cockpit --docker --libvirt
sudo ./install.sh --all
Explicit high-impact options can be combined with --all:
sudo ./install.sh --all --enable-ufw
sudo ./install.sh --all --install-nvidia-driver 550
The installer runs the read-only preflight once before selected profiles and a postcheck after all successful profile steps.
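The ordering described above (preflight once, selected profiles in sequence, postcheck only after every step succeeds) might be sketched like this. The function name `run_selected` and its calling convention are hypothetical; the real installer may structure this differently.

```shell
# Hypothetical orchestration helper.
# Usage: run_selected PREFLIGHT POSTCHECK STEP...
run_selected() {
  local preflight="$1" postcheck="$2" step
  shift 2

  # Read-only preflight runs exactly once, before any profile.
  "$preflight" || { echo "preflight failed; nothing was changed" >&2; return 1; }

  for step in "$@"; do
    echo "running: $step"
    # Stop at the first failing profile so the failure stays visible.
    "$step" || { echo "step failed: $step" >&2; return 1; }
  done

  # Postcheck runs only when every selected step succeeded.
  "$postcheck"
}
```

For example, `run_selected ./scripts/00-preflight.sh ./scripts/99-postcheck.sh ./scripts/01-base-packages.sh ./scripts/02-shell-profile.sh` would correspond to `--base --shell` under these assumptions.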
Fresh host workflow
- Patch the base Ubuntu installation and confirm console or out-of-band access.
- Review the fresh install checklist.
- Run sudo ./install.sh --base --shell.
- Add only the platform profiles needed by the host.
- Review service state, listening ports, storage, networking, and warnings in the postcheck.
- Reboot if a driver or kernel-related package requires it.
- Capture host-specific configuration and backup requirements separately.
AI lab workflow
A general AI lab host can start with:
sudo ./install.sh --base --shell --cockpit --docker --nvidia-tools --tuning --security
This installs GPU diagnostics but leaves driver choice to the operator. Add libvirt only when the host will run VMs. Enable UFW only after confirming SSH, Cockpit, application, bridge, and VM networking requirements.
Safety model
- Mutating profiles require root and refuse non-Ubuntu systems or Ubuntu older than 24.04.
- Component profiles install their own direct prerequisites.
- Existing managed configuration is changed only when content differs.
- Changed root shell, Docker, journald, and sysctl files receive timestamped backups.
- Existing valid Docker JSON is merged so unrelated settings survive.
- Invalid Docker JSON stops configuration rather than being overwritten.
- UFW and NVIDIA driver installation require explicit flags.
- Package and service failures are not hidden.
- Postcheck warnings flag optional or inactive components without causing an otherwise successful diagnostic run to fail.
APT installation and service restarts are real system changes. Test first on a disposable host and maintain a console path when changing remote access policy.
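The "change only when content differs" and "timestamped backup" rules can be illustrated with a short helper. This is a sketch under assumptions: the function name `install_managed_file`, the timestamp format, and the 0644 mode are hypothetical, while the `.bak` suffix follows the rollback description in this README.

```shell
# Hypothetical helper: copy SRC to DEST only when the content differs,
# keeping a timestamped backup of any file that is about to be replaced.
install_managed_file() {
  local src="$1" dest="$2" backup
  if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
    echo "unchanged: $dest"
    return 0
  fi
  if [ -f "$dest" ]; then
    # Timestamp format is an assumption for illustration.
    backup="$dest.$(date +%Y%m%d-%H%M%S).bak"
    cp -p "$dest" "$backup" || return 1
    echo "backup: $backup"
  fi
  install -m 0644 "$src" "$dest" || return 1
  echo "updated: $dest"
}
```

A caller can restart the owning service only when the output is not `unchanged`, which keeps reruns from producing extra backups or restarts.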
Bash shell profile
The shell profile is installed as /root/.bashrc.d/ailab.sh, and one exact
source line is maintained in /root/.bashrc. It adds concise helpers for
systemd, journals, Docker, libvirt, NVIDIA, ports, archives, and disk usage.
See Bash shell profile for command details and cautions.
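Maintaining one exact source line is a common idempotency pattern. A minimal sketch follows, assuming a hypothetical helper named `ensure_line`; the exact line the toolkit writes to /root/.bashrc is not stated here, so the line in the usage comment is only an example.

```shell
# Hypothetical helper: append LINE to FILE unless an identical line exists.
ensure_line() {
  local file="$1" line="$2"
  touch "$file" || return 1
  # -x matches whole lines, -F treats the line as a fixed string.
  if grep -qxF -- "$line" "$file"; then
    return 0
  fi
  # Add a newline first if the file does not already end with one.
  if [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  printf '%s\n' "$line" >> "$file"
}

# Example (assumed line, not taken from the toolkit):
#   ensure_line /root/.bashrc '[ -f /root/.bashrc.d/ailab.sh ] && . /root/.bashrc.d/ailab.sh'
```

Running the helper repeatedly leaves exactly one copy of the line, so reruns do not grow the file.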
Cockpit setup
Cockpit provides browser-based host, storage, network, package, VM, metrics,
and support-report views. The installer enables cockpit.socket and reports
https://HOSTNAME:9090. cockpit-files is optional because it is not
available in every enabled Ubuntu repository.
See Cockpit setup.
Docker setup
The Ubuntu docker.io package path is preferred. The Docker official
repository is configured only when docker.io is unavailable. The daemon uses
the json-file log driver with five 50 MB files per container.
The toolkit configures log retention only. It does not prune data, deploy Compose applications, or configure an NVIDIA container runtime.
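The log retention described above corresponds to Docker's `log-driver` and `log-opts` daemon settings. The sketch below shows one way to merge them into an existing daemon.json while refusing invalid JSON. It assumes `jq` is available for the merge step; the function names are hypothetical, and the toolkit's own merge logic may differ.

```shell
# Hypothetical helper: the logging fragment for five 50 MB files per container.
docker_logging_fragment() {
  cat <<'JSON'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "5"
  }
}
JSON
}

# Hypothetical helper: print the merged configuration for EXISTING on stdout.
# Nothing is written in place, so the caller decides when to install the result.
merge_daemon_json() {
  local existing="$1"
  if [ ! -s "$existing" ]; then
    # No prior configuration: the fragment is the whole file.
    docker_logging_fragment
    return 0
  fi
  command -v jq >/dev/null 2>&1 || { echo "jq is required to merge" >&2; return 2; }
  # Stop on invalid or non-object JSON rather than overwrite it.
  if ! jq -e 'type == "object"' "$existing" >/dev/null 2>&1; then
    echo "invalid JSON in $existing; leaving it untouched" >&2
    return 1
  fi
  # Recursive merge: unrelated keys survive, logging keys are set.
  jq --argjson add "$(docker_logging_fragment)" '. * $add' "$existing"
}
```

Printing to stdout keeps validation separate from installation: the output can be written to a temporary file, compared with the current configuration, and installed only if it differs.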
See Docker setup.
libvirt/KVM setup
The libvirt profile installs QEMU, OVMF, software TPM support, virt-install,
virt-manager, bridge utilities, and libvirt clients and services. It enables
libvirtd and prints existing guests and networks.
See libvirt/KVM setup.
NVIDIA tooling
The default NVIDIA profile installs nvtop, clinfo, and PCI diagnostics.
It reports detected NVIDIA devices, nvidia-smi, and DKMS state when those
commands exist.
Driver installation requires a numeric version that maps to an available
Ubuntu package, for example nvidia-driver-550. Secure Boot enrollment,
driver suitability, CUDA, container runtime support, and passthrough remain
operator decisions.
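The version rule can be sketched as a validation step that runs before any package is requested. The function names below are hypothetical. The availability check assumes `apt-cache show` exits non-zero for an unknown package name, which is worth confirming on the target release.

```shell
# Hypothetical helper: map a numeric version to an Ubuntu driver package name.
nvidia_driver_package() {
  local version="$1"
  case "$version" in
    ''|*[!0-9]*)
      echo "driver version must be numeric, got '$version'" >&2
      return 2 ;;
  esac
  printf 'nvidia-driver-%s\n' "$version"
}

# Hypothetical helper: succeed only when APT knows the package.
package_available() {
  apt-cache show "$1" >/dev/null 2>&1
}

# Example flow (not executed here):
#   pkg="$(nvidia_driver_package 550)" || exit 2
#   package_available "$pkg" || { echo "$pkg is not available" >&2; exit 1; }
```

Rejecting non-numeric input before touching APT means a typo such as `latest` cannot resolve to an unintended package.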
See NVIDIA tooling.
Tuning
The tuning profile bounds persistent journal use, raises inotify limits for development and container workloads, reduces swappiness, enables sysstat, and runs automatic sensor detection when available.
Review these values against available memory, storage, monitoring retention, and workload behavior before deployment beyond a lab.
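One way to support that review is a read-only comparison between a sysctl fragment and the values currently in effect. The sketch below is an assumption-laden illustration: `sysctl_drift` and `read_live_sysctl` are hypothetical names, it handles only simple single-value `key = value` lines, and the values in the usage comment are examples rather than the contents of files/sysctl/99-ailab.conf.

```shell
# Hypothetical reader: return the running value of one sysctl key.
read_live_sysctl() {
  sysctl -n "$1"
}

# Hypothetical report: compare key=value lines in CONF with running values.
# An alternative reader can be passed as the second argument for testing.
sysctl_drift() {
  local conf="$1" reader="${2:-read_live_sysctl}" key want have
  while IFS='=' read -r key want; do
    key="$(printf '%s' "$key" | tr -d '[:space:]')"
    want="$(printf '%s' "$want" | tr -d '[:space:]')"
    case "$key" in
      ''|'#'*|';'*) continue ;;   # skip blank lines and comments
    esac
    have="$("$reader" "$key" 2>/dev/null)" || have="unreadable"
    if [ "$have" = "$want" ]; then
      printf 'OK   %s=%s\n' "$key" "$want"
    else
      printf 'DIFF %s wants %s, running %s\n' "$key" "$want" "$have"
    fi
  done < "$conf"
}

# Example fragment (illustrative values only):
#   vm.swappiness = 10
#   fs.inotify.max_user_watches = 524288
```

The report changes nothing; settings under /etc/sysctl.d are normally applied with `sysctl --system`.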
Security baseline
The security profile installs UFW and fail2ban and enables fail2ban. It leaves
UFW disabled unless --enable-ufw is present. Explicit UFW enablement permits
OpenSSH and TCP port 9090 before activation.
This is a minimal access-preservation baseline, not a complete host firewall or hardening standard. Application and VM networking may require additional reviewed rules.
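The allow-before-enable ordering can be sketched with a helper that defaults to printing the commands instead of running them. The function name `apply_ufw_baseline` and the dry-run convention are hypothetical. The `ufw` commands themselves are standard, and the `OpenSSH` application profile is assumed to be present, as it normally is when openssh-server is installed.

```shell
# Hypothetical helper: permit SSH and Cockpit, then enable UFW.
# The first argument is a command prefix; the default "echo" prints a dry run.
apply_ufw_baseline() {
  local run="${1:-echo}"
  # Allow rules come first so enabling the firewall cannot cut off access.
  "$run" ufw allow OpenSSH || return 1
  "$run" ufw allow 9090/tcp || return 1
  # --force skips the interactive confirmation prompt.
  "$run" ufw --force enable
}
```

Calling `apply_ufw_baseline` prints the three commands for review; `apply_ufw_baseline sudo` would apply them. A failed allow rule stops the sequence before activation.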
Postcheck
The final script reports:
- Failed systemd units.
- Cockpit, Docker, libvirt, and fail2ban status when installed.
- Running Docker containers and defined virtual machines.
- NVIDIA runtime state.
- Filesystem usage and listening ports.
Warnings require operator review, but the absence of an optional component does not cause the postcheck itself to fail.
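The warn-but-do-not-fail behavior might look like the following sketch. The names `warn`, `check_unit`, and `postcheck_summary` are hypothetical and the output format is illustrative; only the behavior (report, count, still exit successfully) is taken from the description above.

```shell
# Hypothetical postcheck helpers.
warnings=0

warn() {
  printf 'WARN: %s\n' "$*"
  warnings=$((warnings + 1))
}

# Report one unit; an inactive or missing unit is a warning, never a failure.
check_unit() {
  local unit="$1"
  if ! command -v systemctl >/dev/null 2>&1; then
    warn "systemctl not found; skipping $unit"
  elif systemctl is-active --quiet "$unit" 2>/dev/null; then
    printf 'OK: %s is active\n' "$unit"
  else
    warn "$unit is inactive or not installed"
  fi
  return 0
}

postcheck_summary() {
  printf 'postcheck finished with %s warning(s)\n' "$warnings"
  return 0
}
```

A postcheck built this way could loop over `cockpit.socket`, `docker`, `libvirtd`, and `fail2ban` with `check_unit` and end with `postcheck_summary`, leaving the operator to act on the warning count.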
Troubleshooting
Run individual read-only checks after correcting a failed profile:
sudo ./scripts/00-preflight.sh
sudo ./scripts/99-postcheck.sh
systemctl --failed
journalctl -u docker -u libvirtd -u cockpit.socket -u fail2ban
Common failure areas are unavailable APT repositories, unsupported package names on a future Ubuntu release, invalid pre-existing Docker JSON, Secure Boot module signing, disabled CPU virtualization, and remote firewall assumptions.
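For the CPU virtualization case, a quick read-only check can narrow the cause before libvirt is investigated. This sketch uses a hypothetical function name and makes both paths parameters so it can be tried against sample files. CPU flags show what the kernel reports for the processor, and /dev/kvm shows whether the KVM modules are usable, so the two results are best read together.

```shell
# Hypothetical helper: report CPU virtualization flags and KVM device presence.
check_virtualization() {
  local cpuinfo="${1:-/proc/cpuinfo}" kvm_dev="${2:-/dev/kvm}" status=0
  # vmx is the Intel flag, svm is the AMD flag.
  if grep -Eqw 'vmx|svm' "$cpuinfo" 2>/dev/null; then
    echo "OK: CPU advertises vmx or svm"
  else
    echo "WARN: no vmx or svm flag in $cpuinfo; check firmware virtualization settings"
    status=1
  fi
  if [ -e "$kvm_dev" ]; then
    echo "OK: $kvm_dev exists"
  else
    echo "WARN: $kvm_dev is missing; the kvm modules may not be loaded"
    status=1
  fi
  return "$status"
}
```

Running `check_virtualization` with no arguments inspects the live host and returns non-zero when either check warns.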
To roll back a managed configuration, compare the current file with its
timestamped .bak copy, restore the reviewed backup, and restart or reload the
owning service. Package removal is intentionally not automated because it may
affect workloads and dependencies.
Interview talking points
- Why day-0 bootstrap and day-2 maintenance have separate ownership.
- How explicit flags protect firewall and GPU driver decisions.
- Why Docker JSON is validated, backed up, and merged.
- How idempotent content checks prevent backup and restart churn.
- Why preflight and postcheck evidence surround mutating profiles.
- Which virtualization, Secure Boot, IOMMU, and GPU decisions remain manual.
Future improvements
- Add automated tests using disposable Ubuntu VMs.
- Add a documented NVIDIA Container Toolkit profile.
- Add optional non-root administrative user and group membership management.
- Add bridge and VFIO planning checks without applying passthrough changes.
- Add package compatibility matrices after validating future Ubuntu releases.
- Export postcheck results in a structured format for evidence collection.