Lab Cheatsheet
Quick-reference notes for experiments, rebuilds, and short-lived troubleshooting. Expect rough edges. Capture what worked, what broke, and what should not be repeated in production.
K3s Lab
sudo systemctl status k3s --no-pager
sudo journalctl -u k3s -n 100 --no-pager
kubectl get nodes -o wide
kubectl get pods -A
kubectl get events -A --sort-by=.lastTimestamp | tail -30
sudo k3s kubectl get pods -A
Quick reset:
sudo /usr/local/bin/k3s-uninstall.sh # destructive lab reset (server node; agent-only nodes use k3s-agent-uninstall.sh)
Proxmox Lab
pvesh get /nodes
pvesh get /cluster/resources
qm list
qm config <vmid>
pct list
ha-manager status
Checks before changes:
zpool status
pvesm status
ip -br addr
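The pre-change checks above are easier to compare later if their output is saved. A minimal sketch, assuming a hypothetical helper named `capture_state` and a snapshot directory of your choosing (neither comes from the original notes):

```shell
# Hypothetical helper: run one command and save its output in a snapshot directory.
# Usage: capture_state <dir> <label> <command> [args...]
capture_state() {
  local dir="$1" label="$2" status
  shift 2
  mkdir -p "$dir" || return 1
  # Keep stdout and stderr together. A failing pre-check is still evidence,
  # so the exit status is recorded instead of aborting the snapshot.
  "$@" >"$dir/$label.txt" 2>&1
  status=$?
  echo "exit=$status" >>"$dir/$label.txt"
  return 0
}

# Example use on a Proxmox host (the directory name is an assumption):
#   snap="pre-change-$(date -u +%Y%m%dT%H%M%SZ)"
#   capture_state "$snap" zpool zpool status
#   capture_state "$snap" pvesm pvesm status
#   capture_state "$snap" ip ip -br addr
```

Running the same three captures after the change gives two directories that can be compared with `diff -r`.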
GPU Passthrough
lspci -nn | grep -Ei 'vga|3d|nvidia'
nvidia-smi
dmesg -T | grep -Ei 'vfio|iommu|nvidia'
find /sys/kernel/iommu_groups/ -type l | sort
Good sanity check:
lsmod | grep -E 'vfio|kvm'
Terraform Experiments
terraform fmt -recursive
terraform init
terraform validate
terraform plan
terraform state list
Scratch workflow:
terraform plan -out=tfplan
terraform show tfplan
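When the scratch workflow runs from a script, `terraform plan -detailed-exitcode` reports the outcome through its exit status: 0 for no changes, 1 for an error, 2 for pending changes. A small sketch that turns the status into a readable label; the function name `describe_plan_exit` is made up for illustration:

```shell
# Hypothetical helper: map a `terraform plan -detailed-exitcode` status to a label.
describe_plan_exit() {
  case "$1" in
    0) echo "no changes" ;;
    1) echo "error" ;;
    2) echo "changes pending" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Example use (requires terraform in an initialized working directory):
#   terraform plan -detailed-exitcode -out=tfplan
#   describe_plan_exit "$?"
```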
Networking Labs
ip -br addr
ip route
bridge link
ss -ltnp
tcpdump -ni any port 53
dig +short example.com
mtr -rwzc 10 1.1.1.1
Ansible Testing
ansible-inventory -i inventory/hosts.yml --graph
ansible-playbook -i inventory/hosts.yml playbook.yml --syntax-check
ansible-playbook -i inventory/hosts.yml playbook.yml --check --diff
ansible all -i inventory/hosts.yml -m ping
Docker Testing
docker ps -a
docker logs --tail 100 <container>
docker exec -it <container> sh
docker inspect <container> | jq '.[0].NetworkSettings'
docker system df
Useful Temporary Commands
watch -n2 'kubectl get pods -A'
watch -n2 'nvidia-smi'
watch -n2 'ip -br addr'
while true; do date -u; curl -fsS http://127.0.0.1:8080/health; sleep 2; done
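The polling loop above runs until interrupted. When a check should give up after a while, a bounded variant can help. A minimal sketch, assuming a hypothetical `retry` helper that is not part of the original notes:

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping <delay>
# seconds between tries. Returns 0 on the first success, 1 if every try fails.
# Usage: retry <attempts> <delay> <command> [args...]
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Only sleep when another attempt is coming.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example use, mirroring the health check above:
#   retry 15 2 curl -fsS http://127.0.0.1:8080/health
```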
Quick PoC Commands
python3 -m http.server 8080
openssl req -x509 -newkey rsa:2048 -nodes -days 3 -keyout key.pem -out cert.pem
curl -vk https://127.0.0.1:8443/
nc -lvkp 9000
Troubleshooting Notes
- If K3s pods fail after host reboot, check time sync before chasing cert or API errors.
- If PVCs stay pending in lab clusters, inspect the default storage class first.
- If Docker networking looks broken, compare bridge subnet overlaps with the host route table.
- If GPU pods see no devices, validate driver, toolkit, and device plugin in that order.
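For the pending-PVC note above, the first question is usually whether any storage class is marked as default. A rough sketch that filters the table printed by `kubectl get storageclass`. It assumes the usual layout where the default class is listed as `name (default)`, and the function name `default_storage_classes` is hypothetical:

```shell
# Hypothetical helper: read `kubectl get storageclass` output on stdin and
# print the names of classes marked "(default)".
# Exits non-zero when no default class is found.
default_storage_classes() {
  awk 'NR > 1 && $2 == "(default)" { print $1; found = 1 } END { exit !found }'
}

# Example use (requires a reachable cluster):
#   kubectl get storageclass | default_storage_classes || echo "no default storage class"
```

Parsing the table is convenient for a lab but brittle. Checking the `storageclass.kubernetes.io/is-default-class` annotation is the more dependable signal.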
Useful One-liners
kubectl get pods -A -o wide | grep -E 'CrashLoopBackOff|Error|Pending'
journalctl -p err -S today
find /var/log -type f -mtime -1 -ls | sort -k7,7n
ps -eo pid,%cpu,%mem,cmd --sort=-%cpu | head
grep -RniE 'error|failed|timeout' .
Things Worth Remembering
- Pre-checks still matter in labs. Capture state before trying the risky thing.
- Keep a copy of working configs before rapid iteration.
- Short-lived labs still produce useful evidence; save command output when a fix works.
- If a PoC needs repeated manual repair, turn the repair steps into a script or note.
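Keeping a copy of a working config can be a one-liner habit. A minimal sketch, assuming a hypothetical `backup_config` helper that writes a UTC-timestamped copy next to the original file:

```shell
# Hypothetical helper: copy a file to <file>.<UTC timestamp>.bak and print the
# backup path. Two calls within the same second overwrite the same backup.
backup_config() {
  local src="$1" stamp
  if [ ! -f "$src" ]; then
    echo "not a file: $src" >&2
    return 1
  fi
  stamp=$(date -u +%Y%m%dT%H%M%SZ)
  # -p preserves mode and timestamps so the copy can be restored as-is.
  cp -p -- "$src" "$src.$stamp.bak" && echo "$src.$stamp.bak"
}

# Example use (the path is illustrative):
#   backup_config inventory/hosts.yml
```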