# NVIDIA Tooling

## Diagnostic-only default

The normal NVIDIA profile installs `nvtop`, `clinfo`, and PCI utilities. It
does not install or select a driver:

```bash
sudo ./install.sh --nvidia-tools
```

Review hardware and current module state:

```bash
lspci -nn | grep -i nvidia
nvidia-smi
dkms status
mokutil --sb-state
```

## Explicit driver installation

Install only a reviewed Ubuntu driver package version:

```bash
sudo ./install.sh --install-nvidia-driver 550
```

The numeric value maps directly to `nvidia-driver-VERSION`. The profile refuses
an unavailable package. Reboot after installation, then validate `nvidia-smi`,
kernel logs, DKMS state, and application behavior.
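The mapping and the availability check described above can be sketched roughly as follows. This is an illustration only: `driver_package` and `package_available` are hypothetical helper names, not functions taken from `install.sh`, and restricting the version to plain digits is an assumption.

```bash
# Hypothetical helpers; install.sh may implement this differently.

# Map a numeric driver branch (for example 550) to its package name.
driver_package() {
  version="${1:-}"
  # Assumption: only plain digits are accepted as a version value.
  case "$version" in
    ''|*[!0-9]*)
      echo "invalid driver version: '$version'" >&2
      return 1
      ;;
  esac
  printf 'nvidia-driver-%s\n' "$version"
}

# Succeed only when APT reports an installable candidate for the package.
package_available() {
  candidate="$(apt-cache policy "${1:-}" 2>/dev/null \
    | awk '/Candidate:/ {print $2}')" || return 1
  [ -n "$candidate" ] && [ "$candidate" != "(none)" ]
}
```

With these definitions, `driver_package 550` prints `nvidia-driver-550`, and `package_available "$(driver_package 550)"` could gate the install step so that an unknown package is rejected before anything is changed.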
## Selection considerations

- GPU generation and supported driver branch.
- Ubuntu release, kernel, and HWE stack.
- Secure Boot module enrollment.
- CUDA or application compatibility.
- Docker NVIDIA Container Toolkit requirements.
- Whether the device will be bound to VFIO instead of the host driver.
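Several of the considerations above can be checked on the host before a driver branch is chosen. The sketch below is one possible way to collect them; `gpu_bound_driver` and `selection_report` are hypothetical helper names, and the parsing assumes the usual `lspci -nnk` layout, where device lines start in the first column and detail lines are indented.

```bash
# Hypothetical helper: read `lspci -nnk` output on stdin and print
# "<slot> <driver>" for every NVIDIA function that has a bound driver.
gpu_bound_driver() {
  awk '
    # A device header starts in the first column; remember its slot.
    /^[^ \t]/ { is_nvidia = (tolower($0) ~ /nvidia/); slot = $1 }
    # Indented detail line naming the driver currently bound.
    is_nvidia && /Kernel driver in use:/ { print slot, $NF }
  '
}

# Print kernel, release, Secure Boot state, and bound GPU drivers.
# Each tool is optional; a missing one is skipped.
selection_report() {
  echo "kernel: $(uname -r)"
  if command -v lsb_release >/dev/null 2>&1; then
    echo "release: $(lsb_release -rs)"
  fi
  if command -v mokutil >/dev/null 2>&1; then
    mokutil --sb-state 2>/dev/null || true
  fi
  if command -v lspci >/dev/null 2>&1; then
    lspci -nnk | gpu_bound_driver
  fi
}
```

A line such as `01:00.0 vfio-pci` in the report would suggest that the device is already bound to VFIO, in which case installing a host driver for it may not be what is wanted.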
## Troubleshooting

```bash
journalctl -k | grep -Ei 'nvidia|nouveau|NVRM'
lsmod | grep -E 'nvidia|nouveau'
dkms status
apt-cache policy 'nvidia-driver-*'
```

Driver rollback is environment-specific and is not automated. Preserve console
access and a known-good kernel before changing GPU or Secure Boot configuration.
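Because rollback is manual, it can help to record the current state before making changes. The following is a minimal sketch under that assumption; `snapshot_gpu_state` is a hypothetical helper that is not part of `install.sh`, and the output file names are illustrative. It only reads state and writes text files.

```bash
# Hypothetical helper: save driver-related state to a directory so that a
# manual rollback has something to compare against.
snapshot_gpu_state() {
  dir="${1:-}"
  if [ -z "$dir" ]; then
    echo "usage: snapshot_gpu_state DIR" >&2
    return 1
  fi
  mkdir -p "$dir" || return 1

  # The running kernel is the known-good candidate to keep installed.
  uname -r > "$dir/running-kernel.txt"

  # Each tool below is optional; a missing one is skipped.
  if command -v dpkg-query >/dev/null 2>&1; then
    # Status abbreviation, name, and version of matching packages.
    dpkg-query -W -f='${db:Status-Abbrev} ${Package} ${Version}\n' \
      'nvidia-*' 'linux-image-*' > "$dir/packages.txt" 2>/dev/null || true
  fi
  if command -v dkms >/dev/null 2>&1; then
    dkms status > "$dir/dkms-status.txt" 2>/dev/null || true
  fi
  if command -v mokutil >/dev/null 2>&1; then
    mokutil --sb-state > "$dir/secure-boot.txt" 2>/dev/null || true
  fi
}
```

Calling `snapshot_gpu_state ./gpu-state-before` ahead of a driver or Secure Boot change leaves a small set of text files that can be diffed against a second snapshot taken afterwards.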