How to Setup Ubuntu Server on Proxmox for NVIDIA GPU AI Development

Your Proxmox host is configured. Your GPU is isolated via passthrough. Now comes the moment that transforms an expensive piece of hardware into a working AI accelerator: creating a lean, stable Ubuntu Server environment that will house your GPU and run your machine learning workloads.

This guide assumes you’ve completed Proxmox host configuration (including GPU passthrough setup from Day 2) and have an Ubuntu Server LTS ISO uploaded to your Proxmox storage. We are not installing a desktop GUI — GNOME or KDE would consume precious RAM that your LLMs need to run. Every megabyte counts when you’re running large language models or complex training jobs.

Prerequisites:

  • Proxmox host with GPU passthrough configured
  • Ubuntu Server LTS ISO available in Proxmox storage
  • VM resources planned (CPU cores, RAM allocation)
  • Network connectivity to your Proxmox host

Phase 1: The Proxmox VM Creation Wizard — Critical AI Settings

Navigate to your Proxmox console and click Create VM. Walk through each tab carefully — the settings here determine stability and GPU recognition.

General Tab

Enter a descriptive VM name (e.g., “ubuntu-ai-server”) and let Proxmox auto-assign a VM ID for reference.

OS Tab

Select Linux as the OS type, then choose your uploaded Ubuntu Server LTS ISO image. Leave “Type” as Linux — Proxmox will detect the distribution automatically.

System Tab — Crucial for Modern GPU Passthrough

This tab is non-negotiable for GPU workloads. Set BIOS to OVMF (UEFI). This is essential — modern NVIDIA GPUs require UEFI firmware to initialise properly. Don’t use SeaBIOS; it won’t work reliably with GPU passthrough.

Check the “Add EFI Disk” box and pick a storage location for it; the EFI disk only stores UEFI variables and is tiny. Without it, the VM won’t boot with UEFI. Set Machine to q35 (the Proxmox default) and SCSI Controller to VirtIO SCSI for optimised storage performance.

CPU Tab — AVX/AVX2 Instruction Sets Matter

Allocate 4–8 cores, depending on your host CPU and workload. The critical setting: set Type to “host”. This is not optional.

The “host” CPU type passes through your physical CPU’s instruction sets — including AVX and AVX2 — directly to the VM. These are essential for optimized AI library operations (PyTorch, TensorFlow, etc.). Without a “host” CPU type, your AI workloads will run significantly slower and miss critical performance optimisations.
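Once the OS is installed (you can come back to this after Phase 2), a quick sanity check from inside the guest confirms the instruction sets made it through. This is just a sketch; the exact flag names you see depend on your CPU:

lscpu | grep -o -w -E 'avx|avx2' | sort -u

You should see both avx and avx2 listed. If avx2 is missing, double-check that the VM’s CPU type is set to “host”.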

Memory Tab — The Most Important Setting for Stability

Allocate a fixed amount of RAM (e.g., 32 GB). Adjust based on your host’s available RAM and planned workload size. This is where stability decisions are made.

UNCHECK “Ballooning Device”. This is non-negotiable. Memory ballooning allows Proxmox to dynamically shrink or expand a VM’s RAM on demand. During intense AI training or inference, dynamic memory resizing causes unpredictable slowdowns, cache misses, and complete crashes. AI workloads require stable, predictable memory allocation. Keep ballooning unchecked.
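If you already created the VM with ballooning enabled, you don’t have to start over. A minimal fix from the Proxmox host shell (VM ID 100 is a placeholder; the memory value is in MB and the change applies on the next VM start):

qm set 100 --balloon 0 --memory 32768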

Network Tab

Leave as default. Proxmox will assign a virtual NIC, and DHCP will provide an IP.

Confirm and Create

Review all settings on the confirmation tab. Click Finish to create the VM.
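If you prefer the command line to the wizard, the same VM can be created with qm on the Proxmox host. This is a sketch under assumptions: VM ID 100, a storage pool named local-lvm, a 100 GB root disk, and an ISO named ubuntu-24.04-live-server-amd64.iso on the local storage. Adjust the ID, storage names, sizes, and ISO filename to match your environment:

# run on the Proxmox host; VM ID, storage names, and ISO path are placeholders
qm create 100 \
  --name ubuntu-ai-server \
  --ostype l26 \
  --machine q35 \
  --bios ovmf \
  --efidisk0 local-lvm:1,efitype=4m \
  --scsihw virtio-scsi-pci \
  --scsi0 local-lvm:100 \
  --cores 8 \
  --cpu host \
  --memory 32768 \
  --balloon 0 \
  --net0 virtio,bridge=vmbr0 \
  --cdrom local:iso/ubuntu-24.04-live-server-amd64.iso

The flags mirror the wizard settings above: OVMF firmware with an EFI disk, q35 machine type, VirtIO SCSI, host CPU type, fixed memory with ballooning disabled, and a default VirtIO network device.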


Phase 2: OS Installation & First Boot

Click Start to power on the VM. Open the Console tab in Proxmox to watch the boot process. The Ubuntu installer will appear within seconds.

Follow the standard installation prompts: language, keyboard layout, network configuration, and storage setup. When you reach the Storage screen, accept the default — use the entire virtual disk that Proxmox created.

Enter a hostname for your VM (e.g., “ai-server”) and create a user account with a strong password. You’ll use this for everyday work.

Critical during installation: When the SSH Setup screen appears, ensure the “Install OpenSSH server” option is CHECKED. This is vital. It allows you to connect to your VM remotely via SSH without relying on Proxmox’s console, which can be slow and cumbersome for daily development.
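If you do miss that checkbox, it isn’t fatal. You can add SSH from the Proxmox console after the first boot:

sudo apt install openssh-server -y
sudo systemctl enable --now ssh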

Let the installation complete and allow the VM to reboot automatically.


Phase 3: Post-Installation Housekeeping — The Must-Dos

Once the VM boots and displays the login prompt, log in with the user credentials you created.

Run the following commands immediately:

sudo apt update && sudo apt upgrade -y

This refreshes the package index and installs any available security patches. The process takes 1–3 minutes, depending on how many updates have been released since your ISO was built and on your connection speed.

Next, install three essential utilities:

sudo apt install qemu-guest-agent neofetch htop -y

Here’s what each does:

qemu-guest-agent allows Proxmox to monitor your VM’s real-time IP address, CPU usage, and memory consumption. More importantly, it enables clean shutdown commands from Proxmox instead of hard resets. Unclean shutdowns can leave GPU memory in an unstable state, potentially corrupting active training jobs.

Neofetch displays system information — useful for quick verification of your Ubuntu version and specs.

htop is an interactive process monitor that shows CPU and memory usage in real time — invaluable for debugging resource contention during AI workloads.
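Before rebooting, it’s worth making sure the guest agent is enabled to start at boot. On Ubuntu it usually is, but this is a cheap precaution:

sudo systemctl enable --now qemu-guest-agent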

After installation completes, reboot the VM:

sudo reboot

Once it boots, verify the QEMU Guest Agent is running:

sudo systemctl status qemu-guest-agent

You should see active (running). If not, restart it:

sudo systemctl restart qemu-guest-agent

Return to the Proxmox console and refresh the VM’s hardware view. You should now see the VM’s IP address displayed under “IPs” instead of “N/A.” This confirms the guest agent is communicating with Proxmox.
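If the IP still shows “N/A”, check that the QEMU Guest Agent option is also enabled on the Proxmox side (the VM’s Options tab, or from the host shell as sketched below; replace 100 with your VM ID). The agent only communicates when both ends are enabled, and the setting takes effect after the VM is stopped and started again:

qm set 100 --agent enabled=1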


Phase 4: The Payoff — Installing NVIDIA Drivers

This is the moment of truth. If GPU passthrough was configured correctly on your Proxmox host, your GPU will be recognised here. If it fails, you’ll need to revisit your host’s IOMMU and blacklist configuration.

The easiest and most reliable method on Ubuntu Server is the built-in ubuntu-drivers tool, which automatically detects your GPU and installs the appropriate driver.

Run:

sudo ubuntu-drivers list

This outputs all available NVIDIA drivers for your hardware. You’ll see something like:

nvidia-driver-470
nvidia-driver-550
nvidia-driver-570
nvidia-driver-570-open

For modern AI workloads, prefer the open-source driver variants (those with the -open suffix). They avoid kernel module taint issues and provide excellent stability. However, if your GPU doesn't support the open-source driver, the proprietary versions work just as well.
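If you already know which driver you want, you can skip autoinstall and install the package directly. A sketch using the 570-open entry from the example list above; substitute whatever your own ubuntu-drivers list shows:

sudo apt install nvidia-driver-570-open -y

Otherwise, let the tool pick for you.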

Install the recommended driver automatically:

sudo ubuntu-drivers autoinstall

This detects your hardware and installs the best-matched stable driver. The process takes 2–5 minutes. You’ll see compiler messages as the kernel module is built and inserted into the running kernel.
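If you want reassurance that the module built against your running kernel before you reboot, and the driver was installed through DKMS (those compiler messages are a hint that it was), you can check:

dkms status

An nvidia entry marked “installed” means the module is ready. If the output is empty, Ubuntu likely used its pre-built signed modules instead, which is also fine.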

Once complete, reboot the VM to load the new kernel module:

sudo reboot

After the reboot, run the legendary command:

nvidia-smi

If GPU passthrough was configured correctly, you will see output similar to the following:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.169                Driver Version: 570.169        CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5090        Off |   00000000:00:10.0 Off |                  N/A |
|  0%   38C    P8             15W /  575W |       2MiB /  32607MiB |      0%      Default |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

What to Look For

GPU Name: Displays your GPU model (RTX 5090, RTX 4090, A100, etc.). This confirms your GPU was passed through successfully.

Driver Version: Shows the installed driver version. If you see this, the driver installation worked.

CUDA Version: Displays the CUDA version supported by this driver (e.g., 12.8). This is important for AI frameworks — PyTorch and TensorFlow check CUDA compatibility.

Memory: Shows total GPU memory available (e.g., 32607 MiB = ~32 GB). This confirms the full VRAM is accessible to the VM.
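For scripts or quick checks you don’t need the full table; nvidia-smi can print just these fields in CSV form:

nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv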

If nvidia-smi fails

If you see “No devices were found” or a driver error, GPU passthrough isn’t working. Return to your Proxmox host and verify:

  • IOMMU is enabled in the host’s GRUB configuration
  • Your GPU is not in the blacklist (check /etc/modprobe.d/blacklist.conf)
  • The GPU is correctly assigned in the VM’s PCI devices tab
  • Your GPU model is supported by NVIDIA drivers (very old or very new GPUs sometimes have issues)
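A couple of quick host-side checks can narrow the problem down. Run these on the Proxmox host, not inside the VM; exact output varies by hardware, so treat this as a diagnostic sketch:

dmesg | grep -i -e dmar -e iommu
lspci -nnk | grep -i -A3 nvidia

The first command should show IOMMU/DMAR initialisation messages if IOMMU is active. The second shows the GPU and which host kernel driver is bound to it; for a properly passed-through GPU you want to see “Kernel driver in use: vfio-pci”.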

Optional Real-Time GPU Monitoring

For continuous GPU usage monitoring, install:

sudo apt install nvtop

Then run:

nvtop

This provides a top-like interface showing GPU memory, utilisation, and running processes in real time. It’s invaluable during development and debugging.


What You’ve Built

You now have a stable, GPU-accelerated Ubuntu Server ready for AI development. The system has fixed memory (no ballooning), proper CPU passthrough with AVX/AVX2 instruction sets, and verified GPU recognition. This is a rock-solid foundation for running LLMs, training models, or performing inference workloads.

Your GPU is no longer a dormant piece of hardware sitting in your Proxmox host. It’s alive, recognised, and ready to accelerate your machine learning work.


What’s Next

But a powerful engine without fuel is useless. In the next phase, you’ll install Ollama — the simplest way to download and run any open-source LLM on your system without wrestling with CUDA Toolkit installation, Python virtual environments, or complex configuration files.

With Ollama, you’ll go from nvidia-smi verification to running a 7B parameter Llama model in under 15 minutes. No dependency hell. No version conflicts. Just instant LLM inference.

The hard work is done. Now comes the fun part: actually using your AI setup.

