Running ollama & open-webui on Nvidia AGX Orin
It took me about a day to figure this out; hopefully this post comes up in your Google search.
I found that out of the box, ollama could not discover the GPU in the Nvidia AGX Orin. Through various tests I found that:
- Using pytorch, the iGPU was not discovered as a CUDA device:

```python
import torch
torch.cuda.device_count()  # == 0
```
- `nvcc -V` correctly reports CUDA drivers are installed:

```
ross@ubuntu:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:08:11_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
```
- Running the `nvidia/cuda` container image, `device_count()` is also 0
- Ensure `nvidia-container-toolkit` is installed, and dockerd is configured with the nvidia runtime. Flashing the OS will do this.
- Build a container image with pytorch from the cuda image:

```dockerfile
# Dockerfile
FROM nvidia/cuda:12.2.2-runtime-ubuntu22.04
RUN apt-get update && apt-get install -y python3.10 python3-pip
RUN pip install torch
```

```sh
docker build -t torchtest . && \
  docker run --rm --runtime=nvidia torchtest \
    python3 -c 'import torch; print(torch.cuda.device_count())'
```
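As a quick sanity check (a sketch I've added, not part of the original steps), you can confirm dockerd actually knows about the nvidia runtime before debugging further:

```shell
# Verify the nvidia runtime is registered with dockerd;
# on a freshly flashed JetPack image "nvidia" should appear in this list.
docker info | grep -i runtimes

# dockerd picks the runtime up from /etc/docker/daemon.json,
# which nvidia-ctk writes as something like:
#   {
#     "runtimes": {
#       "nvidia": { "path": "nvidia-container-runtime" }
#     }
#   }
```

If `nvidia` is missing from that list, the container tests above will report 0 devices no matter which image you use.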
I eventually stumbled on [dusty-nv/jetson-containers](https://github.com/dusty-nv/jetson-containers). This repo contains a whole host of container images for various use cases that work on Jetson SoCs.
I believe this all boils down to Tegra (the SoC family) having specific CUDA drivers, while pytorch comes bundled with a more generic library set. When you dig through the images, you can find that the ollama image in jetson-containers depends on the cuda image, which pulls drivers matching the JetPack version.
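You can repeat the earlier `device_count()` test with one of the Jetson-specific images to confirm the difference. The image tag below is an example; match it to your L4T/JetPack release:

```shell
# Same test as before, but with the Tegra-aware pytorch build from
# jetson-containers instead of the generic nvidia/cuda base image.
# The r36.2.0 tag is an assumption -- pick the tag for your L4T release.
docker run --rm --runtime=nvidia dustynv/l4t-pytorch:r36.2.0 \
  python3 -c 'import torch; print(torch.cuda.device_count())'
```

On my AGX Orin the generic image printed `0` while the jetson-containers build sees the iGPU.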
To wrap up, you can run ollama and open-webui using a compose config like this:
```yaml
services:
  ollama:
    runtime: nvidia
    image: dustynv/ollama:r36.2.0
    command: ollama serve
    environment:
      OLLAMA_HOST: 0.0.0.0
    volumes:
      - ~/.ollama:/root/.ollama
  openwebui:
    image: ghcr.io/open-webui/open-webui:v0.1.124
    volumes:
      - openwebui:/app/backend/data
    depends_on:
      - ollama
    ports:
      - 0.0.0.0:8080:8080
    environment:
      OLLAMA_BASE_URL: http://ollama:11434
      WEBUI_SECRET_KEY: secret

volumes:
  openwebui: {}
```
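With that saved as `docker-compose.yml`, bringing the stack up looks roughly like this (the model name is just an example, not something the config requires):

```shell
# Start both services in the background
docker compose up -d

# Pull a first model through the ollama container
# (llama3 is an example; any model from the ollama library works)
docker compose exec ollama ollama pull llama3

# open-webui is then reachable on port 8080
curl -s http://localhost:8080 >/dev/null && echo "open-webui is up"
```

Models land in `~/.ollama` on the host thanks to the volume mount, so they survive container restarts.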