
Managing Multiple PyTorch CUDA Builds with uv

#pytorch #deep-learning #uv

Dependency hell problem

You finally get your model training to work on your local machine, only to push it to the GPU cluster and watch it immediately crash with a

RuntimeError: CUDA error: no kernel image is available for execution on the device

Your local machine is on CUDA 12.8; the server is stuck on 11.8. Now you’re staring at a pip install and a requirements.txt that feel like a game of Russian roulette.

Astral uv as a solution

[Image: the blazingly fast uv package manager logo]

Astral uv is an extremely fast Python package and project manager written in Rust. Over the last few years it has become a de facto standard for managing Python projects. It just works.

Managing torch with uv

Beyond uv’s general benefits, handling multiple PyTorch CUDA builds in a single project is non-trivial and not well documented.

After piecing together several sources, experimenting on my own, testing with multiple Docker containers, multiple GPUs (GTX 1080 Ti, RTX 3060…), and multiple compute capabilities (CC 6.1 and CC 8.6), I finally found a solution.

1. Set up your uv project

If you already have a working uv project, you can skip this step. Otherwise, you can start a basic project by running

uv init

This will initialize a git repository, a project skeleton, and a pyproject.toml for you; the virtual environment is created automatically the first time you run uv sync or uv run.

2. Which torch version do you need?

You’ll need to determine which CUDA versions you need to support. You can check the CUDA version of your local machine by running

nvcc --version

A great resource is the PyTorch compatibility matrix. Gather the CUDA versions you need to support and the corresponding torch versions.
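If you want to automate this check, say in a provisioning script, the release number can be pulled out of the nvcc output with a small parser. The sample output below is illustrative; yours will differ:

```python
import re

# Sample output of `nvcc --version` (illustrative -- the dates and
# build strings are assumptions, only the "release X.Y" part matters).
SAMPLE_NVCC_OUTPUT = """\
nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 11.8, V11.8.89
"""

def cuda_release(nvcc_output: str) -> str:
    """Extract the CUDA release (e.g. '11.8') from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    if match is None:
        raise ValueError("could not find a CUDA release in nvcc output")
    return match.group(1)

print(cuda_release(SAMPLE_NVCC_OUTPUT))  # -> 11.8
```

In practice you would feed it the real output, e.g. via subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout.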

3. Structure your dependencies

We will rely on Python’s optional dependencies, known as extras. Here’s a Stack Overflow discussion about Python’s extras if you want more background.

In your pyproject.toml, you can define extras for each CUDA version. We’ll walk through an example that supports CUDA 12.6 and CUDA 11.8.

These are our extras and their dependencies, declared in the [project.optional-dependencies] section of pyproject.toml. Alternatively, you can add them with the uv add command and the --optional flag; either way works.

[project.optional-dependencies]
cu126 = ["torch==2.7.1", "torchvision==0.22.1", "torchaudio==2.7.1"]
cu118 = ["torch==2.7.1", "torchvision==0.22.1", "torchaudio==2.7.1"]

Since these two extras are mutually incompatible (they pin the same package versions but pull wheels built for different CUDA versions), we need to declare them as conflicting:

[tool.uv]
conflicts = [[{ extra = "cu126" }, { extra = "cu118" }]]

After that, we need to define an index for each CUDA build. Setting explicit = true tells uv to use these indexes only for packages explicitly pinned to them via tool.uv.sources, so your other dependencies still come from PyPI.

[[tool.uv.index]]
name = "pytorch-cu126"
url = "https://download.pytorch.org/whl/cu126"
explicit = true

[[tool.uv.index]]
name = "pytorch-cu118"
url = "https://download.pytorch.org/whl/cu118"
explicit = true

Finally, we need to map each package to the correct index based on which extra is being installed.

[tool.uv.sources]
torch = [
  { index = "pytorch-cu126", extra = "cu126" },
  { index = "pytorch-cu118", extra = "cu118" },
]
torchvision = [
  { index = "pytorch-cu126", extra = "cu126" },
  { index = "pytorch-cu118", extra = "cu118" },
]
torchaudio = [
  { index = "pytorch-cu126", extra = "cu126" },
  { index = "pytorch-cu118", extra = "cu118" },
]

4. Install in a new environment

After setting up your project, you can simply run

uv sync --extra cu126
# or
uv sync --extra cu118
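If you script your environment setup (for example in a Docker entrypoint), it can help to derive the extra name from the detected CUDA version. The mapping below mirrors the two extras defined above; extra_for is a hypothetical helper, not part of uv:

```python
# Map detected CUDA versions to the extras defined in pyproject.toml.
# These two entries mirror the example project; extend as needed.
SUPPORTED_EXTRAS = {"12.6": "cu126", "11.8": "cu118"}

def extra_for(cuda_version: str) -> str:
    """Turn a CUDA version like '11.8' into the matching extra name."""
    try:
        return SUPPORTED_EXTRAS[cuda_version]
    except KeyError:
        raise ValueError(f"no extra defined for CUDA {cuda_version}") from None

print(f"uv sync --extra {extra_for('11.8')}")  # -> uv sync --extra cu118
```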

Verifying the installation

To confirm everything is working correctly, you can run a quick check:

import torch

print(f"CUDA available: {torch.cuda.is_available()}")  # GPU visible to torch?
print(f"CUDA version: {torch.version.cuda}")           # CUDA version the wheel was built for
print(f"GPU count: {torch.cuda.device_count()}")
if torch.cuda.is_available():
    print(f"Device: {torch.cuda.get_device_name(0)}")

If CUDA is available, you should see the correct version for your build (e.g., 12.6 for cu126 or 11.8 for cu118).

You’re done! Each environment now pulls the torch build that matches its CUDA version.

Try it yourself

The full working example with Docker containers, tests, and reproducible environments is available on GitHub:

👉 github.com/gabrielfruet/uv-torch-cuda

The repo includes Dockerfiles for both CUDA 11.8 and 12.6, along with tests to verify GPU tensor operations work correctly on different architectures.