Steps:

  1. Get an H100 on Brev. It has to be the Fluidstack one that you can reboot, because the driver install below needs a reboot. Yes, it only has 100 GB of disk space.
  2. Get your keys (the NVIDIA CUDA repo keyring)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
  3. Just blow up their drivers and install better ones
sudo apt-get purge 'nvidia-.*'
sudo apt-get install cuda-drivers-550 nvidia-container-toolkit -y
sudo reboot
  4. Configure Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo rm -f /etc/cdi/nvidia.yaml
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
sudo systemctl restart docker
  5. You may be good to go!
  6. But if you get errors like RuntimeError: cuDNN Frontend error: [cudnn_frontend] Error: No execution plans support the graph. or Could not load library libcuda.so. Error: libcuda.so: cannot open shared object file: No such file or directory, then you need to link libcuda.so within your docker container. (I’ve seen this sometimes, but not on every machine.) If so, you’ll have to cog -p 5000 run bash or docker run ... your container and then run:
ln -s /usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so
python -m cog.server.http  # or whatever command you actually want to run
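
If you end up doing that symlink on every fresh container, something like the following saves retyping and won't fail when the link is already there. This is a minimal sketch, not part of the original setup: link_libcuda is a made-up helper name, and it assumes the driver library sits in the same directory as in the ln -s command above unless you pass a different one.

```shell
#!/usr/bin/env bash
# Sketch: create the unversioned libcuda.so symlink only when it is missing.
# link_libcuda is a hypothetical helper; the default directory is an assumption
# taken from the manual ln -s command in these notes.
link_libcuda() {
  local dir="${1:-/usr/lib/x86_64-linux-gnu}"

  # Nothing to do if libcuda.so already resolves to a real file.
  if [ -e "$dir/libcuda.so" ]; then
    echo "libcuda.so already present in $dir"
    return 0
  fi

  # Without libcuda.so.1 there is nothing to link to; the driver libraries
  # are probably not mounted into this container.
  if [ ! -e "$dir/libcuda.so.1" ]; then
    echo "libcuda.so.1 not found in $dir" >&2
    return 1
  fi

  # -f replaces a dangling libcuda.so link if one was left behind.
  ln -sf "$dir/libcuda.so.1" "$dir/libcuda.so"
  echo "linked $dir/libcuda.so -> $dir/libcuda.so.1"
}
```

Inside the container you would source this, call link_libcuda with no arguments, and then start your server as before. If it reports that libcuda.so.1 is missing, the symlink was never going to help and the problem is more likely on the host or Docker runtime side.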

liveblog, preserved for posterity: