GPU

This requires some background to understand why it’s hard. We’ll first explain how it’s done with Docker when you’re not using Prefect.

The NVIDIA toolkit

The first step in getting GPUs working with Docker is the nvidia-container-toolkit. Since Docker 19.03, which added the --gpus flag, you can install the nvidia-container-toolkit on the machine running Docker using yum install or an equivalent, depending on your OS.

Then it’s just a matter of running the following command.

docker run --name my_first_gpu_container --gpus device=0 nvidia/cuda

Note that this is without Prefect.
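One way to confirm that a container can see the GPU is to run nvidia-smi inside it. The following is a minimal sketch that only builds the command line. The helper name gpu_check_command is hypothetical, the image name is taken from the command above, and it assumes the toolkit makes nvidia-smi available inside the container.

```python
import shlex


def gpu_check_command(device="0", image="nvidia/cuda"):
    """Build a `docker run` argument list that runs nvidia-smi on one GPU.

    `device` is the host GPU index passed to --gpus; `image` is assumed
    to be a CUDA-enabled image such as the one used above.
    """
    return [
        "docker", "run", "--rm",
        "--gpus", f"device={device}",
        image,
        "nvidia-smi",
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch works on
    # machines without Docker or a GPU.
    print(shlex.join(gpu_check_command()))
```

On a machine with Docker and the toolkit installed, the resulting list can be passed to subprocess.run(cmd, check=True); if the GPU is visible, nvidia-smi should list it.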

Can I just pass the --gpus flag to DockerRun then?

Some users request that Prefect accept a gpus option when spinning up Docker, so that the GPUs are set on the container. This can’t be done because Prefect uses docker-py under the hood to start the container, and this option is not supported in that package. See this issue.

The Current Workaround

First, set the NVIDIA runtime as the default in /etc/docker/daemon.json. Then you can simply set the environment variable NVIDIA_VISIBLE_DEVICES in the environment that’s running the Docker agent. This might not be an acceptable solution for everyone, since it modifies the default Docker behavior.

More information on editing the configuration can be found here and above.
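As a rough illustration of the daemon.json change, the sketch below builds the settings commonly used to make the NVIDIA runtime the default while preserving any existing options. The helper name with_nvidia_default_runtime is hypothetical, and the runtime path nvidia-container-runtime assumes the binary installed by the toolkit is on the daemon’s PATH; check your own installation before writing the file.

```python
import json


def with_nvidia_default_runtime(existing=None):
    """Return a daemon.json-style dict with NVIDIA as the default runtime.

    `existing` is the current content of daemon.json as a dict (or None).
    Existing keys are preserved; only the runtime settings are added.
    """
    config = dict(existing or {})
    runtimes = dict(config.get("runtimes", {}))
    # Assumption: the toolkit installed `nvidia-container-runtime` on PATH.
    runtimes.setdefault(
        "nvidia",
        {"path": "nvidia-container-runtime", "runtimeArgs": []},
    )
    config["runtimes"] = runtimes
    config["default-runtime"] = "nvidia"
    return config


if __name__ == "__main__":
    # Print the JSON instead of writing /etc/docker/daemon.json directly.
    print(json.dumps(with_nvidia_default_runtime(), indent=4))
```

The printed JSON is what would go into /etc/docker/daemon.json. The Docker daemon has to be restarted before the change takes effect.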

With that in place, the Docker agent can be launched with supervisord using a configuration like the following:

[program:prefect-agent-gpu2]
command=prefect agent docker start -l {OUR_LABELS} -f -n {OUR_NAME} -t ${PREFECT_TOKEN} -e CUDA_VISIBLE_DEVICES="0,1" -e NVIDIA_VISIBLE_DEVICES="6,7"
user=supervisor_user
environment=HOME="/home/supervisor_user",USER="supervisor_user",NVIDIA_VISIBLE_DEVICES="6,7"

The tricky part is that this Docker agent will keep spinning up multiple processes without realizing that the GPUs aren’t shareable. To prevent this, give the agent a distinct label that has a flow concurrency cap of 1.

The key point is that NVIDIA_VISIBLE_DEVICES needs to be set in the Docker agent’s environment. Inside the container, CUDA_VISIBLE_DEVICES then starts out indexed at 0, which is why the example above uses CUDA_VISIBLE_DEVICES="0,1" for host GPUs 6 and 7.
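The re-indexing can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, it handles only numeric GPU indices (not UUIDs or the value all), and it assumes the exposed GPUs are numbered inside the container in ascending host order. CUDA’s own device ordering may differ on some systems.

```python
def container_gpu_mapping(nvidia_visible_devices):
    """Map each exposed host GPU index to its index inside the container.

    Assumption: GPUs are renumbered from 0 in ascending host-index order.
    """
    host_ids = sorted(
        int(part) for part in nvidia_visible_devices.split(",") if part.strip()
    )
    return {host_id: index for index, host_id in enumerate(host_ids)}


def cuda_visible_devices(nvidia_visible_devices):
    """Build the CUDA_VISIBLE_DEVICES value matching the exposed GPUs."""
    mapping = container_gpu_mapping(nvidia_visible_devices)
    return ",".join(str(index) for index in mapping.values())


if __name__ == "__main__":
    print(container_gpu_mapping("6,7"))  # {6: 0, 7: 1}
    print(cuda_visible_devices("6,7"))   # 0,1
```

With NVIDIA_VISIBLE_DEVICES="6,7", this yields "0,1", matching the values passed to the agent in the supervisord configuration.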