GPU¶
This requires some background to understand why it’s hard. We’ll first explain how it’s done with plain Docker, without Prefect.
The NVIDIA toolkit¶
Step one of getting GPUs working with Docker is using the nvidia-container-toolkit.
Since Docker 19.03, you can install the nvidia-container-toolkit on your image using yum install or an equivalent, depending on your OS.
Then it’s just a matter of running the following command:
docker run --name my_first_gpu_container --gpus device=0 nvidia/cuda
Note that this is without Prefect.
Can I just pass the --gpus flag to DockerRun then?¶
Some users ask that Prefect accept the --gpus flag when spinning up Docker, so that the GPUs are set on the container. This can’t be done because Prefect uses docker-py under the hood to start the container, and this option is not supported in that package. See this issue.
The Current Workaround¶
First, set the NVIDIA runtime as the default in /etc/docker/daemon.json. You can then simply set the env variable NVIDIA_VISIBLE_DEVICES in the environment that’s running the Docker agent.
This might not be an acceptable solution for everyone, since it modifies the default Docker behavior.
More information on editing the configuration can be found here and above.
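As a rough illustration of the daemon.json change described above, here is a minimal Python sketch that adds the NVIDIA runtime as the default to an existing configuration dictionary. The helper name with_nvidia_default_runtime is hypothetical, and the runtime path nvidia-container-runtime is assumed to match what the toolkit installed on your host. Check both against your own setup before writing anything to /etc/docker/daemon.json, which needs root access and a Docker restart.

```python
import json


def with_nvidia_default_runtime(config: dict) -> dict:
    """Return a copy of a daemon.json config with nvidia as the default runtime.

    Hypothetical helper for illustration; it does not touch the filesystem.
    """
    updated = dict(config)
    runtimes = dict(updated.get("runtimes", {}))
    # Register the runtime only if it is missing, so existing settings survive.
    # Assumption: the runtime binary is called "nvidia-container-runtime".
    runtimes.setdefault(
        "nvidia", {"path": "nvidia-container-runtime", "runtimeArgs": []}
    )
    updated["runtimes"] = runtimes
    # This is the setting that changes Docker's default behavior.
    updated["default-runtime"] = "nvidia"
    return updated


if __name__ == "__main__":
    # Print the JSON you would place in /etc/docker/daemon.json.
    print(json.dumps(with_nvidia_default_runtime({}), indent=4))
```

The function returns a new dictionary and leaves any unrelated keys in place, so you can review the result before replacing the real file.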
Now the Docker Agent will do the following when launched with supervisord:
[program:prefect-agent-gpu2]
command=prefect agent docker start -l {OUR_LABELS} -f -n {OUR_NAME} -t ${PREFECT_TOKEN} -e CUDA_VISIBLE_DEVICES="0,1" -e NVIDIA_VISIBLE_DEVICES="6,7"
user=supervisor_user
environment=HOME="/home/supervisor_user",USER="supervisor_user",NVIDIA_VISIBLE_DEVICES="6,7"
The tricky part here is that this Docker Agent will keep spinning up multiple processes without realizing that the GPU resources aren’t shareable. To prevent that, you need a distinct label that has a flow concurrency cap of 1.
The main point is that NVIDIA_VISIBLE_DEVICES needs to be set in the Docker Agent environment. CUDA_VISIBLE_DEVICES will then start out indexed at 0, so in the configuration above, CUDA_VISIBLE_DEVICES="0,1" refers to host GPUs 6 and 7.
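The renumbering can be sketched in a few lines of Python. The functions below are hypothetical helpers for illustration only, not part of Prefect or the NVIDIA toolkit. They assume NVIDIA_VISIBLE_DEVICES holds a comma-separated list of numeric host indices, and they ignore the other values it accepts, such as all, none, or GPU UUIDs.

```python
def visible_gpu_mapping(nvidia_visible_devices: str) -> dict:
    """Map in-container GPU indices (starting at 0) to host GPU ids."""
    host_ids = [d.strip() for d in nvidia_visible_devices.split(",") if d.strip()]
    # The container sees only the exposed GPUs, renumbered from 0 in order.
    return dict(enumerate(host_ids))


def resolve_cuda_devices(nvidia_visible_devices: str, cuda_visible_devices: str) -> list:
    """Return the host GPU ids that CUDA_VISIBLE_DEVICES selects in the container."""
    mapping = visible_gpu_mapping(nvidia_visible_devices)
    return [mapping[int(i)] for i in cuda_visible_devices.split(",")]


if __name__ == "__main__":
    # Values taken from the supervisord configuration above.
    print(resolve_cuda_devices("6,7", "0,1"))  # prints ['6', '7']
```

An index in CUDA_VISIBLE_DEVICES that has no matching exposed GPU raises a KeyError here, which mirrors the fact that the container cannot address a GPU it was not given.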