Pick Your CUDA Device at Program Runtime

Select your CUDA Device

While conducting research for my thesis, there were several times when I wanted to test the results of my work on different CUDA cards. Luckily, I had access to a machine containing four different CUDA cards: a GTX 480, a Tesla C2050, and two Kepler K20Xm's. This is all according to nvidia-smi, of course.

Now, all of the kernels I have been writing recently are designed to run on only a single device (this could very well change in the future, but not now). Normally, when you want to run a CUDA kernel on a different device, you select the device number programmatically with the cudaSetDevice() API call.
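For instance, a minimal sketch of doing it in code might look like this (the kernel is a placeholder, and device 1 is an arbitrary choice, not something from my actual thesis code):

// sketch: hard-coding the device index with cudaSetDevice()
#include <cuda_runtime.h>
#include <stdio.h>

__global__ void myKernel(void) { }  // placeholder kernel

int main(void)
{
    // Hard-code the device index -- changing this means editing
    // the source and recompiling.
    cudaError_t err = cudaSetDevice(1);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    myKernel<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}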

Of course, you might be thinking: "But I have a lot of code! I don't want to make a branch in my git repo, edit all of my source files, recompile, and then run from there!" At least, you should be thinking that...

Luckily, there is a solution, and it is one that lets you select the device(s) your CUDA kernels run on every time you execute your program. Deep in the CUDA Best Practices Guide is a little section on the CUDA_VISIBLE_DEVICES environment variable.

By simply setting the CUDA_VISIBLE_DEVICES variable to your desired device identifier(s), you can change which device your program runs on. This is great for testing kernels across a number of devices rather quickly, without having to recompile every time.

Here is an example:

$ CUDA_VISIBLE_DEVICES=1 ./myCUDAProgram

If your kernels DO support multiple devices, you can provide a comma-separated list like so:

$ CUDA_VISIBLE_DEVICES=1,3 ./myCUDAProgram
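One thing worth knowing: inside your program, the visible devices are renumbered starting from 0, so cudaSetDevice(0) always picks the first device in your CUDA_VISIBLE_DEVICES list. If you want to sanity-check which devices your program actually sees, a quick enumeration sketch like this one (the names and count it prints depend on your machine, of course) does the trick:

// sketch: print every device the CUDA runtime can currently see
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        fprintf(stderr, "No CUDA devices visible\n");
        return 1;
    }

    // Under CUDA_VISIBLE_DEVICES, indices here start at 0 no matter
    // which physical cards you exposed.
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("Device %d: %s\n", i, prop.name);
    }
    return 0;
}

Run it with different CUDA_VISIBLE_DEVICES settings and watch the list (and the numbering) change.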

Not too shabby, eh? Now use this in your CUDA testing! You can thank me later...

Source: CUDA Best Practices Guide