Technically, the device-side code is in CUDA C, which is really C++. However, no one has written a C compiler that can compile a lightly extended version of ANSI C to PTX, so the device code has to be C-style CUDA C, CUDA Fortran, or hand-written PTX assembly. I went with the easiest option, which was C-style CUDA C.
https://hackertimes.com/item?id=42432802