CUDA for Wave Simulation - Part 3: Efficient CUDA
Non-trivial CUDA
The current snippet of code calling CUDA is:
cuda_step<<< 1, 1 >>>
This launches a single block containing a single thread, so one CUDA thread does all the work, which is probably extremely inefficient. To verify this, let’s add some way to time the program. I could use NVIDIA’s nsys, or just the...
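As a rough sketch of what timing a launch could look like, here is one option based on CUDA events; it is not necessarily the approach the post goes on to use. The real signature of `cuda_step` is not shown above, so the kernel body, its `(float *u, int n)` parameters, and the helpers `blocks_for` and `time_launch` are all hypothetical stand-ins for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for cuda_step: the real kernel is not shown,
// so this one just halves every element of an array in place.
// The grid-stride loop keeps it correct for any launch configuration,
// including <<<1, 1>>>, where one thread walks the whole array.
__global__ void cuda_step(float *u, int n) {
    int stride = blockDim.x * gridDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        u[i] = 0.5f * u[i];
    }
}

// Hypothetical helper: number of blocks needed to cover n elements
// (ceiling division).
int blocks_for(int n, int threads_per_block) {
    return (n + threads_per_block - 1) / threads_per_block;
}

// Hypothetical helper: time one kernel launch with CUDA events and
// return the elapsed time in milliseconds.
float time_launch(int blocks, int threads, float *d_u, int n) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    cuda_step<<<blocks, threads>>>(d_u, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until the kernel has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    const int n = 1 << 20;    // arbitrary problem size for the sketch
    const int threads = 256;  // arbitrary block size for the sketch

    float *d_u = nullptr;
    if (cudaMalloc(&d_u, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_u, 0, n * sizeof(float));

    // Warm-up launch so one-time initialization is not measured.
    time_launch(blocks_for(n, threads), threads, d_u, n);

    float single_ms = time_launch(1, 1, d_u, n);
    float many_ms = time_launch(blocks_for(n, threads), threads, d_u, n);

    printf("<<<1, 1>>>: %.3f ms\n", single_ms);
    printf("<<<%d, %d>>>: %.3f ms\n", blocks_for(n, threads), threads, many_ms);

    cudaFree(d_u);
    return 0;
}
```

Compiled with `nvcc`, this prints one timing for the single-thread launch and one for a launch that spreads the same work over many threads. The actual numbers depend on the GPU and on the kernel, so treat them as a comparison rather than a benchmark.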