>I would strongly advise against this. GPUs are highly efficient when neighboring threads within a warp access neighboring data and follow largely the same code path. Even across warps, data locality is highly desirable.
It's a bit like saying that writing code at all is bad, though. Divergence isn't desirable, but neither is running any code at all; sometimes you need it to solve a problem.
Not supporting divergence at all is a huge mistake IMO. It isn't good, but sometimes it's necessary.
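To put a rough number on "isn't good, but sometimes necessary", here is a toy cost model in Python. It is not real GPU code. It assumes the simplest lockstep picture, where a warp runs each side of a branch that any of its lanes takes, with the other lanes masked off. The names `warp_cost`, `cost_then`, and `cost_else` are made up for illustration, and the model ignores reconvergence details, scheduling, and memory effects.

```python
WARP_SIZE = 32  # warp width on current NVIDIA GPUs


def warp_cost(takes_branch, cost_then, cost_else):
    """Instruction slots one warp spends on an if/else under a simple
    lockstep model: each side that any lane takes is executed once,
    with the lanes on the other side masked off."""
    assert len(takes_branch) == WARP_SIZE
    cost = 0
    if any(takes_branch):       # at least one lane runs the 'then' side
        cost += cost_then
    if not all(takes_branch):   # at least one lane runs the 'else' side
        cost += cost_else
    return cost


uniform = [True] * WARP_SIZE                         # every lane agrees
divergent = [i % 2 == 0 for i in range(WARP_SIZE)]   # lanes alternate

print(warp_cost(uniform, 10, 10))    # 10
print(warp_cost(divergent, 10, 10))  # 20
```

Under this model a divergent if/else costs at most the sum of both sides. That is a real slowdown, but it is a bounded one, and it is the price of being able to express the branch at all.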
>Could you kindly share a source for this? Shader Execution Reordering (SER) is available for Ray tracing, but it is not a general-purpose feature that can be used in generic compute shaders.
https://docs.nvidia.com/cuda/cuda-programming-guide/03-advan...
My understanding is that this is fully transparent to the programmer; it's just more advanced scheduling for threads. SER is something different entirely.
Nvidia are a bit vague here, so you have to go digging into patents if you want more information on how it works.