This new implementation does all downsampling in a single compute shader
dispatch, removing a lot of complexity from the previous recursive
downsampling.
This is heavilly inspired by the Single-Pass-Downsampler from GPUOpen:
https://github.com/GPUOpen-Effects/FidelityFX-SPD
However I do not implement all the optimization bits as they require
vulkan (GL_KHR_shader_subgroup) and is not as versatile (it is only
for HiZ).
Timers inside renderdoc report ~0.4ms of saving on a 2048*1024 render for
the whole downsampling. Note that the previous implementation only
processed 6 mips where the new one processes 8 mips.
```
EEVEE ~1.0ms
EEVEE-Next ~0.6ms
```
Padding has been bumped to be of 128px for processing 8 mips.
A new debug option has been added (debug value 2) to validate the HiZ.