EEVEE-Next: Reduce longer compilation time #120100
It seems that Nvidia drivers have a harder time with our new shaders for some reason (~3x slower, from what I read).
The first cold start of EEVEE-Next also takes quite a bit longer than I would like (a few seconds without any feedback). We could display a black frame with a waiting message instead, but ideally it should be less than 3 seconds.
So I think we have no choice but to optimize it at least a little for the first release.
The first idea that comes to mind is to look at which part of the code is causing the most slowdown on the affected drivers and fix it.
My guess is that it is likely caused by aggressive loop unrolling, like it was on Metal + M1 before the recent fix. However, working around it is quite tricky, as there are no clear preprocessor directives for loop unrolling in GLSL and the extension that adds them looks largely unsupported.
The other approach would just be to reduce code size as much as possible. We could try to preprocess the GLSL string using our own obfuscator at compile time, but that looks unrealistic and the benefits are not quite clear.
Instead we should leverage SPIR-V. This can give several solutions:
- Use the `ARB_gl_spirv` extension. This avoids having the driver do the parsing and most of the conversion.
- Compile to SPIR-V (`shaderc`), then convert back to GLSL (`spirvcross`) to feed the driver GLSL that is already optimized, with dead code and comments removed. This can work on older implementations that do not support the `ARB_gl_spirv` extension.
- Use `shaderc` (and optionally `spirvcross`) to compile the shader; then it becomes easier to precompile the shaders in many threads without needing GL contexts.

The GLSL interface might need a bit of tweaking to be able to be injected into `shaderc`, but I am convinced this is worth the cost.

Note that all of these options should be profiled beforehand on a set of typical EEVEE-Next shaders to check what is the best way forward.
Note that this task is not proposing to ship precompiled SPIR-V shader sources.
The Vulkan backend would give us all of this, but the timeline for it to become the default does not align with the initial release nor the second release of EEVEE-Next.
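To make the `shaderc`/`spirvcross` round-trip idea above more concrete, here is a minimal sketch (not Blender code) of compiling GLSL to SPIR-V and converting it back to plain GLSL; the file name, shader stage, and GLSL version are placeholder assumptions:

```cpp
/* Sketch only: round-trip a fragment shader through shaderc and SPIRV-Cross so
 * the GL driver receives GLSL that has already been optimized, with dead code
 * and comments stripped. Assumes libshaderc and SPIRV-Cross are available. */
#include <shaderc/shaderc.hpp>
#include <spirv_glsl.hpp>
#include <iostream>
#include <string>
#include <vector>

static std::string optimize_glsl(const std::string &source)
{
  shaderc::Compiler compiler;
  shaderc::CompileOptions options;
  /* Let shaderc run its optimizer (dead-code elimination, etc.). */
  options.SetOptimizationLevel(shaderc_optimization_level_performance);

  shaderc::SpvCompilationResult spv = compiler.CompileGlslToSpv(
      source, shaderc_fragment_shader, "eevee_shader.frag", options);
  if (spv.GetCompilationStatus() != shaderc_compilation_status_success) {
    std::cerr << spv.GetErrorMessage() << "\n";
    return source; /* Fall back to the original source on error. */
  }

  /* Convert the SPIR-V back to GLSL that the GL driver can consume. */
  spirv_cross::CompilerGLSL glsl(std::vector<uint32_t>(spv.cbegin(), spv.cend()));
  spirv_cross::CompilerGLSL::Options gl_options;
  gl_options.version = 430; /* Placeholder target version. */
  gl_options.es = false;
  glsl.set_common_options(gl_options);
  return glsl.compile();
}
```

The appeal of this path is that the source handed to the driver has already been through a real optimizer, so the driver's own front end has less work to do.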
Multithreaded compilation in OpenGL
We have to change our compilation model to accommodate this.
The goal is to use the parallel shader compile extension, which does not require a separate context to work. However, we need to rework the interface with the GPU module for it to work.
https://forums.developer.nvidia.com/t/bugs-with-gl-arb-parallel-shader-compile/43715/8
https://www.reddit.com/r/opengl/comments/121j3q1/seeking_clarifications_on_multithreaded_shader/
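For reference, a minimal sketch of what the extension provides, assuming a loader such as GLEW exposes `KHR_parallel_shader_compile` (this is not Blender's GPU module code):

```cpp
/* Sketch: the extension adds only two things, a hint for how many driver
 * threads to use and a non-blocking completion query. Assumes a current GL
 * context and GLEW. */
#include <GL/glew.h>
#include <vector>

void hint_compiler_threads()
{
  if (GLEW_KHR_parallel_shader_compile) {
    /* Let the driver spin up its own worker threads for compile/link jobs. */
    glMaxShaderCompilerThreadsKHR(8);
  }
}

/* Returns the programs that are finished. Querying GL_LINK_STATUS on an
 * unfinished program would force a synchronous wait, so only the completion
 * status is checked here; the rest is left for a later poll. */
std::vector<GLuint> poll_finished(const std::vector<GLuint> &in_flight)
{
  std::vector<GLuint> done;
  for (GLuint program : in_flight) {
    GLint completed = GL_FALSE;
    glGetProgramiv(program, GL_COMPLETION_STATUS_KHR, &completed);
    if (completed == GL_TRUE) {
      done.push_back(program);
    }
  }
  return done;
}
```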
I'm leaving here my `GPU_shader_create_from_info` times for reference:

It looks like `light_eval` might be the worst offender.

After some googling, I've found some Nvidia directives that help quite a bit:
I haven't found any documentation related to these, though.
And I haven't checked runtime performance either.
Regardless, I agree that using SPIRV may be the best option moving forward.
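For illustration only: NVIDIA's GLSL front end accepts vendor-specific `#pragma optionNV(...)` hints, which may or may not be the directives referred to above; the exact spellings below are assumptions, the pragmas are undocumented, and other vendors' compilers ignore them:

```cpp
/* Illustrative sketch: prepend NVIDIA-specific compile hints to a GLSL string.
 * Whether these match the directives mentioned in the comment above is an
 * assumption. */
#include <string>

static std::string prepend_nv_compile_hints(const std::string &glsl_body)
{
  const char *hints =
      /* Ask the compiler not to unroll loops (the usual suspect for
       * pathological compile times). */
      "#pragma optionNV(unroll none)\n"
      /* Limiting inlining is reportedly possible too; the exact spelling below
       * is an assumption and should be verified against the driver in use. */
      "#pragma optionNV(inline 0)\n";
  /* In real code the pragmas would be spliced in after the #version line
   * rather than naively prepended, since #version must come first. */
  return std::string(hints) + glsl_body;
}
```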
@pragma37 This is only for the cold startup of the engine. Even if this is good to optimize (and is needed), the main friction point is the cost of material compilation.
Add that to your note!
Even with an RTX 4070 8GB I have the same issue 🤷♂️. It takes 4-8 seconds to run the first time! Less the second time, but it is still a long wait before anything shows up on screen.
Doesn't seem to be the case for all materials:
https://devtalk.blender.org/t/blender-4-2-eevee-next-feedback/31813/416
My guess is that purely deferred materials are probably OK, but forward and ShaderToRGB materials must be getting hit by the `light_eval` overhead as well.
I'm wondering if parts of that could be done without going to SPIR-V and friends. My impression (which I have not validated/checked myself) is that even in ye olde OpenGL it is possible to do "multi-threaded" shader creation/compilation without resorting to multiple OpenGL contexts. The pattern of function calls just has to be along the lines of "kick off the compilation and linking of every shader first, then query their results later",
instead of the current code flow, which is "for each shader: fully compile and link it before moving on to the next".
The current way of doing things (which is "for each shader: fully compile said shader") does not allow multi-threading even for other APIs (like Metal or Vulkan) that could do it. So maybe something like a "create many shaders" function would need to get added to the GPU backend, and the backend could decide how to best deal with it.
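A minimal sketch of that call ordering, using a hypothetical `GPU_shader_batch_create()` entry point (not an existing Blender function):

```cpp
/* Sketch of a batched "create many shaders" flow: submit every compile and
 * link first, only then query statuses/logs, so a driver that can compile on
 * worker threads is never forced to finish one program before the next one is
 * submitted. Assumes a current GL context and GLEW. */
#include <GL/glew.h>
#include <string>
#include <vector>

struct ShaderSource {
  std::string vert;
  std::string frag;
};

std::vector<GLuint> GPU_shader_batch_create(const std::vector<ShaderSource> &sources)
{
  std::vector<GLuint> programs;
  programs.reserve(sources.size());

  /* Pass 1: submit all work, query nothing. */
  for (const ShaderSource &src : sources) {
    GLuint vert = glCreateShader(GL_VERTEX_SHADER);
    const char *vs = src.vert.c_str();
    glShaderSource(vert, 1, &vs, nullptr);
    glCompileShader(vert);

    GLuint frag = glCreateShader(GL_FRAGMENT_SHADER);
    const char *fs = src.frag.c_str();
    glShaderSource(frag, 1, &fs, nullptr);
    glCompileShader(frag);

    GLuint program = glCreateProgram();
    glAttachShader(program, vert);
    glAttachShader(program, frag);
    glLinkProgram(program);
    programs.push_back(program);
  }

  /* Pass 2: now that everything is in flight, check results. Each query may
   * block, but the driver has had the whole batch to work on by this point. */
  for (GLuint program : programs) {
    GLint linked = GL_FALSE;
    glGetProgramiv(program, GL_LINK_STATUS, &linked);
    if (linked == GL_FALSE) {
      char log[1024];
      glGetProgramInfoLog(program, sizeof(log), nullptr, log);
      /* Real code would report `log` through the GPU module's logging. */
    }
  }
  return programs;
}
```

Combined with `KHR_parallel_shader_compile`, pass 2 could poll `GL_COMPLETION_STATUS_KHR` instead of blocking on the first unfinished program.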
I think it's worth giving @aras_p's suggestion a try, especially since we will have to do something along those lines regardless of the compilation method.
That said, that's not going to solve the issue with single shaders taking seconds to compile, which is pretty bad for the material editing UX.
For Nvidia, we could use the `GPU_material_optimization` system for disabling loop unrolling and inlining in the first compilation, but I'm not sure about the other GPU vendors (or if there's a problem with those in the first place).

I've just checked, and it looks like compile times took a big hit after #119713:
With loop unrolling and inlining disabled the difference is not that bad, though:
That compile times regression has been fixed by 2d3368f5bf.

There's been a pretty bad regression in compile times recently:
We may need to start testing compile times regularly, or test automatically if possible.
We can add `--debug-gpu-compile-shader` as a benchmark task so it will be tracked.

I've been taking a look at ways to improve shader compilation times.
As a recap, there are 3 main issues on the user side (mainly on Nvidia):
I don't think there's a silver bullet to fix all of these at once, and we may require multiple strategies with different levels of complexity:
Find the cause of the compile times slow-downs
AFAIK, there's no tooling for figuring out where the compiler is spending its time. We can infer it by checking differences between shaders, and sometimes we can find regressions, but fully fixing compile times this way would be extremely time consuming, and maybe not even fully possible.
The currently known main cause of compile times slow-downs is light and shadow evaluation: `eevee_deferred_light_frag.glsl`. (This only affects the internal static shader.)
Test results done after the last detected regression (#120329) (Cumulative):
2-step compilation
As mentioned before, disabling loop unrolling can heavily improve compile times, but it can also heavily degrade render performance.
A possible solution could be to use the Material optimization system to compile materials without loop unrolling first, deferring compilation of the optimized (unrolled) version.
The main issue with this approach is that even shader compilation on a separate OpenGL context freezes the Blender UI, so it makes things even worse in practice.
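For clarity, the intent of the 2-step scheme is roughly the following (hypothetical names; this mirrors the idea, not the actual `GPU_material_optimization` implementation):

```cpp
/* Sketch: draw with a quick-to-compile shader variant first, and swap in the
 * fully optimized variant whenever it becomes available. All names here
 * (MaterialShader, compile_program, ...) are placeholders. */
#include <atomic>
#include <string>

/* Placeholder standing in for the real compiler entry point. */
static unsigned int compile_program(const std::string & /*glsl*/, bool /*allow_unrolling*/)
{
  return 1; /* Would return a GPU program handle. */
}

struct MaterialShader {
  unsigned int fast_program = 0;      /* Compiled with unrolling/inlining off. */
  unsigned int optimized_program = 0; /* Fully optimized; may arrive much later. */
  std::atomic<bool> optimized_ready{false};

  unsigned int program_for_drawing() const
  {
    /* Never stall the viewport waiting for the optimized variant. */
    return optimized_ready.load() ? optimized_program : fast_program;
  }
};

void create_material_shader(MaterialShader &shader, const std::string &glsl)
{
  /* Step 1: cheap variant, so the material shows up quickly. */
  shader.fast_program = compile_program(glsl, /*allow_unrolling=*/false);

  /* Step 2: the expensive variant. In Blender this would go through a deferred
   * compilation queue rather than being compiled inline like here. */
  shader.optimized_program = compile_program(glsl, /*allow_unrolling=*/true);
  shader.optimized_ready.store(true);
}
```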
SPIRV
AFAIK, SPIRV compilation would be the equivalent of `glCompileShader`, while the main bottleneck comes from `glLinkProgram`, so optimizing shader compilation instead of program linking could yield minor improvements at best.

It's also worth noting that regular GLSL and SPIRV GLSL aren't fully compatible, so getting this to work, even just for testing, is far from trivial.
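One rough way to check that compile-vs-link split on a given driver is to time the two stages separately, keeping in mind that drivers are free to defer the real work until link (or even first draw). A sketch, assuming a current GL context:

```cpp
/* Sketch: measure glCompileShader and glLinkProgram separately. The split is
 * indicative only, since drivers may defer work between stages. */
#include <GL/glew.h>
#include <chrono>
#include <cstdio>

static double seconds_since(std::chrono::steady_clock::time_point t0)
{
  return std::chrono::duration<double>(std::chrono::steady_clock::now() - t0).count();
}

void time_compile_vs_link(const char *vert_src, const char *frag_src)
{
  GLuint vert = glCreateShader(GL_VERTEX_SHADER);
  GLuint frag = glCreateShader(GL_FRAGMENT_SHADER);
  glShaderSource(vert, 1, &vert_src, nullptr);
  glShaderSource(frag, 1, &frag_src, nullptr);

  auto t0 = std::chrono::steady_clock::now();
  glCompileShader(vert);
  glCompileShader(frag);
  /* Querying the status forces the compile to actually finish. */
  GLint status;
  glGetShaderiv(vert, GL_COMPILE_STATUS, &status);
  glGetShaderiv(frag, GL_COMPILE_STATUS, &status);
  double compile_time = seconds_since(t0);

  GLuint program = glCreateProgram();
  glAttachShader(program, vert);
  glAttachShader(program, frag);
  t0 = std::chrono::steady_clock::now();
  glLinkProgram(program);
  glGetProgramiv(program, GL_LINK_STATUS, &status);
  double link_time = seconds_since(t0);

  printf("compile: %.3f s, link: %.3f s\n", compile_time, link_time);
}
```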
Multithreaded Compilation
I've made a branch with `GL_ARB/KHR_parallel_shader_compile` support (#121093), and while compile times become around twice as fast, I wouldn't consider this good enough to mark the problem as solved.

It also has the problem of making the viewport even more unresponsive during material compilation.
I've made a standalone test (https://projects.blender.org/pragma37/test-parallel-shader-compilation) to verify I wasn't doing anything wrong on the Blender side; the performance difference was mostly on par.
I've managed to make it faster (from 2x up to 4x) after some tweaks, but I wasn't able to get the same results on Blender.
So it may be possible that the Blender implementation could be improved, but it may also be just a case of the standalone app being a much simpler context that the driver can manage more easily.
In any case, even a 4x improvement on a 24-thread CPU is far from optimal.
So far I got the best results (10x) by spawning a new process for each program compilation (the `spawner.py` script in the repo).

This actually manages to put my CPU at 100% usage, and it would also have the advantage of not blocking the Blender UI.
However, this would require:
Or maybe just use subprocesses and pipes?
While I think this might be the best way to go, it opens several cans of worms and it seems too risky to implement at the end of BCON1.
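One way a per-process compile worker could hand its result back without the parent recompiling anything is `ARB_get_program_binary`; whether the `spawner.py` test works this way is an assumption, and program binaries are only valid for the exact same driver/GPU, so this is a same-machine IPC trick rather than a cache format. A sketch:

```cpp
/* Sketch: child process compiles + links in its own GL context and streams the
 * driver-specific program binary over a pipe; the parent reloads it with
 * glProgramBinary(). Error handling and the actual pipe setup are omitted. */
#include <GL/glew.h>
#include <cstdio>
#include <vector>

/* Child side: writes <format><size><blob> to stdout (the pipe). */
bool emit_program_binary(GLuint program)
{
  GLint size = 0;
  glGetProgramiv(program, GL_PROGRAM_BINARY_LENGTH, &size);
  if (size <= 0) {
    return false;
  }
  std::vector<unsigned char> blob(size);
  GLenum format = 0;
  glGetProgramBinary(program, size, nullptr, &format, blob.data());

  fwrite(&format, sizeof(format), 1, stdout);
  fwrite(&size, sizeof(size), 1, stdout);
  fwrite(blob.data(), 1, blob.size(), stdout);
  return true;
}

/* Parent side: rebuilds a program object from the received bytes. */
GLuint load_program_binary(GLenum format, const std::vector<unsigned char> &blob)
{
  GLuint program = glCreateProgram();
  glProgramBinary(program, format, blob.data(), (GLsizei)blob.size());
  GLint linked = GL_FALSE;
  glGetProgramiv(program, GL_LINK_STATUS, &linked);
  return linked == GL_TRUE ? program : 0;
}
```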
Your conclusion for SPIR-V is in line with our previous conclusion. So yeah.
About process spawning: we used to do that for Cycles OpenCL in the past. It is far from ideal, as it is also bound by the amount of RAM the user has, and error logging is tricky and very error prone.
Just to mention: a thing I am considering for Vulkan is to add support for `VK_EXT_graphics_pipeline_library`. This allows compiling smaller shaders (vertex input, vertex, fragment, attachment out) and picking and choosing between them when creating a pipeline.
I've left this out of my previous report since I was getting inconsistent readings, but today I've measured again and I'm getting pretty consistent results.
Skip unnecessary material passes
EEVEE-Next requires more pass types per material, which results in compiling up to 3x more shaders than EEVEE Legacy.
However, many of these shaders are functionally equivalent to the default one (when vertex displacement and transparency are disabled), so we can detect these cases and skip the compilation. (#121137)
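A minimal sketch of the deduplication idea, with hypothetical names (this mirrors the intent behind #121137, not its actual implementation): reuse an already-compiled program whenever a pass variant generates the same code as one compiled before.

```cpp
/* Sketch: cache compiled programs keyed by the generated GLSL, so functionally
 * equivalent material pass variants are compiled only once. All names are
 * placeholders. */
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

struct MaterialPassKey {
  std::string codegen_src; /* Generated GLSL; real code would store a hash. */
  bool operator==(const MaterialPassKey &other) const
  {
    return codegen_src == other.codegen_src;
  }
};

struct MaterialPassKeyHash {
  size_t operator()(const MaterialPassKey &key) const
  {
    return std::hash<std::string>{}(key.codegen_src);
  }
};

class MaterialPassCache {
 public:
  /* Returns a previously compiled program when the generated source matches,
   * otherwise compiles and stores a new one. */
  uint32_t get_or_compile(const std::string &generated_glsl,
                          const std::function<uint32_t(const std::string &)> &compile_fn)
  {
    MaterialPassKey key{generated_glsl};
    auto it = cache_.find(key);
    if (it != cache_.end()) {
      return it->second;
    }
    uint32_t program = compile_fn(generated_glsl);
    cache_.emplace(std::move(key), program);
    return program;
  }

 private:
  std::unordered_map<MaterialPassKey, uint32_t, MaterialPassKeyHash> cache_;
};
```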
With the new (optional) parallel and non-blocking compilation, I consider this issue solved.