Previous approach was not clear enough and caused problems.
UBOs were taking slots and not release them after a shading group even
if this UBO was only for this Shading Group (notably the nodetree ubo,
since we now share the same GPUShader for identical trees).
So I choose to have a better defined approach:
- Standard texture and ubo calls are assured to be valid for the shgrp
they are called from.
- (new) Persistent texture and ubo calls are assured to be valid accross
shgrps unless the shader changes.
The standards calls are still valids for the next shgrp but are not assured
to be so if this new shgrp binds a new texture.
This enables some optimisations by not adding redundant texture and ubo
binds.
This is mostly to avoid re-compilation when using undo/redo operators.
This also has the benefit to reuse the same GPUShader for multiple materials using the same nodetree configuration.
The cache stores GPUPasses that already contains the shader code and a hash to test for matches.
We use refcounts to know when a GPUPass is not used anymore.
I had to move the GPUInput list from GPUPass to GPUMaterial because it's containing references to the material nodetree and cannot be reused.
A garbage collection is hardcoded to run every 60 seconds to free every unused GPUPass.
This leads to less lookups to the GWNShaderInterface and less uniform upload.
We still keep a legacy path so that Builtin uniforms can still work. We might restrict this path to Builtin shader only in the future.
Instead of creating a new instancing shading group without attrib, we now have instancing calls. The benefits is that they can be culled.
They can be used in conjuction with the standard and generate calls but shader must support it (which is generally not the case).
We store a pointer to the actual count so that the number can be tweaked between redraw.
This will makes multi layer rendering more efficient.
This is very efficient and add a pretty low overhead (0.1ms of drawing time for 10K objects passing through all tests, on my i3-4100M).
The like the rest of the DRWCallState, test is "cached" until the view matrices changes.
Refactor include:
- Removal of DRWInterface. (was useless)
- Split DRWCallHeader into a new struct DRWCallState that will be reused in the future.
- Use BLI_link_utils for APPEND/PREPEND.
- Creation of the new DRWManager struct type. This will enable us to create more than one manager in the future.
- Removal of some dead code.