Hashed alpha materials are now stable when moving the camera or when not using TAA.
They also converge to a noise-free image when using TAA. No more numerical imprecision.
There can still be situations where multiple overlapping transparent surfaces lead to residual noise.
Using a GL_RG16I texture for the hit coordinates tremendously increases the precision of the hit.
The sign of the integers is used to store 2 flags (has_hit and is_planar).
We no longer store the depth; it is retrieved from the depth buffer instead (increasing bandwidth by +8bit/px).
The PDF is stored in a separate GL_R16F texture.
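A minimal CPU-side sketch of how two flags can ride on the signs of a pair of 16-bit integers. The exact encoding used by the shader is not given here, so everything below is an assumption made for illustration: the names `HitData`, `hit_encode` and `hit_decode` are hypothetical, the sign of x is assumed to carry has_hit and the sign of y is assumed to carry is_planar, and coordinates are stored with a +1 bias so that a value of zero never occurs and the sign is always meaningful.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical CPU mirror of one GL_RG16I texel. */
typedef struct HitData {
  int16_t x, y;
} HitData;

/* Assumed layout (not taken from the actual shader):
 * - magnitude of x / y = pixel coordinate + 1 (never zero),
 * - x > 0 means has_hit, y < 0 means is_planar. */
static HitData hit_encode(uint16_t px, uint16_t py, bool has_hit, bool is_planar)
{
  /* The +1 bias must still fit in a signed 16-bit integer. */
  assert(px < INT16_MAX && py < INT16_MAX);
  HitData h;
  h.x = (int16_t)(px + 1);
  h.y = (int16_t)(py + 1);
  if (!has_hit) {
    h.x = (int16_t)-h.x;
  }
  if (is_planar) {
    h.y = (int16_t)-h.y;
  }
  return h;
}

static void hit_decode(HitData h, uint16_t *px, uint16_t *py, bool *has_hit, bool *is_planar)
{
  *has_hit = h.x > 0;
  *is_planar = h.y < 0;
  /* Strip the sign, then remove the +1 bias. */
  *px = (uint16_t)(abs(h.x) - 1);
  *py = (uint16_t)(abs(h.y) - 1);
}
```

The bias costs one representable value per axis but avoids the ambiguity of a signless zero.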
We remove the raycount for simplicity and to reduce compilation time (less branching in refraction shader).
Instead of creating non-temporary textures only at framebuffer creation, we create them and bind them if their pointer is NULL.
This should simplify the framebuffer creation code.
The reason for the crash is still a bit unclear, but on Windows with Intel HD Graphics 4000 it always happens when you enable `Use Nodes` or when you try to connect the Principled Shader node to the output without the `Subsurface Scattering` and `Subsurface Translucency` options enabled.
The issue was due to the fact that the instances don't have a "static" obmat that can be referenced and used as a uniform.
Solution: precompute the full matrix for each bone and pass it as instance data (these are copied into a buffer and can be discarded right away).
Note: this could be optimized further by making only one drawcall (shgroup) to draw all bone instances of one type (vs. one call per armature).
Tests on my system with ~1200 objects and 128 shadow casting lamps (current max) show a significant performance improvement (cache timing: 22ms -> 9ms).
With a baseline of 6ms with no shadow casting light, this gives a reduction of the overhead from 16ms to 3ms.
This removes pretty much all allocations during the cache phase, leading to a big improvement for scenes with a large number of lights & shadow casters.
The lamps storage has been replaced by a union to remove the need to free/allocate every frame (also reducing memory fragmentation).
We replaced the linked list system used to track shadow casters by a huge bitflag.
We gather the lights' shadow bounds as well as the shadow casters' AABBs during the cache populate phase and store them in big, cache-friendly arrays.
Then in the cache finish phase, it's easier to iterate over the lamps' shadow SphereBounds and test for intersection.
We use a double buffer system for the shadow caster arrays to detect deleted shadow casters.
Unfortunately, it seems that deleting an object triggers an update for all other objects (thus tagging most shadow casting lamps to update), defeating the purpose of this tracking.
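For illustration, a sphere bound can be tested against an AABB with the usual closest-point distance check. This is a sketch under assumptions: the struct layouts and the name `sphere_aabb_intersect` are hypothetical, and the exact test used by the cache finish phase may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layouts, chosen for the example only. */
typedef struct Sphere {
  float center[3];
  float radius;
} Sphere;

typedef struct AABB {
  float min[3];
  float max[3];
} AABB;

/* Returns true if the sphere touches or overlaps the box.
 * Accumulates the squared distance from the sphere center to the
 * closest point of the box, one axis at a time. */
static bool sphere_aabb_intersect(const Sphere *s, const AABB *b)
{
  assert(s->radius >= 0.0f);
  float dist_sq = 0.0f;
  for (int i = 0; i < 3; i++) {
    float d = 0.0f;
    if (s->center[i] < b->min[i]) {
      d = b->min[i] - s->center[i];
    }
    else if (s->center[i] > b->max[i]) {
      d = s->center[i] - b->max[i];
    }
    dist_sq += d * d;
  }
  return dist_sq <= s->radius * s->radius;
}
```

With both kinds of bounds stored in flat arrays, this test can run in a tight loop over every lamp and caster pair without chasing pointers.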
This needs further investigation.
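One way to picture the bitflag plus double buffer idea is sketched below. It is an assumed scheme, not the actual implementation: `CasterTracker`, `MAX_CASTERS` and the function names are hypothetical, and casters are identified by a plain index. Each frame tags the casters it sees in the front bitfield; a caster whose bit is set in the previous frame's bitfield but not in the current one is considered deleted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_CASTERS 1024 /* hypothetical capacity */
#define BITS_PER_WORD 32
#define NUM_WORDS (MAX_CASTERS / BITS_PER_WORD)

/* Two bitfields: one for the current frame, one for the previous frame. */
typedef struct CasterTracker {
  uint32_t flags[2][NUM_WORDS];
  int front; /* index of the bitfield being filled this frame */
} CasterTracker;

/* Swap buffers and clear the one that will be filled this frame. */
static void tracker_begin_frame(CasterTracker *t)
{
  t->front = 1 - t->front;
  memset(t->flags[t->front], 0, sizeof(t->flags[t->front]));
}

/* Mark a caster as present in the current frame. */
static void tracker_tag(CasterTracker *t, unsigned int id)
{
  assert(id < MAX_CASTERS);
  t->flags[t->front][id / BITS_PER_WORD] |= (uint32_t)1 << (id % BITS_PER_WORD);
}

static bool bit_test(const uint32_t *flags, unsigned int id)
{
  return ((flags[id / BITS_PER_WORD] >> (id % BITS_PER_WORD)) & 1u) != 0;
}

/* A caster seen last frame but not this frame is treated as deleted. */
static bool tracker_was_deleted(const CasterTracker *t, unsigned int id)
{
  assert(id < MAX_CASTERS);
  const uint32_t *prev = t->flags[1 - t->front];
  const uint32_t *curr = t->flags[t->front];
  return bit_test(prev, id) && !bit_test(curr, id);
}
```

Compared to a linked list, the bitfields need no per-frame allocation and can be cleared with a single `memset`.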
This modifies the selection code quite a bit, but it's for the better.
When using selection we use the same batching / instancing process, but we draw one element at a time, using an offset to the first element we want to draw and drawing only one element.
This results in much less memory allocation and better draw time.
This is a special memory manager that keeps memory blocks ready to send as vbo data.
Since we lose track of which memory block was used by each DRWShadingGroup, we need to redistribute them in the same order/size to avoid reallocating each frame.
This is why DRWInstanceDatas are sorted in a list for each different data size.
The result is less noisy ogl renders.
What this patch does:
- the draw loops get accumulated into the output buffer.
- disable TXAA persmat jittering in ogl render since ogl render already does that.
- make noise texture update correct across all draw loops. Previously it was reset between each FSAA sample.
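As a rough illustration of accumulating draw loops into an output buffer, here is a CPU-side running average. It is only a sketch: the function name `accumulate_sample` and the flat float buffer are assumptions, and the real accumulation may well happen on the GPU with a different blend.

```c
#include <assert.h>
#include <stddef.h>

/* Blend one new sample into the accumulation buffer so that, after
 * processing sample_index + 1 samples, accum holds their mean.
 * sample_index is 0 for the first draw loop. */
static void accumulate_sample(float *accum, const float *sample, size_t len,
                              unsigned int sample_index)
{
  assert(accum != NULL && sample != NULL);
  const float weight = 1.0f / (float)(sample_index + 1);
  for (size_t i = 0; i < len; i++) {
    /* Incremental mean: move accum towards the new sample by 1/n. */
    accum[i] += (sample[i] - accum[i]) * weight;
  }
}
```

Because each loop contributes equally, noise that differs between loops averages out, which is why the noise texture must keep advancing across all of them instead of being reset.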