This replace the previous square rings approach by sampling a disk the
footprint of the search area. This avoids sampling in areas in corners
where there isn't any weight.
This results in much less samples needed to acheive a good enough result.
The max number of samples for an area of 11x11 px is hard coded to 16 and
still gives good results with the final clamp.
The number of samples is adaptative and is scaled by the search area (max
CoC).
The High Quality Slight Defocus is not required anymore. If there is a
quality parameter to add, it would be sample count option. But I consider
the temporal stability enough for viewport work and render can still
render many full scene samples. So I don't see a need for that yet.
This adds anti-flicker pass to the slight focus region by using the
temporaly stable output from stabilize pass.
This also fixes the bilateral weight factor which was reversed.
This was caused by a missing synchronization.
The background gather pass was writting to the same occlusion texture
before the end of the scatter draw.
This implement a full TAA pass on the depth of field input.
An history buffer is kept for each view needing Depth of field.
This uses a swap with a `TextureFromPool` in order to not always 2
textures allocated.
Since this uses luma weighting without any input, the firefly parameter is
now obsolete and has been removed.
There is some tiny difference with the Film TAA so the implementation is
mostly copy pasted.
Also this implementation uses a LDS cache to speedup the TAA computations.
This moves the slight focus max in tile from the setup pass to the
resolve pass. This reduces complexity as there is no need for an extra
component in the tile textures.
This also avoids skipping any pixels and makes sure the local max matches
the dispatched local group size. This should make the resolve pass a little
bit faster.
This is a port of the previous implementation but using compute
shaders instead of using the raster pipeline for every steps.
Only the scatter passes is kept as a raster pass for obvious performance
reasons.
Many steps have been rewritten to take advantage of LDS which allows faster
and simpler downsampling and filtering for some passes.
A new stabilize phase has been separated from another setup pass in order
to improve it in the future with better stabilization.
The scatter pass shaders and pipeline also changed. We now use indirect
drawcall to draw quads using triangle strips primitives. This reduces
fragment shader invocation count & overdraw compared to a bounding
triangle. This also reduces the amount of vertex shader invocation
drastically to the bare minimum instead of having always 3 verts per
4 pixels (for each ground).
Replaces `DRW_shgroup_call_procedural_triangles_indirect`.
This makes the indirect drawing more flexible.
Not all primitive types are supported but it is just a matter of adding
them.
This loosens the current implementation a bit to only force optimal
display when editing on cage. It used to be any editing mode.
Brings GPU based subdivision closer to the CPU version.
Although eevee-next is disabled in Blender 3.3 there is an error that is
visible when compiling shaders using the shader builder.
This is because of an error in a preprocessing directive (defined should
be define).
Adds `rna_path.cc` and `RNA_path.h`.
`rna_access.c` is a quite big file, which makes it rather hard and
inconvenient to navigate. RNA path functions form a nicely coherent unit
that can stand well on it's own, so it makes sense to split them off to
mitigate the problem. Moreover, I was looking into refactoring the quite
convoluted/overloaded `rna_path_parse()`, and found that some C++
features may help greatly with that. So having that code compile in C++
would be helpful to attempt that.
Differential Revision: https://developer.blender.org/D15540
Reviewed by: Brecht Van Lommel, Campbell Barton, Bastien Montagne
This reverts commit e2c02655c7. It was already
reverted in the 3.2 branch, as it caused more serious issues than it solved.
Fixes T99805, T99323, T99296.
The new implementation leverage compute shaders to reduce the
number of passes and complexity.
The max blur amount is now detected automatically, replacing the property
in the render panel by a simple checkbox.
The dilation algorithm has also been rewritten from scratch into a 1 pass
algorithm that does the dilation more efficiently and more precisely.
Some differences with the old implementation can be observed in areas with
complex motion.
This removes the quirk of having to call the sync function for each new
render loop.
# Conflicts:
# source/blender/draw/engines/eevee_next/eevee_view.cc
Previously there was a special extraction process for "vertex colors"
that copied the color data to the GPU with a special format. Instead,
this patch replaces this with use of the generic attribute extraction.
This reduces the number of code paths, allowing easier optimization
in the future.
To make it possible to use the generic extraction system for attributes
but also assign aliases for use by shaders, some changes are necessary.
First, the GPU material attribute can now store whether it actually refers
to the default color attribute, rather than a specific name. This replaces
the hack to use `CD_MCOL` in the color attribute shader node. Second,
the extraction code checks the names against the default and active
names and assigns aliases if the request corresponds to a special active
attribute. Finally, support for byte color attributes was added to the
generic attribute extraction.
Differential Revision: https://developer.blender.org/D15205
The display depth is used to composite Gpencil and Overlays. For it to
be stable we bias it using the dFdx gradient functions. This makes
overlays like edit mode not flicker.
The previous approach to save the 1st center sample does not work anymore
since we jitter the projection matrix in a looping pattern when scene
is updated. So the center depth is only (almost) valid 1/8th of the times.
The biasing technique, even if not perfect, does the job of being stable.
This has a few cons:
- it makes the geometry below the ground plane unlike workbench engine.
- it makes overlays render over geometry at larger depth discontinuities.
This might make the image a bit blurier but it reduces the flickering of
shiny surfaces during animation.
This uses the technique described in "High Quality Temporal Supersampling"
by Brian Karis at Siggraph 2014 (Slide 45): Reduce the exponential factor
when the history is close the bounding box border.
A few offsets were missing.
Reminder that this does not change the actual render resolution but it
reduces the VRAM consumption of accumulation buffers.