Commit Graph

17 Commits

Author SHA1 Message Date
633117669b Realtime Compositor: Implement dilate erode node
This patch implements the dilate/erode node for the realtime compositor.

Differential Revision: https://developer.blender.org/D15790

Reviewed By: Clement Foucault
2022-09-02 14:47:39 +02:00
b43b62191c EEVEE-Next: HiZ Buffer: New implementation
This new implementation does all downsampling in a single compute shader
dispatch, removing a lot of complexity from the previous recursive
downsampling.

This is heavilly inspired by the Single-Pass-Downsampler from GPUOpen:
https://github.com/GPUOpen-Effects/FidelityFX-SPD
However I do not implement all the optimization bits as they require
vulkan (GL_KHR_shader_subgroup) and is not as versatile (it is only
for HiZ).

Timers inside renderdoc report ~0.4ms of saving on a 2048*1024 render for
the whole downsampling. Note that the previous implementation only
processed 6 mips where the new one processes 8 mips.
```
EEVEE ~1.0ms
EEVEE-Next ~0.6ms
```

Padding has been bumped to be of 128px for processing 8 mips.

A new debug option has been added (debug value 2) to validate the HiZ.
2022-08-15 18:36:19 +02:00
35762cea91 DRW: common_math_lib.glsl: Fix weighted_sum macro
This avoids issue when the macro is followed by another operator.
Example:
`float result = weighted_sum(a,b,c,d,w) * 5.0;`
2022-08-02 21:53:17 +02:00
Johannes J
f4d31fbf6c DRW: Fix signed/unsigned mismatches in shader code
Fix the following error messages on Blender startup
since commit 308a12ac64.

This commit fixes T98194.

Reviewed By: fclem
Differential Revision: https://developer.blender.org/D15007
2022-05-23 16:30:18 +02:00
80859a6cb2 GPU: Make nodetree GLSL Codegen render engine agnostic
This commit removes all EEVEE specific code from the `gpu_shader_material*.glsl`
files. It defines a clear interface to evaluate the closure nodes leaving
more flexibility to the render engine.

Some of the long standing workaround are fixed:
- bump mapping support is no longer duplicating a lot of node and is instead
  compiled into a function call.
- bump rewiring to Normal socket is no longer needed as we now use a global
  `g_data.N` for that.


Closure sampling with upstread weight eval is now supported if the engine needs
it.

This also makes all the material GLSL sources use `GPUSource` for better
debugging experience. The `GPUFunction` parsing now happens in `GPUSource`
creation.

The whole `GPUCodegen` now uses the `ShaderCreateInfo` and is object type
agnostic. Is has also been rewritten in C++.

This patch changes a view behavior for EEVEE:
- Mix shader node factor imput is now clamped.
- Tangent Vector displacement behavior is now matching cycles.
- The chosen BSDF used for SSR might change.
- Hair shading may have very small changes on very large hairs when using hair
  polygon stripes.
- ShaderToRGB node will remove any SSR and SSS form a shader.
- SSS radius input now is no longer a scaling factor but defines an average
  radius. The SSS kernel "shape" (radii) are still defined by the socket default
  values.

Appart from the listed changes no other regressions are expected.
2022-04-14 18:47:58 +02:00
Jason Fielder
19c793af35 Metal: Make GLSL shader source MSL compliant also
Metal shading language follows the C++ 14 standard and in some cases requires a greater level of explicitness than GLSL. There are also some small language differences:

- Explicit type-casts (C++ requirements)
- Explicit constant values (C++ requirements, e.g. floating point values using 0.0 instead of 0).
- Metal/OpenGL compatibility paths
- GLSL Function prototypes
- Explicit accessors for vector types when sampling textures.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D14378
2022-03-22 12:54:44 +01:00
472fc0c558 DRW: Port improvements in common_math_lib.glsl from eevee-rewrite
Mostly quality of life functions and macros.
2022-03-19 22:05:34 +01:00
Hallam Roberts
3da84d8b08 Cleanup: use M_PI_2 and M_PI_4 where possible
The constant M_PI_4 is added to GLSL to ensure it works there too.

Differential Revision: https://developer.blender.org/D14288
2022-03-11 18:27:58 +01:00
e1a1dc868b Cleanup: fix source comment typos
Contributed by luzpaz.

Differential Revision: https://developer.blender.org/D14307
2022-03-11 18:27:58 +01:00
c369382977 EEVEE: Fix NaN caused by ensure_valid_reflection()
This was caused by unsafe sqrt calls.

Fixes T86578 white artifacts in EEVEE

Reviewed By: brecht, dfelinto

Differential Revision: https://developer.blender.org/D11428
2021-05-28 18:16:56 +02:00
4cf60a2abf Fix T87369 EEVEE: Ambient Oclussion: Firefly caused by degenerated normal
This was caused by some sort of degenerated normals.
2021-04-20 16:07:45 +02:00
27cfc1ea11 Fix T87541 EEVEE: AO causes black outline around objects and NaN pixels
It seems the pow result is unstable on some implementations.

Also avoid undefined behavior by clamping aoFactor to strict positive values.
2021-04-20 10:59:07 +02:00
ba75ea8012 EEVEE: Use Fullscreen maxZBuffer instead of halfres
This removes the need for per mipmap scalling factor and trilinear interpolation
issues. We pad the texture so that all mipmaps have pixels in the next mip.

This simplifies the downsampling shader too.

This also change the SSR radiance buffer as well in the same fashion.
2021-03-08 17:25:16 +01:00
64d96f68d6 EEVEE: Ambient Occlusion: Refactor
- Fix noise/banding artifact on distant geometry.
- Fix overshadowing on un-occluded surfaces at grazing angle producing "fresnel"
  like shadowing. Some of it still appears but this is caused to the low number
  of horizons per pixel.
- Improve performance by using a fixed number of samples and fixing the
  sampling area size. A better sampling pattern is planned to recover
  the lost precision on large AO radius.
- Improved normal reconstruction for the AO pass.
- Improve Bent Normal reconstruction resulting in less faceted look on
  smoothed geometry.
- Add Thickness heuristic to avoid overshadowing of thin objects.
  Factor is currently hardcoded.
- Add bent normal support to Glossy reflections.
- Change Glossy occlusion to give less light leaks from lightprobes.
  It can overshadow on smooth surface but this should be mitigated by
  using SSR.
- Use Bent Normal for rough Glossy surfaces.
- Occlusion is now correctly evaluated for each BSDF. However this does make
  everything slower. This is mitigated by the fact the search is a lot faster
  than before.
2021-02-21 01:33:56 +01:00
7f7e683099 EEVEE: Refactor closure_lit_lib.glsl
This refactor was needed for some reasons:
- closure_lit_lib.glsl was unreadable and could not be easily extended to use new features.
- It was generating ~5K LOC for any shader. Slowing down compilation.
- Some calculations were incorrect and BSDF/Closure code had lots of workaround/hacks.

What this refactor does:
- Add some macros to define the light object loops / eval.
- Clear separation between each closures which now have separate files. Each closure implements the eval functions.
- Make principled BSDF a bit more correct in some cases (specular coloring, mix between glass and opaque).
- The BSDF term are applied outside of the eval function and on the whole lighting (was separated for lights before).
- Make light iteration last to avoid carrying more data than needed.
- Makes sure that all inputs are within correct ranges before evaluating the closures (use `safe_normalize` on normals).
- Making each BSDF isolated means that we might carry duplicated data (normals for instance) but this should be optimized by compilers.
- Makes Translucent BSDF its own closure type to avoid having to disable raytraced shadows using hacks.
- Separate transmission roughness is now working on Principled BSDF.
- Makes principled shader variations using constants. Removing a lot of duplicated code. This needed `const` keyword detection in `gpu_material_library.c`.
- SSR/SSS masking and data loading is a bit more consistent and defined outside of closure eval. The loading functions will act as accumulator if the lighting is not to be separated.
- SSR pass now do a full deferred lighting evaluation, including lights, in order to avoid interference with the closure eval code. However, it seems that the cost of having a global SSR toggle uniform is making the surface shader more expensive (which is already the case, by the way).
- Principle fully black specular tint now returns black instead of white.
- This fixed some artifact issue on my AMD computer on normal surfaces (which might have been some uninitialized variables).
- This touched the Ambient Occlusion because it needs to be evaluated for each closure. But to avoid the cost of this, we use another approach to just pass the result of the occlusion on interpolated normals and modify it using the bent normal for each Closure. This tends to reduce shadowing. I'm still looking into improving this but this is out of the scope of this patch.
- Performance might be a bit worse with this patch since it is more oriented towards code modularity. But not by a lot.

Render tests needs to be updated after this.

Reviewed By: jbakker
Differential Revision: https://developer.blender.org/D10390

# Conflicts:
#	source/blender/draw/engines/eevee/eevee_shaders.c
#	source/blender/draw/engines/eevee/shaders/common_utiltex_lib.glsl
#	source/blender/draw/intern/shaders/common_math_lib.glsl
2021-02-13 18:43:09 +01:00
000a340afa EEVEE: Depth of field: New implementation
This is a complete refactor over the old system. The goal was to increase quality
first and then have something more flexible and optimised.

|{F9603145} | {F9603142}|{F9603147}|

This fixes issues we had with the old system which were:
- Too much overdraw (low performance).
- Not enough precision in render targets (hugly color banding/drifting).
- Poor resolution near in-focus regions.
- Wrong support of orthographic views.
- Missing alpha support in viewport.
- Missing bokeh shape inversion on foreground field.
- Issues on some GPUs. (see T72489) (But I'm sure this one will have other issues as well heh...)
- Fix T81092

I chose Unreal's Diaphragm DOF as a reference / goal implementation.
It is well described in the presentation "A Life of a Bokeh" by Guillaume Abadie.
You can check about it here https://epicgames.ent.box.com/s/s86j70iamxvsuu6j35pilypficznec04

Along side the main implementation we provide a way to increase the quality by jittering the
camera position for each sample (the ones specified under the Sampling tab).

The jittering is dividing the actual post processing dof radius so that it fills the undersampling.
The user can still add more overblur to have a noiseless image, but reducing bokeh shape sharpness.

Effect of overblur (left without, right with):
| {F9603122} | {F9603123}|

The actual implementation differs a bit:
- Foreground gather implementation uses the same "ring binning" accumulator as background
  but uses a custom occlusion method. This gives the problem of inflating the foreground elements
  when they are over background or in-focus regions.
  This is was a hard decision but this was preferable to the other method that was giving poor
  opacity masks for foreground and had other more noticeable issues. Do note it is possible
  to improve this part in the future if a better alternative is found.
- Use occlusion texture for foreground. Presentation says it wasn't really needed for them.
- The TAA stabilisation pass is replace by a simple neighborhood clamping at the reduce copy
  stage for simplicity.
- We don't do a brute-force in-focus separate gather pass. Instead we just do the brute force
  pass during resolve. Using the separate pass could be a future optimization if needed but
  might give less precise results.
- We don't use compute shaders at all so shader branching might not be optimal. But performance
  is still way better than our previous implementation.
- We mainly rely on density change to fix all undersampling issues even for foreground (which
  is something the reference implementation is not doing strangely).

Remaining issues (not considered blocking for me):
- Slight defocus stability: Due to slight defocus bruteforce gather using the bare scene color,
  highlights are dilated and make convergence quite slow or imposible when using jittered DOF
  (or gives )
- ~~Slight defocus inflating: There seems to be a 1px inflation discontinuity of the slight focus
  convolution compared to the half resolution. This is not really noticeable if using jittered
  camera.~~ Fixed
- Foreground occlusion approximation is a bit glitchy and gives incorrect result if the
  a defocus foreground element overlaps a farther foreground element. Note that this is easily
  mitigated using the jittered camera position.
|{F9603114}|{F9603115}|{F9603116}|
- Foreground is inflating,  not revealing background. However this avoids some other bugs too
  as discussed previously. Also mitigated with jittered camera position.
|{F9603130}|{F9603129}|
- Sensor vertical fit is still broken (does not match cycles).
- Scattred bokeh shapes can be a bit strange at polygon vertices. This is due to the distance field
  stored in the Bokeh LUT which is not rounded at the edges. This is barely noticeable if the
  shape does not rotate.
- ~~Sampling pattern of the jittered camera position is suboptimal. Could try something like hammersley
  or poisson disc distribution.~~Used hexaweb sampling pattern which is not random but has better
stability and overall coverage.
- Very large bokeh (> 300 px) can exhibit undersampling artifact in gather pass and quite a bit of
  bleeding. But at this size it is preferable to use jittered camera position.

Codewise the changes are pretty much self contained and each pass are well documented.
However the whole pipeline is quite complex to understand from bird's-eye view.

Notes:
- There is the possibility of using arbitrary bokeh texture with this implementation.
  However implementation is a bit involved.
- Gathering max sample count is hardcoded to avoid to deal with shader variations. The actual
  max sample count is already quite high but samples are not evenly distributed due to the
  ring binning method.
- While this implementation does not need 32bit/channel textures to render correctly it does use
  many other textures so actual VRAM usage is higher than previous method for viewport but less
  for render. Textures are reused to avoid many allocations.
- Bokeh LUT computation is fast and done for each redraw because it can be animated. Also the
  texture can be shared with other viewport with different camera settings.
2021-02-12 22:35:52 +01:00
cd8f3c9ee7 DRW: Add glsl math libraries
Copied from eevee bsdf_common_lib.glsl
2020-07-15 19:51:55 +02:00