Implement GBuffer prepass and deferred lighting (lights only).
This decouple lighting from the material shaders making them lighter,
less expensive and faster to compile.
Trying to keep a nice data flow so we could potentially use the
subpass programable blending feature on tiled GPU arch.
Not everything is covered yet and #105880 is making the GBuffer layout
a bit awkward and not easily extendable.
Pull Request: blender/blender#105868
Optimized node graphs do not get cached and were
not correctly freed once their reference count reached
zero, due to being excluded from the GPUPass garbage
collection.
Also suppress Metal shader warnings, which are prevalent
during material optimization.
Authored by Apple: Michael Parkin-White
Pull Request: blender/blender#105795
This patch implements the bicubic interpolation option in the transform
nodes. The path merely reuse the code in the shader image texture and
adds bicubic variants to the domain realization shader.
Pull Request: blender/blender#105533
**What are push constants?**
Push constants is a way to quickly provide a small amount of uniform data to shaders.
It should be much quicker than UBOs but a huge limitation is the size of data - spec
requires 128 bytes to be available for a push constant range.
**What are the challenges with push constants?**
The challenge with push constants is that the limited available size. According to
the Vulkan spec each platform should at least have 128 bytes reserved for push
constants. Current Mesa/AMD drivers supports 256 bytes, but Mesa/Intel is only 128
bytes.
**What is our solution?**
Some shaders of Blender uses more than these boundaries. When more data is needed
push constants will not be used, but the shader will be patched to use an uniform
buffer instead. This mechanism will be part of the Vulkan backend and shader
developers should not see any difference on API level.
**Known limitations**
Current state of the vulkan backend does not track resources that are in the
command queue. This patch includes some test cases that identified this issue as
well. See #104771.
Pull Request #104880
GPencil 3D stroke rendering uses a geometry shader.
This is unsupported by the Metal backend, so implement
fix for this failing compilation by shifting geometry shader
logic into the Vertex shader for Metal backend.
Authored by Apple: Michael Parkin-White
Ref #96261
Pull Request #105143
Resolves issue with nearest filtering on UI Icons. Note that as
Metal does not support LOD bias as a parameter on a sampler
object, the original code has been modified to perform LOD
biasing at the shader level.
As GPU_SAMPLER_ICON is not widely used, it is more
efficient to apply directly to the affected shaders, rather
than workaround passing in the sampler LOD bias as a
separate value e.g. uniform or push constant.
Original PR feedback addressed to also refactor ICON
shaders to use consistent style for single and multi
Icon rendering.
Authored by Apple: Michael Parkin-White
Ref #96261
Pull Request #105145
Cycles fallback display shader previously did not use viewport.
This would crash or cause the display not to show when using
GPU backends other than OpenGL, if another display shader
was unavailable.
Now use ShaderCreateInfo for Cycles fallback display.
Authored by Apple: Michael Parkin-White
Ref #96261
Pull Request #104987
The stoke shader of grease pencil uses a geometry shader stage. Apple
devices don't support shaders with geometry shader stage. In the
OpenGL driver there was a pass-through implemented so it didn't fail.
When using the metal backend this needs to be solved more explicitly.
This change patches the grease pencil shader to support both the
backends supporting a geometry stage and those without.
Fixes#105059
Pull Request #105116
This patch adds initial support for compute shaders to
the vulkan backend. As the development is oriented to the test-
cases we have the implementation is limited to what is used there.
It has been validated that with this patch that the following test
cases are running as expected
- `GPUVulkanTest.gpu_shader_compute_vbo`
- `GPUVulkanTest.gpu_shader_compute_ibo`
- `GPUVulkanTest.gpu_shader_compute_ssbo`
- `GPUVulkanTest.gpu_storage_buffer_create_update_read`
- `GPUVulkanTest.gpu_shader_compute_2d`
This patch includes:
- Allocating VkBuffer on device.
- Uploading data from CPU to VkBuffer.
- Binding VkBuffer as SSBO to a compute shader.
- Execute compute shader and altering VkBuffer.
- Download the VkBuffer to CPU ram.
- Validate that it worked.
- Use device only vertex buffer as SSBO
- Use device only index buffer as SSBO
- Use device only image buffers
GHOST API has been changed as the original design was created before
we even had support for compute shaders in blender. The function
`GHOST_getVulkanBackbuffer` has been separated to retrieve the command
buffer without a backbuffer (`GHOST_getVulkanCommandBuffer`). In order
to do correct command buffer processing we needed access to the queue
owned by GHOST. This is returned as part of the `GHOST_getVulkanHandles`
function.
Open topics (not considered part of this patch)
- Memory barriers & command buffer encoding
- Indirect compute dispatching
- Rest of the test cases
- Data conversions when requested data format is different than on device.
- GPUVulkanTest.gpu_shader_compute_1d is supported on AMD devices.
NVIDIA doesn't seem to support 1d textures.
Pull-request: #104518
This replaces `GPU_SHADER_3D_POINT_FIXED_SIZE_VARYING_COLOR` by
GPU_SHADER_2D_POINT_UNIFORM_SIZE_UNIFORM_COLOR_OUTLINE_AA`.
None of the usage made sense to not use the AA shader.
Scale the point size to account for the rounded shape.
Keep using the 3 evaluations dF_branch method for the Displacement output.
The optimized 2 evaluations method used by node_bump is now on its own macro (dF_branch_incomplete).
displacement_bump modifies the normal that nodetree_exec uses, so even with a refactor it wouldn’t be possible to re-use the computation anyway.
Avoid computing the non-derivative height twice.
The height is now computed as part of the main function, while the height at x and y offsets are still computed on a separate function.
The differentials are now computed directly at node_bump.
Co-authored-by: Miguel Pozo <pragma37@gmail.com>
Pull Request #104595
Implements virtual shadow mapping for EEVEE-Next primary shadow solution.
This technique aims to deliver really high precision shadowing for many
lights while keeping a relatively low cost.
The technique works by splitting each shadows in tiles that are only
allocated & updated on demand by visible surfaces and volumes.
Local lights use cubemap projection with mipmap level of detail to adapt
the resolution to the receiver distance.
Sun lights use clipmap distribution or cascade distribution (depending on
which is better) for selecting the level of detail with the distance to
the camera.
Current maximum shadow precision for local light is about 1 pixel per 0.01
degrees.
For sun light, the maximum resolution is based on the camera far clip
distance which sets the most coarse clipmap.
## Limitation:
Alpha Blended surfaces might not get correct shadowing in some corner
casses. This is to be fixed in another commit.
While resolution is greatly increase, it is still finite. It is virtually
equivalent to one 8K shadow per shadow cube face and per clipmap level.
There is no filtering present for now.
## Parameters:
Shadow Pool Size: In bytes, amount of GPU memory to dedicate to the
shadow pool (is allocated per viewport).
Shadow Scaling: Scale the shadow resolution. Base resolution should
target subpixel accuracy (within the limitation of the technique).
Related to #93220
Related to #104472
Replace texelFetch calls with a texture point-sample rather than a textureRead call. This increases texture cache utilisation when mixing between sampled calls and reads. Bounds checking can also be removed from these functions, reducing instruction count and branch divergence, as the sampler routine handles range clamping.
Authored by Apple: Michael Parkin-White
Ref T96261
Depends on D16923
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D17021
The glsl files + create infos of shaders that are only used
during development where still being compiled into blender.
This isn't needed and shouldn't be included. This change will
only include them when WITH_GTEST and WITH_OPENGL_DRAW_TESTS are
enabled. All other cases those files will be skipped.
Due to shader global scope emulation via class interface, global constant arrays in shaders are allocated in per-thread shader local memory. To reduce memory pressure, placing these constant arrays inside function scope will ensure they only reside within device constant memory. This results in a tangible 1.5-2x performance uplift for the specific shaders affected.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D17089
This patch adds support for compilation and execution of GLSL compute shaders. This, along with a few systematic changes and fixes, enable realtime compositor functionality with the Metal backend on macOS. A number of GLSL source modifications have been made to add the required level of type explicitness, allowing all compilations to succeed.
GLSL Compute shader compilation follows a similar path to Vertex/Fragment translation, with added support for shader atomics, shared memory blocks and barriers.
Texture flags have also been updated to ensure correct read/write specification for textures used within the compositor pipeline. GPU command submission changes have also been made in the high level path, when Metal is used, to address command buffer time-outs caused by certain expensive compute shaders.
Authored by Apple: Michael Parkin-White
Ref T96261
Ref T99210
Reviewed By: fclem
Maniphest Tasks: T99210, T96261
Differential Revision: https://developer.blender.org/D16990
Recent changes in our GLSL libraries didn't compile on Vulkan. This
change reverts a compile directive that was removed, but required
in order to compile using the Vulkan backend.
Affecting render output preview when tone mapping is used, and EEVEE scenes such as Mr Elephant rendering in pink due to missing shaders.
Authored by Apple: Michael Parkin-White
Ref T103635
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T103635, T96261
Differential Revision: https://developer.blender.org/D16923
These allow the usage of `atomicMin` and `atomicMax` function with float
values as there is no overload for these types in GLSL.
This also allows signed 0 preservation.
- Matrix normalize overloads needs to have the vector normalize redefined.
- double underscore (anywhere in symbol name) are reserved.
- Some operation yield different result due to float imprecision. Increasing
epsilon threshold for the failing tests.
This implement most of the functions provided by the BLI math library.
This is part of the effort to unify GLSL and C++ syntax. Ref T103026.
This also adds some infrastructure to make it possible to run GLSL shader unit
test.
Some code already present in other libs is being copied to the new libs.
This patch does not make use of the new libs outside of the tests.
Note that the test is still crashing when using metal.
Although viewport compositor isn't supported yet on Apple deviced the
shaderlibs are compiled. The compositor shaders uses mat3 constructor
with a single vec3 and 6 floats. This constructor wasn't defined in
metal so it failed the compilation.
This patch adds the override to mat3.
* Add missing constructor for squared matrix types.
* Use using instead of define for typenames.
* Add `==, !=, unary-` operators
* Catch all functional style constructors inside the regex
Add required additional_info to shader using gpu_shader_point_uniform_color_aa_frag.glsl. This is the only other reference to the shader which requires the additional_info to be added.
{F14085803}
Reviewed By: #eevee_viewport, fclem
Maniphest Tasks: T103426
Differential Revision: https://developer.blender.org/D16853
Convert 3D point shader fragment color from sRGB space to framebuffer space to match 3D line shader.
Reviewed By: fclem
Maniphest Tasks: T97394
Differential Revision: https://developer.blender.org/D16831
A number of paths resulted in compilation errors after porting EEVEE to use Create-Info. Namely the fallback path for cubemap support. A number of other strict compilation failures regarding format comparison also required fixing when this mode is enabled.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T96261, T103313
Differential Revision: https://developer.blender.org/D16819
The Metal backend already supports output for the 6 clipping planes via gl_ClipDistances equivalent, however, functionality to toggle clipping plane enablement was missing.
Authored by Apple: Michael Parkin-White
Ref T96261
Depends on D16777
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D16813
Expands Color Mix nodes with new Exclusion mode.
Similar to Difference but produces less contrast.
Requested by Pierre Schiller @3D_director and
@OmarSquircleArt on twitter.
Differential Revision: https://developer.blender.org/D16543
Compile each static shader using shaderc to Spir-V binaries.
The main goal is to make sure that the GLSL created using ShaderCreateInfo and able to compile to Spir-V.
For the second stage a correct pipeline needs to be created and some shader would need more
adjustments (push constants size).
With this patch future changes to GLSL sources can already be checked against vulkan, without the
backend finished.
Mechanism has been tested using MacOS and MoltenVK. For other OS, we should finetune CMake
files to find the right location to shaderc.
```
************************************************************
*** Build Mon 12 Dec 2022 11:08:07 CET
************************************************************
Shader Test compilation result: 463 / 463 passed (skipped 118 for compatibility reasons)
OpenGL backend shader compilation succeeded.
Shader Test compilation result: 529 / 529 passed (skipped 52 for compatibility reasons)
Vulkan backend shader compilation succeeded.
```
Reviewed By: fclem
Maniphest Tasks: T102760
Differential Revision: https://developer.blender.org/D16610