Commit Graph

952 Commits

Author SHA1 Message Date
7fb1f060ff Vulkan: Initial Compute Shaders support
This patch adds initial support for compute shaders to
the vulkan backend. As the development is oriented to the test-
cases we have the implementation is limited to what is used there.

It has been validated that with this patch that the following test
cases are running as expected
- `GPUVulkanTest.gpu_shader_compute_vbo`
- `GPUVulkanTest.gpu_shader_compute_ibo`
- `GPUVulkanTest.gpu_shader_compute_ssbo`
- `GPUVulkanTest.gpu_storage_buffer_create_update_read`
- `GPUVulkanTest.gpu_shader_compute_2d`

This patch includes:
- Allocating VkBuffer on device.
- Uploading data from CPU to VkBuffer.
- Binding VkBuffer as SSBO to a compute shader.
- Execute compute shader and altering VkBuffer.
- Download the VkBuffer to CPU ram.
- Validate that it worked.
- Use device only vertex buffer as SSBO
- Use device only index buffer as SSBO
- Use device only image buffers

GHOST API has been changed as the original design was created before
we even had support for compute shaders in blender. The function
`GHOST_getVulkanBackbuffer` has been separated to retrieve the command
buffer without a backbuffer (`GHOST_getVulkanCommandBuffer`). In order
to do correct command buffer processing we needed access to the queue
owned by GHOST. This is returned as part of the `GHOST_getVulkanHandles`
function.

Open topics (not considered part of this patch)
- Memory barriers & command buffer encoding
- Indirect compute dispatching
- Rest of the test cases
- Data conversions when requested data format is different than on device.
- GPUVulkanTest.gpu_shader_compute_1d is supported on AMD devices.
  NVIDIA doesn't seem to support 1d textures.

Pull-request: #104518
2023-02-21 15:04:52 +01:00
83a6642045 Cleanup: GPU: Move eGPUKeyframeShapes to shader shared
Removes code duplication.
2023-02-13 11:22:38 +01:00
d165d6aa2a GPU: Remove GPU_SHADER_3D_POINT_FIXED_SIZE_VARYING_COLOR
This replaces `GPU_SHADER_3D_POINT_FIXED_SIZE_VARYING_COLOR` by
GPU_SHADER_2D_POINT_UNIFORM_SIZE_UNIFORM_COLOR_OUTLINE_AA`.

None of the usage made sense to not use the AA shader.
Scale the point size to account for the rounded shape.
2023-02-13 11:22:38 +01:00
77963ff778 Fix #104637: EEVEE Displacement regression after #104595
Keep using the 3 evaluations dF_branch method for the Displacement output.
The optimized 2 evaluations method used by node_bump is now on its own macro (dF_branch_incomplete).
displacement_bump modifies the normal that nodetree_exec uses, so even with a refactor it wouldn’t be possible to re-use the computation anyway.
2023-02-12 23:06:21 +01:00
91346755ce Cleanup: use '#' prefix for issues instead of 'T'
Match the convention from Gitea instead of Phabricator's T for tasks.
2023-02-12 14:56:05 +11:00
efabe81c91 Fix #103903: Bump Node performance regression
Avoid computing the non-derivative height twice.
The height is now computed as part of the main function, while the height at x and y offsets are still computed on a separate function.
The differentials are now computed directly at node_bump.

Co-authored-by: Miguel Pozo <pragma37@gmail.com>
Pull Request #104595
2023-02-10 21:06:53 +01:00
0381fe7bfe Cleanup: update username in code-comments: campbellbarton -> ideasman42
Gitea migration changed my username, update code-comments.
2023-02-09 11:33:48 +11:00
a0f5240089 EEVEE-Next: Virtual Shadow Map initial implementation
Implements virtual shadow mapping for EEVEE-Next primary shadow solution.
This technique aims to deliver really high precision shadowing for many
lights while keeping a relatively low cost.

The technique works by splitting each shadows in tiles that are only
allocated & updated on demand by visible surfaces and volumes.
Local lights use cubemap projection with mipmap level of detail to adapt
the resolution to the receiver distance.
Sun lights use clipmap distribution or cascade distribution (depending on
which is better) for selecting the level of detail with the distance to
the camera.

Current maximum shadow precision for local light is about 1 pixel per 0.01
degrees.
For sun light, the maximum resolution is based on the camera far clip
distance which sets the most coarse clipmap.

## Limitation:
Alpha Blended surfaces might not get correct shadowing in some corner
casses. This is to be fixed in another commit.
While resolution is greatly increase, it is still finite. It is virtually
equivalent to one 8K shadow per shadow cube face and per clipmap level.
There is no filtering present for now.

## Parameters:
Shadow Pool Size: In bytes, amount of GPU memory to dedicate to the
shadow pool (is allocated per viewport).
Shadow Scaling: Scale the shadow resolution. Base resolution should
target subpixel accuracy (within the limitation of the technique).

Related to #93220
Related to #104472
2023-02-08 21:18:44 +01:00
Jason Fielder
f3bd5458a3 Metal: Optimise shader texture cache usage and branch reduction via point sampling.
Replace texelFetch calls with a texture point-sample rather than a textureRead call. This increases texture cache utilisation when mixing between sampled calls and reads. Bounds checking can also be removed from these functions, reducing instruction count and branch divergence, as the sampler routine handles range clamping.

Authored by Apple: Michael Parkin-White
Ref T96261

Depends on D16923

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17021
2023-01-31 10:56:25 +01:00
6b8fa899ca Metal: Fix compilation of GLSL used in test cases.
Added imageStore for 1d textures.
2023-01-31 08:42:33 +01:00
87a923fdb6 GPU: Add SSBO binding test to new structure.
This test was added to test the shader info structure binding
information for SSBOs. It still used the legacy GLSL structure.
2023-01-30 19:07:33 +01:00
ce13d0d326 GPU: Only compile test shaders when test cases option is enabled.
The glsl files + create infos of shaders that are only used
during development where still being compiled into blender.

This isn't needed and shouldn't be included. This change will
only include them when WITH_GTEST and WITH_OPENGL_DRAW_TESTS are
enabled. All other cases those files will be skipped.
2023-01-30 15:46:12 +01:00
Jason Fielder
596ee79a9f Metal: Optimize shader local memory usage.
Due to shader global scope emulation via class interface, global constant arrays in shaders are allocated in per-thread shader local memory. To reduce memory pressure, placing these constant arrays inside function scope will ensure they only reside within device constant memory. This results in a tangible 1.5-2x performance uplift for the specific shaders affected.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D17089
2023-01-30 13:53:00 +01:00
0da74d3ee9 GPU: Fix GLSL compilation on OpenGL backend.
Regression in {d0f55aa671f7}
2023-01-30 12:23:02 +01:00
Jason Fielder
57552f52b2 Metal: Realtime compositor enablement with addition of GPU Compute.
This patch adds support for compilation and execution of GLSL compute shaders. This, along with a few systematic changes and fixes, enable realtime compositor functionality with the Metal backend on macOS. A number of GLSL source modifications have been made to add the required level of type explicitness, allowing all compilations to succeed.

GLSL Compute shader compilation follows a similar path to Vertex/Fragment translation, with added support for shader atomics, shared memory blocks and barriers.

Texture flags have also been updated to ensure correct read/write specification for textures used within the compositor pipeline. GPU command submission changes have also been made in the high level path, when Metal is used, to address command buffer time-outs caused by certain expensive compute shaders.

Authored by Apple: Michael Parkin-White

Ref T96261
Ref T99210

Reviewed By: fclem

Maniphest Tasks: T99210, T96261

Differential Revision: https://developer.blender.org/D16990
2023-01-30 11:06:56 +01:00
d0f55aa671 Vulkan: Fix GLSL compilation errors.
Recent changes in our GLSL libraries didn't compile on Vulkan. This
change reverts a compile directive that was removed, but required
in order to compile using the Vulkan backend.
2023-01-30 10:55:14 +01:00
Jason Fielder
0ba5954bb2 Fix T103635: Fix failing EEVEE and OCIO shader compilations in Metal.
Affecting render output preview when tone mapping is used, and EEVEE scenes such as Mr Elephant rendering in pink due to missing shaders.

Authored by Apple: Michael Parkin-White

Ref T103635
Ref T96261

Reviewed By: fclem

Maniphest Tasks: T103635, T96261

Differential Revision: https://developer.blender.org/D16923
2023-01-23 17:40:10 +01:00
17768b3df1 BLI: Math: Fix perspective matrix function
The port missed this one component that should have been left to 0.0.
2023-01-17 21:02:57 +01:00
28db19433e GPU: Math: Add floatBitsToOrderedInt and orderedIntBitsToFloat
These allow the usage of `atomicMin` and `atomicMax` function with float
values as there is no overload for these types in GLSL.

This also allows signed 0 preservation.
2023-01-16 00:39:57 +01:00
71ca339fe0 GPU: Fix math lib compilation and tests on AMD drivers
- Matrix normalize overloads needs to have the vector normalize redefined.
- double underscore (anywhere in symbol name) are reserved.
- Some operation yield different result due to float imprecision. Increasing
  epsilon threshold for the failing tests.
2023-01-09 20:41:16 +01:00
59ce3b8f6b Cleanup: doxygen comment use
Avoid '\note' outside of doxygen comments.
2023-01-09 18:56:17 +11:00
125b283589 GPU: Add Math libraries to GPU shaders code
This implement most of the functions provided by the BLI math library.
This is part of the effort to unify GLSL and C++ syntax. Ref T103026.

This also adds some infrastructure to make it possible to run GLSL shader unit
test.

Some code already present in other libs is being copied to the new libs.
This patch does not make use of the new libs outside of the tests.

Note that the test is still crashing when using metal.
2023-01-06 22:33:23 +01:00
d7598c8081 Metal: Fix crash when compiling compositor shaders.
Although viewport compositor isn't supported yet on Apple deviced the
shaderlibs are compiled. The compositor shaders uses mat3 constructor
with a single vec3 and 6 floats. This constructor wasn't defined in
metal so it failed the compilation.

This patch adds the override to mat3.
2023-01-05 08:30:41 +01:00
a6355a4542 Metal: Add support for all matrix types
* Add missing constructor for squared matrix types.
* Use using instead of define for typenames.
* Add `==, !=, unary-` operators
* Catch all functional style constructors inside the regex
2023-01-03 17:56:25 +01:00
Eimear Crotty
a521960fdd Fix T103426: Crash on Insert Keyframe
Add required additional_info to shader using gpu_shader_point_uniform_color_aa_frag.glsl. This is the only other reference to the shader which requires the additional_info to be added.

{F14085803}

Reviewed By: #eevee_viewport, fclem

Maniphest Tasks: T103426

Differential Revision: https://developer.blender.org/D16853
2022-12-23 11:39:49 +01:00
Eimear Crotty
a7ad2dea62 Fix T97394: single points are brighter than stroke in 3D viewport
Convert 3D point shader fragment color from sRGB space to framebuffer space to match 3D line shader.

Reviewed By: fclem

Maniphest Tasks: T97394

Differential Revision: https://developer.blender.org/D16831
2022-12-22 14:22:56 +01:00
Jason Fielder
dedca2c994 Fix T103313: Resolve shader compilation failures when enabling GPU workarounds.
A number of paths resulted in compilation errors after porting EEVEE to use Create-Info. Namely the fallback path for cubemap support. A number of other strict compilation failures regarding format comparison also required fixing when this mode is enabled.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261, T103313

Differential Revision: https://developer.blender.org/D16819
2022-12-20 14:22:09 +01:00
Jason Fielder
b3464fe152 Metal: Implement suppot for clip plane toggling via GPU_clip_distances.
The Metal backend already supports output for the 6 clipping planes via gl_ClipDistances equivalent, however, functionality to toggle clipping plane enablement was missing.

Authored by Apple: Michael Parkin-White

Ref T96261
Depends on D16777

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D16813
2022-12-20 14:16:23 +01:00
0cc573c8c4 Cleanup: white space around comment blocks 2022-12-17 15:58:30 +11:00
bea5fe6505 Nodes: Add Exclusion color mix mode
Expands Color Mix nodes with new Exclusion mode.

Similar to Difference but produces less contrast.

Requested by Pierre Schiller @3D_director and
@OmarSquircleArt on twitter.

Differential Revision: https://developer.blender.org/D16543
2022-12-16 15:42:41 +00:00
Jeroen Bakker
9c0d822737 GPU: Compile vulkan shaders to Spir-V binaries.
Compile each static shader using shaderc to Spir-V binaries.

The main goal is to make sure that the GLSL created using ShaderCreateInfo and able to compile to Spir-V.
For the second stage a correct pipeline needs to be created and some shader would need more
adjustments (push constants size).

With this patch future changes to GLSL sources can already be checked against vulkan, without the
backend finished.

Mechanism has been tested using MacOS and MoltenVK. For other OS, we should finetune CMake
files to find the right location to shaderc.

```
************************************************************
*** Build Mon 12 Dec 2022 11:08:07 CET
************************************************************
Shader Test compilation result: 463 / 463 passed (skipped 118 for compatibility reasons)
OpenGL backend shader compilation succeeded.
Shader Test compilation result: 529 / 529 passed (skipped 52 for compatibility reasons)
Vulkan backend shader compilation succeeded.
```

Reviewed By: fclem

Maniphest Tasks: T102760

Differential Revision: https://developer.blender.org/D16610
2022-12-12 12:25:22 +01:00
f898190362 GPU: Fix static compilation errors
- Missing explicit cast to `int` for bitwise operator.
- UBO struct member macro colision. Rename fixes it.
2022-12-09 00:10:14 +01:00
237fd48d01 Metal: Add back static compilation for no_geom shaders
These are metal specific shaders and needed to be tagged as such before
enabling static compilation.
2022-12-08 23:32:17 +01:00
Jason Fielder
d90a2b0ab7 Metal: GLSL compatibility.
Additional mat3 constructors added, global variable namespace collisions
for uniform and object color avoided via re-name.

Metal vertex format compatibility added for shaders wherein vertex data
goes through a double-conversion and cannot be implicitly converted during
Metal vertex assembly e.g. bitmasks passed directly as unsigned type in
shader interface for certain shader interfaces.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem
Differential Revision: https://developer.blender.org/D16433
2022-12-08 21:30:13 +01:00
6b8bb26c45 EEVEE: Port existing EEVEE shaders and generated materials to use GPUShaderCreateInfo.
Required by Metal backend for efficient shader compilation. EEVEE material
resource binding permutations now controlled via CreateInfo and selected
based on material options. Other existing CreateInfo's also modified to
ensure explicitness for depth-writing mode. Other missing bindings also
addressed to ensure full compliance with the Metal backend.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D16243
2022-12-08 21:12:19 +01:00
21eeedfc60 Fix T101562: GPU: Update fmod to match Cycles/OSL behavior
Implementation from:
https://github.com/AcademySoftwareFoundation/OpenShadingLanguage/blob/main/src/include/OSL/dual.h#L1283
https://github.com/OpenImageIO/oiio/blob/master/src/include/OpenImageIO/fmath.h#L1605

Reviewed By: fclem

Maniphest Tasks: T101562

Differential Revision: https://developer.blender.org/D16497
2022-11-15 13:04:47 +01:00
bc8b15f1a5 Realtime Compositor: Move shaders to compositor module
This patch moves the GLSL shaders and their infos to the compositor
module as decided by the EEVEE & Viewport module. This is a non
functional change.

Differential Revision: https://developer.blender.org/D16360

Reviewed By: Clement Foucault
2022-11-02 13:55:23 +02:00
3225bc2e7f GPU: Fix Metal GLSL compilation errors due to recent changes.
vec.st is legacy OpenGL and should not be used.
2022-10-21 14:58:27 +02:00
899d4ddbd0 Fix: Bokeh blur node flips its bokeh input
The bokeh blur node flipped its bokeh input due to the conceptual
difference between the search window space and the weights texture
space. This patches fixes that by inverting the weights texture to match
the search window.

The variable size option actually flips the bokeh input for the CPU
compositor. It is unclear if this is expected, so we deviate from that
behavior for now.
2022-10-21 14:05:19 +02:00
84825e4ed2 UI: Icon number indicator for data-blocks
Adds the possibility of having a little number on top of icons.

At the moment this is used for:
* Outliner
* Node Editor bread-crumb
* Node Group node header

For the outliner there is almost no functional change. It is mostly a refactor
to handle the indicators as part of the icon shader instead of the outliner
draw code. (note that this was already recently changed in a5d3b648e3).

The difference is that now we use rounded border rectangle instead of
circles, and we can go up to 999 elements.

So for the outliner this shows the number of collapsed elements of a
certain type (e.g., mesh objects inside a collapsed collection).

For the node editors is being used to show the use count for the data-block.
This is important for the node editor, so users know whether the node-group
they are editing (or are about to edit) is used elsewhere. This is
particularly important when the Node Options are hidden, which is the
default for node groups appended from the asset libraries.

---

Note: This can be easily enabled for ID templates which can then be part
of T84669. It just need to call UI_but_icon_indicator_number_set in the
function template_add_button_search_menu.

---

Special thanks Clément Foucault for the help figuring out the shader,
Julian Eisel for the help navigating the UI code, and Pablo Vazquez for
the collaboration in this design solution.

For images showing the result check the Differential Revision.
Differential Revision: https://developer.blender.org/D16284
2022-10-20 16:46:54 +02:00
58e25f11ae Realtime Compositor: Implement normalize node
This patch implements the normalize node for the realtime compositor.

Differential Revision: https://developer.blender.org/D16279

Reviewed By: Clement Foucault
2022-10-20 16:32:28 +02:00
7f2cd2d969 Realtime Compositor: Implement Tone Map node
This patch implements the tone map node for the realtime compositor
based on the two papers:

Reinhard, Erik, et al. "Photographic tone reproduction for digital
images." Proceedings of the 29th annual conference on Computer graphics
and interactive techniques. 2002.

Reinhard, Erik, and Kate Devlin. "Dynamic range reduction inspired by
photoreceptor physiology." IEEE transactions on visualization and
computer graphics 11.1 (2005): 13-24.

The original implementation should be revisited later due to apparent
incompatibilities with the reference papers, which makes the operation
less useful.

Differential Revision: https://developer.blender.org/D16306

Reviewed By: Clement Foucault
2022-10-20 15:02:41 +02:00
50943f5dc7 Realtime Compositor: Implement variable size bokeh blur
This patch implements the variable size blur option in the Bokeh Blur
node. The implementation is different from the CPU one in that it also
takes the Bounding Box input into account, which is ignored for some
reason for the CPU. Additionally, this implementation does not do the
optimization where the search radius is limited relative to the maximum
value in the size texture. That's because the cost of computing the
maximum is not worth it for most use cases.

The reference implementation does three unexpected things that are
replicated here nonetheless. First, the center bokeh weight is always
ignored and assumed to be 1. Second the size of the center pixel is
taken into account. Third, a unidimensional distance is used instead of
a 2D euclidean one. Those need to be considered independently.

Differential Revision: https://developer.blender.org/D16185

Reviewed By: Clement Foucault
2022-10-11 13:40:48 +02:00
0037411f55 Realtime Compositor: Implement parallel reduction
This patch implements generic parallel reduction for the realtime
compositor and implements the Levels operation as an example. This patch
also introduces the notion of a "Compositor Algorithm", which is a
reusable operation that can be used to construct other operations.

Differential Revision: https://developer.blender.org/D16184

Reviewed By: Clement Foucault
2022-10-11 13:22:52 +02:00
f61ff22967 Attribute Node: support accessing attributes of View Layer and Scene.
The attribute node already allows accessing attributes associated
with objects and meshes, which allows changing the behavior of the
same material between different objects or instances. The same idea
can be extended to an even more global level of layers and scenes.

Currently view layers provide an option to replace all materials
with a different one. However, since the same material will be applied
to all objects in the layer, varying the behavior between layers while
preserving distinct materials requires duplicating objects.

Providing access to properties of layers and scenes via the attribute
node enables making materials with built-in switches or settings that
can be controlled globally at the view layer level. This is probably
most useful for complex NPR shading and compositing. Like with objects,
the node can also access built-in scene properties, like render resolution
or FOV of the active camera. Lookup is also attempted in World, similar
to how the Object mode checks the Mesh datablock.

In Cycles this mode is implemented by replacing the attribute node with
the attribute value during sync, allowing constant folding to take the
values into account. This means however that materials that use this
feature have to be re-synced upon any changes to scene, world or camera.

The Eevee version uses a new uniform buffer containing a sorted array
mapping name hashes to values, with binary search lookup. The array
is limited to 512 entries, which is effectively limitless even
considering it is shared by all materials in the scene; it is also
just 16KB of memory so no point trying to optimize further.
The buffer has to be rebuilt when new attributes are detected in a
material, so the draw engine keeps a table of recently seen attribute
names to minimize the chance of extra rebuilds mid-draw.

Differential Revision: https://developer.blender.org/D15941
2022-10-08 16:43:18 +03:00
87d737cd79 Cleanup: spelling in code comments 2022-10-06 12:13:00 +11:00
af51e4b41c Cleanup: fix source comment/documentation typos
Contributed by luzpaz.

Differential Revision: https://developer.blender.org/D16071
2022-10-03 21:59:31 +02:00
2356afc7af Cleanup: format, ensure trailing newlines 2022-09-26 21:52:40 +10:00
1b5c94630e GPU: Disable static compilation for geometry shaders workaround
These shaders are only supported by the Metal backed.
Regression introduced by 1514e1a5b7
2022-09-23 23:27:42 +02:00
Jason Fielder
18b45aabf9 Metal: GLSL shader compatibility changes for global uniform and interface name collision.
For the Metal shader translation support for shader-global uniforms are remapped via macro's, and in such cases where a uniform name matches a vertex attribute name, compilation errors will occur due to this injected syntax being incompatible with the immediate code.

Also adding source-level function interface alternatives where sized arrays are passed in. These are not supported directly in Metal shading language and are instead handled as pointers. These pointers require explicit address-space qualifiers in some cases, if device/constant address space memory is passed into the function.

Ref T96261

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D15898
2022-09-22 17:53:56 +02:00