Commit Graph

380 Commits

Author SHA1 Message Date
9b129e5533 Fix #104347: Loop Cut Tool becomes impressive with GPU Subdivision
When updating a mesh, the GPU Subdivision code makes calls to
`GPU_indexbuf_bind_as_ssbo()`.

This may cause the current VAO index buffer to change due to calls from
`glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo_id_)` in
`GPU_indexbuf_bind_as_ssbo()`.

The solution is to unbind the VAO (by calling `glBindVertexArray(0)`)
before creating the index buffer IBO.

Co-authored-by: Germano Cavalcante <grmncv@gmail.com>
Pull Request #104873
2023-02-17 10:53:39 -03:00
02c3889b1c Cleanup: quiet clang warnings
Quiet unused argument, shadow, array-bounds & range-loop-bind-reference
warnings.
2023-02-15 13:26:54 +11:00
7b9d1cb51f Eevee: GPU Material node graph optimization.
Certain material node graphs can be very expensive to run. This feature aims to produce secondary GPUPass shaders within a GPUMaterial which provide optimal runtime performance. Such optimizations include baking constant data into the shader source directly, allowing the compiler to propogate constants and perform aggressive optimization upfront.

As optimizations can result in reduction of shader editor and animation interactivity, optimized pass generation and compilation is deferred until all outstanding compilations have completed. Optimization is also delayed util a material has remained unmodified for a set period of time, to reduce excessive compilation. The original variant of the material shader is kept to maintain interactivity.

Also adding a new concept to gpu::Shader allowing assignment of a parent shader from which a shader can pull PSO descriptors and any required metadata for asynchronous shader cache warming. This enables fully asynchronous shader optimization, without runtime hitching, while also reducing runtime hitching for standard materials, by using PSO descriptors from default materials, ahead of rendering.

Further shader graph optimizations are likely also possible with this architecture. Certain scenes, such as Wanderer benefit significantly. Viewport performance for this scene is 2-3x faster on Apple-silicon based GPUs.

Authored by Apple: Michael Parkin-White

Ref T96261
Pull Request #104536
2023-02-14 21:51:03 +01:00
173a8f4ac9 GPU: Removes GPU_shader_get_builtin_ssbo
Simplify the API. Use hardcoded ssbo location instead.
2023-02-13 11:22:38 +01:00
158f87203e Cleanup: GPUShader: Reorganize GPU_shader.h to separate depecated API
This avoid confusion to what to use nowadays.
Also improves documentation.
2023-02-13 11:22:38 +01:00
f828ecf4ba GPU: Use same read back API as SSBOs
The GPU module has 2 different styles when reading back data from
GPU buffers. The SSBOs used a memcpy to copy the data to a
pre-allocated buffer. IndexBuf/VertBuf gave back a driver/platform
controlled pointer to the memory.

Readback is done for test cases returning mapped pointers is not safe.
For this reason we settled on using the same approach as the SSBO.
Copy the data to a caller pre-allocated buffer.

Reason why this API is currently changed is that the Vulkan API is more
strict on mapping/unmapping buffers that can lead to potential issues
down the road.

Pull Request #104571
2023-02-13 08:34:19 +01:00
91346755ce Cleanup: use '#' prefix for issues instead of 'T'
Match the convention from Gitea instead of Phabricator's T for tasks.
2023-02-12 14:56:05 +11:00
945d108ab8 GPU: Fix uninitialized variable which created asan warning / errors
This wasn't really a problem since these are set on first bind or creation.
The test `if (enabled_srgb && srgb_) {` was depending on that variable that
in certain case, might not have been initialized (because of lazy init).
2023-01-16 00:39:57 +01:00
e39ca9d1e3 Cleanup: use function style casts for integer types in C++
Also remove redundant parenthesis.
2023-01-03 11:12:51 +11:00
2652029f3b Cleanup: Clang tidy
Addressed almost all warnings except for replacing defines
with enums and variable assignment in if statements.
2022-12-29 12:01:32 -05:00
Hallam Roberts
a501a2dbff Images: add mirror extension type
This adds a new mirror image extension type for shaders and
geometry nodes (next to the existing repeat, extend and clip
options).

See D16432 for a more detailed explanation of `wrap_mirror`.

This also adds a new sampler flag `GPU_SAMPLER_MIRROR_REPEAT`.
It acts as a modifier to `GPU_SAMPLER_REPEAT`, so any `REPEAT`
flag must be set for the `MIRROR` flag to have an effect.

Differential Revision: https://developer.blender.org/D16432
2022-12-14 19:27:29 +01:00
6b8bb26c45 EEVEE: Port existing EEVEE shaders and generated materials to use GPUShaderCreateInfo.
Required by Metal backend for efficient shader compilation. EEVEE material
resource binding permutations now controlled via CreateInfo and selected
based on material options. Other existing CreateInfo's also modified to
ensure explicitness for depth-writing mode. Other missing bindings also
addressed to ensure full compliance with the Metal backend.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D16243
2022-12-08 21:12:19 +01:00
009f7de619 Cleanup: use better matching integer types for graphics interop handle
Ref D16042
2022-12-01 15:55:48 +01:00
Jason Fielder
b132e3b3ce Cycles: use GPU module for viewport display
To make GPU backends other than OpenGL work. Adds required pixel buffer and
fence objects to GPU module.

Authored by Apple: Michael Parkin-White

Ref T96261
Ref T92212

Reviewed By: fclem, brecht

Differential Revision: https://developer.blender.org/D16042
2022-12-01 15:55:48 +01:00
a9a5f7ce17 GPU: UniformBuf: Add GPU_uniformbuf_clear_to_zero
This allows clearing the entire buffer directly on GPU.
2022-11-15 20:16:25 +01:00
ff40b90f99 GPU: UniformBuffer: Add possibility to bind as SSBO
This way UBOs can be modified directly in shader just like VBOs and IBOs.
2022-11-15 14:41:38 +01:00
5db84d0ef1 GPU: State: Add GPU_BARRIER_UNIFORM
This allows to synchronise uniform buffer writes from compute shader
when an UBO is bound as SSBO.
2022-11-15 14:41:38 +01:00
5be3a68f58 Fix hair/curve drawing artifacts when workarounds are enabled.
Regression introduced by {rB601995c3b86986cf8f8e5b6e5a65bcfa7f8f2e32}.
Noticed by Heist project as they render final frames with workarounds enabled.

The mentioned patch introduces attaching VBO as textures. This used to be
done by the caller. The mechanism used a different order hence the VBO
could still be unbound when using. This cannot be solved inside the new
mechanism clearly so this patch will just bind when the buffer isn't bound
just before the drawing command is sent to the GPU driver.
2022-11-15 08:07:01 +01:00
66a166d236 GL: Make restart index consistent on older implementation
This prevents weird quirks where the implementation might skip
the ushort max index even in non-indexed draws.
2022-10-20 16:07:14 +02:00
ff157d7eba Fix incorrect shader state after shader interface creation
Use store-current-and-restore-previous OpenGL program in the OpenGL
Shader Interface. This is a better fix for the initial error, which
additionally solves interface artifacts when opening non-default
startyp files on macOS with AMD GPU.
2022-10-20 10:09:32 +02:00
3ac2f15a04 GL: Fix incorrect shader state after shader interface creation
The interface needs to bind the shaders for some parameter setup.
This program change wasn't reflected in the GPUContext.
This was then conflicting with the next shader bind if the next shader was
the same as the shader bound before the interface creation.

Setting the state to the correct shader ensures a rebind if needed.

Fix T101792 New hair curves do not render properly first time in EEVEE with motion blur enabled
2022-10-19 14:40:42 +02:00
331f850056 Cleanup: redundant parenthesis 2022-10-07 22:55:03 +11:00
97746129d5 Cleanup: replace UNUSED macro with commented args in C++ code
This is the conventional way of dealing with unused arguments in C++,
since it works on all compilers.

Regex find and replace: `UNUSED\((\w+)\)` -> `/*$1*/`
2022-10-03 17:38:16 -05:00
333e41eac6 Cleanup: replace C-style casts with functional casts for numeric types
Use function style casts in C++ headers & source.
2022-09-26 17:58:36 +10:00
0210c4df17 GPU: Disable SSBO support from commandline.
In heavy scenes containing many hairs/curves and volumetrics
using SSBO can overwrite the binding information of the volumetric
resolve shader. This has been detected during project Heist and is
only reproducable on NVIDIA platform.

This patch adds an debug option to disable SSBOs from the command
line to replace the --debug-gpu-force-workarounds that has been
used as a workaround on the render farm. Reason is that
force workarounds will also add other limitations as well (number
of texture binds for example)
2022-09-26 09:41:50 +02:00
f68cfd6bb0 Cleanup: replace C-style casts with functional casts for numeric types 2022-09-25 20:17:08 +10:00
c7b247a118 Cleanup: replace static_casts with functional casts for numeric types 2022-09-25 18:31:10 +10:00
c9e35c2ced Cleanup: remove redundant double parenthesis 2022-09-25 15:34:32 +10:00
cda2dc721d Cleanup: compiler warnings 2022-09-23 14:33:40 +10:00
Thomas Dinges
697b447c20 Metal: MTLContext implementation and immediate mode rendering support.
MTLContext provides functionality for command encoding, binding management and graphics device management. MTLImmediate provides simple draw enablement with dynamically encoded data. These draws utilise temporary scratch buffer memory to provide minimal bandwidth overhead during workload submission.

This patch also contains empty placeholders for MTLBatch and MTLDrawList to enable testing of first pixels on-screen without failure.

The Metal API also requires access to the GHOST_Context to ensure the same pre-initialized Metal GPU device is used by the viewport. Given the explicit nature of Metal, explicit control is also needed over presentation, to ensure correct work scheduling and rendering pipeline state.

Authored by Apple: Michael Parkin-White

Ref T96261

(The diff is based on 043f59cb3b)

Reviewed By: fclem

Differential Revision: https://developer.blender.org/D15953
2022-09-22 17:32:43 +02:00
1810b1e4c8 GL: Framebuffer: Add support for empty framebuffer (no attachments)
This allows to reduce the memory footprint of very large framebuffers if
there is no need for any attachment.
2022-09-17 10:17:47 +02:00
397e5c5526 GL: Require a minimum of 8 ssbo slot per shader stage
Otherwise we disable this feature. This is because some driver
does not support any vertex storage buffers but still support
8 ssbo in fragment shader.
2022-09-06 11:12:38 +02:00
Clément Foucault
65ad36f5fd DRWManager: New implementation.
This is a new implementation of the draw manager using modern
rendering practices and GPU driven culling.

This only ports features that are not considered deprecated or to be
removed.

The old DRW API is kept working along side this new one, and does not
interfeer with it. However this needed some more hacking inside the
draw_view_lib.glsl. At least the create info are well separated.

The reviewer might start by looking at `draw_pass_test.cc` to see the
API in usage.

Important files are `draw_pass.hh`, `draw_command.hh`,
`draw_command_shared.hh`.

In a nutshell (for a developper used to old DRW API):
- `DRWShadingGroups` are replaced by `Pass<T>::Sub`.
- Contrary to DRWShadingGroups, all commands recorded inside a pass or
   sub-pass (even binds / push_constant / uniforms) will be executed in order.
- All memory is managed per object (except for Sub-Pass which are managed
   by their parent pass) and not from draw manager pools. So passes "can"
   potentially be recorded once and submitted multiple time (but this is
   not really encouraged for now). The only implicit link is between resource
   lifetime and `ResourceHandles`
- Sub passes can be any level deep.
- IMPORTANT: All state propagate from sub pass to subpass. There is no
   state stack concept anymore. Ensure the correct render state is set before
   drawing anything using `Pass::state_set()`.
- The drawcalls now needs a `ResourceHandle` instead of an `Object *`.
   This is to remove any implicit dependency between `Pass` and `Manager`.
   This was a huge problem in old implementation since the manager did not
   know what to pull from the object. Now it is explicitly requested by the
   engine.
- The pases need to be submitted to a `draw::Manager` instance which can
   be retrieved using `DRW_manager_get()` (for now).

Internally:
- All object data are stored in contiguous storage buffers. Removing a lot
   of complexity in the pass submission.
- Draw calls are sorted and visibility tested on GPU. Making more modern
   culling and better instancing usage possible in the future.
- Unit Tests have been added for regression testing and avoid most API
   breakage.
- `draw::View` now contains culling data for all objects in the scene
   allowing caching for multiple views.
- Bounding box and sphere final setup is moved to GPU.
- Some global resources locations have been hardcoded to reduce complexity.

What is missing:
- ~~Workaround for lack of gl_BaseInstanceARB.~~ Done
- ~~Object Uniform Attributes.~~ Done (Not in this patch)
- Workaround for hardware supporting a maximum of 8 SSBO.

Reviewed By: jbakker

Differential Revision: https://developer.blender.org/D15817
2022-09-02 18:45:14 +02:00
Thomas Dinges
cc8ea6ac67 Metal: MTLShader and MTLShaderGenerator implementation.
Full support for translation and compilation of shaders in Metal, using
GPUShaderCreateInfo. Includes render pipeline state creation and management,
enabling all standard GPU viewport rendering features in Metal.

Authored by Apple: Michael Parkin-White, Marco Giordano

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D15563
2022-09-01 22:28:40 +02:00
Jason Fielder
ac07fb38a1 Metal: Minimum per-vertex stride, 3D texture size + Transform feedback GPUCapabilities expansion.
- Adding in compatibility paths to support minimum per-vertex strides for vertex formats. OpenGL supports a minimum stride of 1 byte, in Metal, this minimum stride is 4 bytes. Meaing a vertex format must be atleast 4-bytes in size.

- Replacing transform feedback compile-time check to conditional look-up, given TF is supported on macOS with Metal.

- 3D texture size safety check added as a general capability, rather than being in the gl backend only. Also required for Metal.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D14510
2022-09-01 22:18:02 +02:00
Jason Fielder
5f4409b02e Metal: MTLIndexBuf class implementation.
Implementation also contains a number of optimisations and feature enablements specific to the Metal API and Apple Silicon GPUs.

Ref T96261

Reviewed By: fclem

Maniphest Tasks: T96261

Differential Revision: https://developer.blender.org/D15369
2022-09-01 21:45:12 +02:00
Germano Cavalcante
6269d66da2 PyGPU: GPUShader: implementation of 'attrs_info_get' method
With the new `attrs_info_get` method, we can get information about
the attributes used in a `GPUShader` and thus have more freedom in the
automatic creation of `GPUVertFormat`s

Reviewed By: fclem, campbellbarton

Differential Revision: https://developer.blender.org/D15764
2022-09-01 08:25:55 -03:00
5a60535a20 GPUCapabilities: Add GPU_shader_draw_parameters_support
This checks for the availability of `gl_BaseInstanceARB` or equivalent.

Disabling for any workaround that disables shader_image_load_store_support
as a preventive measure.
2022-08-31 11:35:18 +02:00
4944167dee GPUBatch: Add multi_draw_indirect capability and indirect buffer offset
This is for completion and to be used by the new draw manager.
2022-08-30 22:26:11 +02:00
fe195f51d1 GPUStorageBuf: Add read() function to readback buffer data to host
This is not expected to be fast. This is only for inspecting the content
of the buffer for debugging or validation purpose.
2022-08-30 22:26:11 +02:00
28750bcf7e Cleanup: replace NULL with nullptr for C++ files 2022-08-28 20:52:28 +10:00
Loren Osborn
db46251209 Fix ubsan warnings about indexing into null pointers
Ref T99382

Differential Revision: https://developer.blender.org/D15390
2022-08-19 16:27:22 +02:00
e52fd904e8 Merge branch 'blender-v3.3-release' 2022-08-17 15:45:25 +10:00
95fd163074 Cleanup: spelling in comments 2022-08-17 15:43:17 +10:00
a296b8f694 GPU: replace GLEW with libepoxy
With libepoxy we can choose between EGL and GLX at runtime, as well as
dynamically open EGL and GLX libraries without linking to them.

This will make it possible to build with Wayland, EGL, GLVND support while
still running on systems that only have X11, GLX and libGL. It also paves
the way for headless rendering through EGL.

libepoxy is a new library dependency, and is included in the precompiled
libraries. GLEW is no longer a dependency, and WITH_SYSTEM_GLEW was removed.

Includes contributions by Brecht Van Lommel, Ray Molenkamp, Campbell Barton
and Sergey Sharybin.

Ref T76428

Differential Revision: https://developer.blender.org/D15291
2022-08-15 16:10:29 +02:00
f6639cc4fc DRW: DebugDraw: Port module to C++ and add GPU capabilities
This is a complete rewrite of the draw debug drawing module in C++.
It uses `GPUStorageBuf` to store the data to be drawn and use indirect
drawing. This makes it easier to do a mirror API for GPU shaders.

The C++ API class is exposed through `draw_debug.hh` and should be used
when possible in new code.

However, the debug drawing will not work for platform not yet supporting
`GPUStorageBuf`. Also keep in mind that this module must only be used
in debug build for performance and compatibility reasons.
2022-08-09 15:45:46 +02:00
2e4727e123 GL: Fix error messages missing end of line 2022-08-09 15:45:46 +02:00
04d43e8dbb GL: Remove lingering image binds
This updates image bind tracking to be the same as texture binds.
Adding a new bind flag to avoid conflict when the texture is used in
both slots.
Fixes a gl error in glBindImageTextures about invalid image binds.
2022-08-02 21:53:17 +02:00
d629402054 GL: Compute: Fix indirect compute barrier and unbind
This path is not used by any existing code so it isn't necessary to
backport.
2022-08-02 21:53:17 +02:00
1e5ab041d7 GPUBatch: Add GPU_batch_draw_indirect
This allows rendering a batch with parameters computed by the GPU.

Contains GL backend implementation.
2022-08-02 21:53:17 +02:00