MTLFramebuffer's viewport was not correctly updated when
updating attachments. Behaviour modified to be consistent
with OpenGL.
Authored by Apple: Michael Parkin-White
Ref #96261
Pull Request: blender/blender#105529
Previously, UBO bind locations were linearly incremented and
relied on the correct uniform location being queried. This fix
is a future requirement for EEVEE next, however, pulling forward
due to Issue #105280 highlighting a possible flaw with expected
uniform locations.
Authored by Apple: Michael Parkin-White
Ref #96261
Pull Request #105311
Similar to recent issues with gl_shader_interface,
ShaderInput lists need to be sorted to ensure correct
and efficient uniform lookup by name.
Authored by Apple: Michael Parkin-White
Ref #96261
Pull Request #105239
Metal LineLoop emulation path does not correctly apply
when using SSBO vertex fetch mode alongside 3D line
rendering.
Patch moves line emulation above SSBO
vertex fetch setup to ensure the correct emulation
parameters are passed to the shader.
Authored by Apple: Michael Parkin-White
Ref #96261
Pull Request #105142
Resolves issue with nearest filtering on UI Icons. Note that as
Metal does not support LOD bias as a parameter on a sampler
object, the original code has been modified to perform LOD
biasing at the shader level.
As GPU_SAMPLER_ICON is not widely used, it is more
efficient to apply directly to the affected shaders, rather
than workaround passing in the sampler LOD bias as a
separate value e.g. uniform or push constant.
Original PR feedback addressed to also refactor ICON
shaders to use consistent style for single and multi
Icon rendering.
Authored by Apple: Michael Parkin-White
Ref #96261
Pull Request #105145
This remove default casses from the `switch` statements to catch where
the missing cases are.
Uncomment unimplemented cases for the sake of completeness. Improving the
overall API.
This make the format conversion lists exhaustive and documented.
This replace `validate_data_format_mtl` by the common version as they
don't differ at all now.
For every other texture types this is expected to be implicitly
`GPU_DATA_FLOAT`. There is only one case where this is not the case.
I believe this was previously needed because the data type was
conditionning the texture creation. This is not the case anymore.
The check was triggering the 'this' pointer cannot be null in
well-defined C++ code
We do not check for this pointer in any other areas. If it is
needed due to possible opaque pointer cast to the check prior
to the cast.
Pull Request #104974
This reverts commit 19222627c6.
Something went wrong here, seems like this commit merged the main branch
into the release branch, which should never be done.
Certain material node graphs can be very expensive to run. This feature aims to produce secondary GPUPass shaders within a GPUMaterial which provide optimal runtime performance. Such optimizations include baking constant data into the shader source directly, allowing the compiler to propogate constants and perform aggressive optimization upfront.
As optimizations can result in reduction of shader editor and animation interactivity, optimized pass generation and compilation is deferred until all outstanding compilations have completed. Optimization is also delayed util a material has remained unmodified for a set period of time, to reduce excessive compilation. The original variant of the material shader is kept to maintain interactivity.
Also adding a new concept to gpu::Shader allowing assignment of a parent shader from which a shader can pull PSO descriptors and any required metadata for asynchronous shader cache warming. This enables fully asynchronous shader optimization, without runtime hitching, while also reducing runtime hitching for standard materials, by using PSO descriptors from default materials, ahead of rendering.
Further shader graph optimizations are likely also possible with this architecture. Certain scenes, such as Wanderer benefit significantly. Viewport performance for this scene is 2-3x faster on Apple-silicon based GPUs.
Authored by Apple: Michael Parkin-White
Ref T96261
Pull Request #104536
The GPU module has 2 different styles when reading back data from
GPU buffers. The SSBOs used a memcpy to copy the data to a
pre-allocated buffer. IndexBuf/VertBuf gave back a driver/platform
controlled pointer to the memory.
Readback is done for test cases returning mapped pointers is not safe.
For this reason we settled on using the same approach as the SSBO.
Copy the data to a caller pre-allocated buffer.
Reason why this API is currently changed is that the Vulkan API is more
strict on mapping/unmapping buffers that can lead to potential issues
down the road.
Pull Request #104571
Reduces the GLSL to MSL translation stage of shader compilation from 120 ms to 5 ms for complex EEVEE materials. This manifests in faster overall compilations, and faster cache hits for secondary compilations, as the MSL variant is needed as a key.
Startup time is also improved for both first-run and second-run. Note that this change does not affect shader compilation times within the Metal API.
Also disables shader output to disk
Authored by Apple: Michael Parkin-White
Ref T96261
Depends on D16990
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D17033
Replace texelFetch calls with a texture point-sample rather than a textureRead call. This increases texture cache utilisation when mixing between sampled calls and reads. Bounds checking can also be removed from these functions, reducing instruction count and branch divergence, as the sampler routine handles range clamping.
Authored by Apple: Michael Parkin-White
Ref T96261
Depends on D16923
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D17021
Also minor changes in comments:
- Reference BLENDER_HISTORY_FILE instead of the literal file-name
(simplifies looking up usage).
- Use usernames in tags, as noted in code-style.
Resolve an issue where released buffers were returned to the reusable memory pool before GPU work associated with these buffers had been encoded. Usually release of memory pools is dependent on successful completion of GPU work via command buffer callbacks. However, if the pool refresh operation occurs between encoding of work and submission, buffer ref-count is prematurely decremented.
Patch also ensures safe buffer free lists are only flushed once a set number of buffers have been used. This reduces overhead of small and frequent flushes, without raising the memory ceiling significantly.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D17118
Metal backend does not support primtiive restart for point primtiives. Hence strip_restart_indices removes restart indices by swapping them to the end of the index buffer and reducing the length.
An edge-case existed where all indices within the index buffer were restarts and no valid swap-index would be found, resulting in a buffer underflow.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D17088
This patch adds support for compilation and execution of GLSL compute shaders. This, along with a few systematic changes and fixes, enable realtime compositor functionality with the Metal backend on macOS. A number of GLSL source modifications have been made to add the required level of type explicitness, allowing all compilations to succeed.
GLSL Compute shader compilation follows a similar path to Vertex/Fragment translation, with added support for shader atomics, shared memory blocks and barriers.
Texture flags have also been updated to ensure correct read/write specification for textures used within the compositor pipeline. GPU command submission changes have also been made in the high level path, when Metal is used, to address command buffer time-outs caused by certain expensive compute shaders.
Authored by Apple: Michael Parkin-White
Ref T96261
Ref T99210
Reviewed By: fclem
Maniphest Tasks: T99210, T96261
Differential Revision: https://developer.blender.org/D16990
Improve handling for cases where maximum in-flight command buffer count is exceeded. This can occur during light-baking operations. Ensures the application handles this gracefully and also improves workload pipelining by situationally stalling until GPU work has completed, if too much work is queued up.
This may have a tangible benefit for T103742 by ensuring Blender does not queue up too much GPU work.
Authored by Apple: Michael Parkin-White
Ref T96261
Ref T103742
Depends on D17018
Reviewed By: fclem
Maniphest Tasks: T103742, T96261
Differential Revision: https://developer.blender.org/D17019
AMD GPUs do not appear to produce consistent results with other GPUs when using textureGather in the Metal backend. Disabling for now to ensure correct function of outline rendering.
This may require an additional sub-pixel offset in the texture sampling calls, to achieve correct behaviour.
Authored by Apple: Michael Parkin-White
Ref T103412
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T103412, T96261
Differential Revision: https://developer.blender.org/D16934
This patch fixes an issue where Blender 3.5 alpha with the Metal GPU backend enabled on Japanese macOS fails to compile shaders and crashes on startup.
In a Japanese environment, `defaultCStringEncoding` is the legacy MacJapanese encoding, and it erroneously converts backslashes (0x5c) to Yen symbols (¥).
Therefore, Metal shader compile fails with the following log and Blender crashes.
```
2022-12-29 13:50:10.200 Blender[13404:246707] Compile Error - Metal Shader Library (Stage: 0), error Error Domain=MTLLibraryErrorDomain Code=3 "program_source:225:74: error: non-ASCII characters are not allowed outside of literals and identifiers
template<typename T, access A = access::sample> struct STRUCT_NAME { ¥
^
program_source:226:14: error: no template named 'TEX_TYPE'
thread TEX_TYPE<T, A> *texture; ¥
^
program_source:226:39: error: non-ASCII characters are not allowed outside of literals and identifiers
thread TEX_TYPE<T, A> *texture; ¥
^
program_source:227:29: error: non-ASCII characters are not allowed outside of literals and identifiers
thread sampler *samp; ¥
^
...
```
We can use `stringWithUTF8String` instead.
Reviewed By: fclem, MichaelPW
Differential Revision: https://developer.blender.org/D16881
`batch_for_shader` is an utility function that creates the correct
vertex buffer based on the given shader. In the shader interface
the `attr_types_` contains the GPUType for each location in the
vertex buffer.
When using Metal, the `attr_types_` was never updated, resulting
in using incorrect or non-existing data types. This patch fixes
this by updating the `attr_types_` when building the shader
interface.
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D17042
Implementation didn't count the string terminator when allocating
memory to store `msl_patch_default`. The string terminator could
be overwritted by other memory adding some undefined behavior.
OpenGL is deprecated by Apple and triggers a warning when used. The goal
is that OpenGL is replaced by Metal backend, but we are not there yet.
To improve tracability of new warnings we hide deprecation warnings
when the GHOST_ContextCGL.h file is included.
NOTE: This change silences other deprecation warnings as well.
Staging texture update copied over the entire texture, rather than just the region of the texture which had been updated. Also added early-exit for cases where the net texture update extent was zero, as this was causing validation failures.
Authored by Apple: Michael Parkin-White
Ref T103658
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T103658, T96261
Differential Revision: https://developer.blender.org/D16924
First binding of a framebuffer lead to an incorrect SRGB conversion state being applied, as attachments, where presence of SRGB is determined, were processed after the SRGB check rather than before.
This DIFF also cleans up SRGB naming conventions and caching of fallback non-srgb texture view, for use when SRGB mode is disabled.
Authored by Apple: Michael Parkin-White
Ref T103399
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T103399, T96261
Differential Revision: https://developer.blender.org/D16907