Metal: Improve AMD EEVEE Performance #104743

Closed
Jason Fielder wants to merge 3 commits from Jason-Fielder/blender:MetalAMDPerformanceEEVEE_2 into main


3 Commits

Author SHA1 Message Date
Michael Parkin-White 17bb51b413 Remove prototype function and ensure formatting. 2023-02-16 10:41:46 +00:00
Michael Parkin-White dc42ce0273 Refactor closure_eval to allow skipping SSR. 2023-02-15 17:10:46 +00:00
Michael Parkin-White 184345e885 Metal: Improve AMD EEVEE Performance
Complex EEVEE node graphs, particularly those combining multiple Principled BSDF shader nodes, have a tendency to require a large number of simultaneously live registers due to function call depth. In some instances, this causes a substantial performance drop, and corruption if the stack gets too large.

To mitigate this, calls to closure_eval are split so that each call evaluates only a single closure, which reduces the number of live registers required. This is preferred over compound closure evaluation functions, which require a large amount of in-flight data.

Note that this is generally not faster when the stack does not spill, because the split calls increase the instruction count. The trade-off depends on the specific GPU architecture, so the change is limited to AMD GPUs.

Authored by Apple: Michael Parkin-White

Ref T96261
2023-02-13 15:47:21 +00:00
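
For readers unfamiliar with the idea, the difference between compound and split closure evaluation described in the commit message above can be sketched roughly as follows. This is an illustration only, not the GLSL changed by this PR: every name here (Light, diffuse_term, glossy_term, eval_compound, eval_split) is hypothetical, the shading math is a simplified stand-in, and Python has no register pressure, so the sketch shows only the call structure. It assumes each closure's result can be accumulated independently over the lights.

```python
from dataclasses import dataclass


@dataclass
class Light:
    intensity: float
    n_dot_l: float  # cosine between surface normal and light direction


# Hypothetical stand-ins for per-closure shading; not Blender's actual BSDF code.
def diffuse_term(light: Light) -> float:
    return light.intensity * max(light.n_dot_l, 0.0)


def glossy_term(light: Light, roughness: float) -> float:
    exponent = 1.0 / max(roughness, 1e-3)
    return light.intensity * max(light.n_dot_l, 0.0) ** exponent


def eval_compound(lights, roughness):
    """Compound evaluation: one loop keeps the state of both closures live."""
    diffuse = 0.0
    glossy = 0.0
    for light in lights:
        diffuse += diffuse_term(light)
        glossy += glossy_term(light, roughness)
    return diffuse, glossy


def eval_diffuse_only(lights):
    """Split evaluation: only the diffuse accumulator is live in this call."""
    total = 0.0
    for light in lights:
        total += diffuse_term(light)
    return total


def eval_glossy_only(lights, roughness):
    """Split evaluation: only the glossy accumulator is live in this call."""
    total = 0.0
    for light in lights:
        total += glossy_term(light, roughness)
    return total


def eval_split(lights, roughness):
    """One closure per call: the light loop runs twice, so more instructions,
    but each call carries less in-flight data."""
    return eval_diffuse_only(lights), eval_glossy_only(lights, roughness)


# Example scene data (made up for illustration).
LIGHTS = [Light(2.0, 0.5), Light(1.0, -0.2), Light(4.0, 0.25)]
```

Both paths return the same values for the same inputs. The split path repeats the light loop once per closure, which mirrors the trade-off stated in the commit message: a higher instruction count in exchange for less state held at any one time.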