Commit Graph

12463 Commits

Author SHA1 Message Date
dbca0cc9d5 Fix crash on exit under Wayland
Order of free error from [0] caused the timer manager
to be freed before the timer.

[0]: 7de1a4d1d8
2023-02-07 15:12:05 +11:00
a99022e22d Cleanup: spelling in comments 2023-02-07 14:17:01 +11:00
6dcfb6df9c Cycles: Abstract host memory fallback for GPU devices
Host memory fallback in CUDA and HIP devices is almost identical.
We remove duplicated code and create a shared generic version that
other devices (oneAPI) will be able to use.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17173
2023-02-06 22:19:32 +01:00
Michael Jones
2d994de77c Cycles: MetalRT optimisation for subsurface intersection queries
This patch optimises subsurface intersection queries on MetalRT. Currently intersect_local traverses from the scene root, retrospectively discarding all non-local hits. Using a lookup of bottom level acceleration structures, we can explicitly query only the relevant instance. On M1 Max, with MetalRT selected, this can give a render speedup of 15-20% for scenes like Monster which make heavy use of subsurface scattering.

Patch authored by Marco Giordano.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17153
2023-02-06 19:12:29 +00:00
773a36d2f8 Fix Cycles OneAPI build error after recent changes 2023-02-06 15:36:49 +01:00
f2538c7173 Fix T104335: MNEE + OptiX OSL results in illegal address error
The OptiX pipeline created for OSL was missing sufficient continuation
stack to handle the MNEE ray generation program.
2023-02-06 15:06:52 +01:00
9ad3a85f8b Fix Cycles GPU binaries build error after recent changes for Metal 2023-02-06 13:17:57 +01:00
Michael Jones
654e1e901b Cycles: Use local atomics for faster shader sorting (enabled on Metal)
This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high".

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16909
2023-02-06 11:18:26 +00:00
Michael Jones
46c9f7702a Cycles: Enable MetalRT opt-in for AMD/Navi2 GPUs
Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17043
2023-02-06 11:14:11 +00:00
Michael Jones
be0912a402 Cycles: Prevent use of both AMD and Intel Metal devices at same time
This patch removes the option to select both AMD and Intel GPUs on system that have both. Currently both devices will be selected by default which results in crashes and other poorly understood behaviour. This patch adds precedence for using any discrete AMD GPU over an integrated Intel one. This can be overridden with CYCLES_METAL_FORCE_INTEL.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17166
2023-02-06 11:13:33 +00:00
Michael Jones
0a3df611e7 Fix T103393: Cycles: Undefine __LIGHT_TREE__ on Metal/AMD to fix perf
This patch fixes T103393 by undefining `__LIGHT_TREE__` on Metal/AMD as it has an unexpected & major impact on performance even when light trees are not in use.

Patch authored by Prakash Kamliya.

Reviewed By: brecht

Maniphest Tasks: T103393

Differential Revision: https://developer.blender.org/D17167
2023-02-06 11:12:34 +00:00
329eeacc66 Cleanup: Cycles: Remove isotropic microfacet closure setup functions
Turns out these are 100% redundant, so get rid of them.
2023-02-06 04:26:36 +01:00
731c3efd97 Cleanup: format 2023-02-06 12:32:45 +11:00
7de1a4d1d8 Fix GHOST/Wayland thread-unsafe timer-manager manipulation
Mutex locks for manipulating GHOST_System::m_timerManager from
GHOST_SystemWayland relied on WAYLAND being the only user of the
timer-manager.

This isn't the case as timers are fired from
`GHOST_System::dispatchEvents`.

Resolve by using a separate timer-manager for wayland key-repeat timers.
2023-02-06 12:25:04 +11:00
d3949a4fdb Fix GHOST/Wayland thread-unsafe key-repeat timer checks
Resolve a thread safety issue reported by valgrind's helgrind checker,
although I wasn't able to redo the error in practice.

NULL check on the key-repeat timer also needs to lock, otherwise it's
possible the timer is set in another thread before the lock is acquired.

Now all key-repeat timer access which may run from a thread
locks the timer mutex before any checks or timer manipulation.
2023-02-06 12:25:04 +11:00
73000c792d Cycles: Reorganize Fresnel handling in Microfacet closures
This is both a cleanup and a preparation for the Principled v2 changes.
Notable changes:
- Clearcoat weight is now folded into the closure weight, there's no reason
  to track this separately.
- There's a general-purpose helper for computing a Closure's albedo, which is
  currently used by the denoising albedo and diffuse/gloss/transmission color
  passes.
- The d/g/t color passes didn't account for closure albedo before, this means
  that e.g. metallic shaders with Principled v2 now have their color texture
  included in the glossy color pass. Also fixes T104041 (sheen albedo).
- Instead of precomputing and storing the albedo during shader setup, compute
  it when needed. This is technically redundant since we still need to compute
  it on shader setup to adjust the sample weight, but the operation is cheap
  enough that freeing up the storage seems worth it.
- Future changes (Principled v2) are easier to integrate since the Fresnel
  handling isn't all over the place anymore.
- Fresnel handling in the Multiscattering GGX code is still ugly, but since
  removing that entirely is the next step, putting effort into cleaning it up
  doesn't seem worth it.
- Apart from the d/g/t color passes, no changes to render results are expected.

Differential Revision: https://developer.blender.org/D17101
2023-02-03 21:03:48 +01:00
b454416927 Cycles: add non-uniform scaling to spot light size
Cycles ignores the size of spot lights, therefore the illuminated area doesn't match the gizmo. This patch resolves this discrepancy.
| Before (Cycles) | After (Cycles) | Eevee
|{F14200605}|{F14200595}|{F14200600}|
This is done by scaling the ray direction by the size of the cone. The implementation of `spot_light_attenuation()` in `spot.h` matches `spot_attenuation()` in `lights_lib.glsl`.
**Test file**:
{F14200728}

Differential Revision: https://developer.blender.org/D17129
2023-02-03 18:51:14 +01:00
ab3e1809b6 Vulkan: Select graphics queue with compute capabilities.
There might be cases that the graphics queue and compute
queue are separated, but that would be something that we
will solve in the future.
2023-02-03 10:16:01 +01:00
a45a881534 Spelling: familly -> family.
Fix spelling mistake originated from tmp-vulkan branch.
2023-02-03 10:16:01 +01:00
266d8de687 Cleanup: spelling in comments 2023-02-03 12:41:01 +11:00
fa9fc59b56 Fix T104240: OptiX OSL texture loading broken with displacement
The image manager used to handle OSL textures on the GPU by
default loads images after displacement is evaluated. This is a
problem when the displacement shader uses any textures, hence
why the geometry manager already makes the image manager
load any images used in the displacement shader graph early
(`GeometryManager::device_update_displacement_images`).
This only handled Cycles image nodes however, not OSL nodes, so
if any `texture` calls were made in OSL those would be missed and
therefore crash when accessed on the GPU. Unfortunately it is not
simple to determine which textures referenced by OSL are needed
for displacement, so the solution for now is to simply load all of
them early if true displacement is used.
This patch also fixes the result of the displacement shader not
being used properly in OptiX.

Maniphest Tasks: T104240

Differential Revision: https://developer.blender.org/D17162
2023-01-31 16:41:00 +01:00
79c82fc1c5 Cleanup: trailing space 2023-01-31 15:49:04 +11:00
27b4916b1a Cleanup: spelling in comments
Also minor changes in comments:
- Reference BLENDER_HISTORY_FILE instead of the literal file-name
  (simplifies looking up usage).
- Use usernames in tags, as noted in code-style.
2023-01-31 14:22:23 +11:00
129093fbce Cycles: Fix crash when rendering with OSL on multiple GPUs
The `MultiDevice` implementation of `get_cpu_osl_memory` returns a
nullptr when there is no CPU device in the mix. As such access to that
crashed in `update_osl_globals`. But that only updates maps that are not
currently used on the GPU anyway, so can just skip that when the CPU
is not used for rendering.

Maniphest Tasks: T104216
2023-01-30 19:40:22 +01:00
a36c1cabce Vulkan: Changes to CMake config.
Paths to vulkan libraries, paths and related components were
hardcoded in the platform cmake file. This patch separates
this by using adding CMake modules for Vulkan and ShaderC.

This change has only been applied to the macOs configuration as
that is currently our main platform for development. Other platforms
will be added during the development of the Vulkan back-end.
2023-01-30 12:04:44 +01:00
4635dd6aed Fix T104157: Deleting an active OSL node causes issues
Removing all OSL script nodes from the shader graph would cause that
graph to no longer report it using `KERNEL_FEATURE_SHADER_RAYTRACE`
via `ShaderManager::get_graph_kernel_features`, but the shader object
itself still would have the `has_surface_raytrace` field set.
This caused kernels to be reloaded without shader raytracing support, but
later the `DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE`
kernel would still be invoked since the shader continued to report it
requiring that through the `SD_HAS_RAYTRACE` flag set because of
`has_surface_raytrace`.
Fix that by ensuring `has_surface_raytrace` is reset on every shader update,
so that when all OSL script nodes are deleted it is set to false, and only
stays true when there are still OSL script nodes (or other nodes using it).

Maniphest Tasks: T104157

Differential Revision: https://developer.blender.org/D17140
2023-01-27 16:14:25 +01:00
b898e00edc Cleanup: remove unused KernelGlobals in microfacet BSDF 2023-01-25 11:26:51 +01:00
adcb0edca0 Cleanup: clang-tidy, replace defines with enum, redundant parenthesis 2023-01-25 11:53:06 +11:00
e308b891c8 Cycles: Use faster and exact GGX VNDF sampling algorithm
Based on "Sampling the GGX Distribution of Visible Normals" by Eric Heitz
(https://jcgt.org/published/0007/04/01/).

Also, this removes the lambdaI computation from the Beckmann sampling code and
just recomputes it below. We already need to recompute for two other cases
(GGX and clearcoat), so this makes the code more consistent.

In terms of performance, I don't expect a notable impact since the earlier
computation also was non-trivial, and while it probably was slightly more
accurate, I'd argue that being consistent between evaluation and sampling is
more important than absolute numerical accuracy anyways.

Differential Revision: https://developer.blender.org/D17100
2023-01-24 17:59:29 +01:00
fdcb55b285 Cycles: Switch microfacet code to non-separable shadowing-masking term
This gives closer results to what I've seen in papers and other renderers when
using the code to precompute albedo later (to replace MultiGGX).

It's usually a tiny difference, the only case where I've seen it matter is
in the `shader/node_group_float.blend` test - but that's a (single-scatter) GGX
closure with 0.9 roughness, so it's not too surprising. In any case, the new
result looks closer to Eevee, so that's good I guess.

Differential Revision: https://developer.blender.org/D17099
2023-01-24 17:59:29 +01:00
ce25e3e581 Cycles: Cleanup: Add general-purpose conversion between sin and cos 2023-01-24 17:59:29 +01:00
1c90f8209d Cycles: fix rendering with Nishita Sky Texture on Intel Arc GPUs
Speckles and missing lights were experienced in scenes with Nishita Sky
Texture and a Sun Size smaller than 1.5°, such as in Lone Monk and Attic
scenes.
Increasing the precision of cosf fixes it.
2023-01-24 09:58:22 +01:00
8afcecdf1f Cycles: update Intel Graphics compiler to 101.4032 on Windows
A noticeable (>5%) performance regression in oneAPI backend came with
a501a2dbff. Updating to latest graphics
compiler from driver 101.4032 fixes it.

I've tested it with current min-supported drivers and it runs well but
since compatibility of graphics compiler with older drivers isn't
guaranteed, I'm also bumping the min-supported driver versions.

If end-users consider latest drivers too fresh to switch to (version
isn't released as stable on Linux as of today but should be before
Blender 3.5 release), CYCLES_ONEAPI_ALL_DEVICES=1 env variable can be
used.

Intel Graphics Compiler on Linux will be updated in a later commit
so we can then close D16984.

Reviewed By: sergey, LazyDodo
2023-01-23 19:36:34 +01:00
Jason Fielder
84c25fdcaa Metal: Improve command buffer handling and workload scheduling.
Improve handling for cases where maximum in-flight command buffer count is exceeded. This can occur during light-baking operations. Ensures the application handles this gracefully and also improves workload pipelining by situationally stalling until GPU work has completed, if too much work is queued up.

This may have a tangible benefit for T103742 by ensuring Blender does not queue up too much GPU work.

Authored by Apple: Michael Parkin-White

Ref T96261
Ref T103742
Depends on D17018

Reviewed By: fclem

Maniphest Tasks: T103742, T96261

Differential Revision: https://developer.blender.org/D17019
2023-01-23 17:47:21 +01:00
8e56ded86d Cycles: temporarily disable AMD Vega GPU rendering due to compiler bug
To make daily builds pass while we figure this out.

Ref T104097
2023-01-23 17:30:12 +01:00
537db96fb7 GHOST/NDOF: don't send button events when there is no active window
NDOF events without an active window were ignored and printed
warnings in the console.
2023-01-22 21:06:10 +11:00
cbd15d387f GHOST/Wayland: don't send activate/deactivate on pointer enter/leave
This isn't correct as window activation is handled separately
from the cursor entering/leaving a window.

This would call de-activate when the cursor moved outside the window
even though the window remained focused.

Rely on focus changes which already handle activate/deactivate events.
2023-01-21 23:09:22 +11:00
37dfce550f Fix Cycles CUDA compiler warning with if constexpr
This is a C++17 feature, compiler should be able to figure this out
without the hint.
2023-01-20 20:31:40 +01:00
f31f7e3ef0 Cleanup: Remove unused light_sample_is_light() function.
This also fixes compile warnings on MSVC.
2023-01-20 17:36:48 +01:00
721bd5e6cf Fix invalid swapBuffer calls & outdated window decorations on Wayland
Swap-buffers was being deferred (to prevent it being called
from the event handling thread) however when it was called the
OpenGL context might not be active (especially with multiple windows).

Moving the cursor between windows made eglSwapBuffers report:
EGL Error (0x300D): EGL_BAD_SURFACE.

Resolve this by removing swapBuffer calls and instead add a
GHOST_kEventWindowUpdateDecor event intended for redrawing
client-side-decoration.

Besides the warning, this results an error with LIBDECOR window frames
not redrawing when a window became inactive.
2023-01-20 18:05:52 +11:00
844cca9984 Cleanup: spelling in comments 2023-01-20 15:19:32 +11:00
05bdef7ce6 Fix T103094: Cycles ignores small suns in Nishita sky
The background evaluation samples the sky discretely, so if the sun is
too small, it can be missed in the evaluation. To solve this, the sun is
ignored during the background evaluation and its contribution is
computed separately.
2023-01-19 18:31:54 -06:00
fe552bf236 Cleanup: make format 2023-01-19 22:48:05 +01:00
670b3c5013 Cleanup: compiler warnings 2023-01-19 22:48:05 +01:00
f71bfe4655 Fix anisotropic Beckmann regression test failing on Metal
The lookup table method on CPU and the numerical root finding method on
GPU give quite different results. This commit deletes the Beckmann lookup
table and uses numerical root finding on all devices. For the numerical
root finding, a combined bisection-Newton method with precision control
is used.

Differential Revision: https://developer.blender.org/D17050
2023-01-19 20:12:05 +01:00
9066f2e043 Cycles: Add support for OSL texture intrinsic on the GPU
This makes it possible to use `texture` and `texture3d` in custom
OSL shaders with a constant image file name as argument on the
GPU, where previously texturing was only possible through Cycles
nodes.
For constant file name arguments, OSL calls
`OSL::RendererServices::get_texture_handle()` with the file name
string to convert it into an opaque handle for use on the GPU.
That is now used to load the respective image file using the Cycles
image manager and generate a SVM handle that can be used on
the GPU. Some care is necessary as the renderer services class is
shared across multiple Cycles instances, whereas the Cycles image
manager is local to each.

Maniphest Tasks: T101222

Differential Revision: https://developer.blender.org/D17032
2023-01-19 19:14:48 +01:00
Michael Jones
e270a198a5 Cycles: Markup to disable specialisation of kernel data fields (Metal)
This patch adds markup to specify that certain kernel data constants should not be specialised. Currently it is used for `tabulated_sobol_sequence_size` and `sobol_index_mask` which change frequently based on the aa sample count, trash the shader cache, and have little bearing on performance.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16968
2023-01-19 17:57:42 +00:00
Michael Jones
08b3426df9 Cycles: Occupancy tuning for new higher end M2 machines
This patch adds occupancy tuning for the newly announced high-end M2 machines, giving 10-15% render speedup over a pre-tuned build.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17037
2023-01-19 17:56:40 +00:00
9b7c2cca3d Refactor: replace bool beckmann with enum MicrofacetType for readability
Differential Revision: https://developer.blender.org/D17044
2023-01-19 15:55:07 +01:00
320757bc61 Refactor microfacet BSDF to reduce repetition 2023-01-19 12:07:53 +01:00