Commit Graph

151 Commits

Author SHA1 Message Date
Michael Jones
2d994de77c Cycles: MetalRT optimisation for subsurface intersection queries
This patch optimises subsurface intersection queries on MetalRT. Currently intersect_local traverses from the scene root, retrospectively discarding all non-local hits. Using a lookup of bottom level acceleration structures, we can explicitly query only the relevant instance. On M1 Max, with MetalRT selected, this can give a render speedup of 15-20% for scenes like Monster which make heavy use of subsurface scattering.

Patch authored by Marco Giordano.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17153
2023-02-06 19:12:29 +00:00
773a36d2f8 Fix Cycles OneAPI build error after recent changes 2023-02-06 15:36:49 +01:00
9ad3a85f8b Fix Cycles GPU binaries build error after recent changes for Metal 2023-02-06 13:17:57 +01:00
Michael Jones
654e1e901b Cycles: Use local atomics for faster shader sorting (enabled on Metal)
This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high".

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16909
2023-02-06 11:18:26 +00:00
79c82fc1c5 Cleanup: trailing space 2023-01-31 15:49:04 +11:00
27b4916b1a Cleanup: spelling in comments
Also minor changes in comments:
- Reference BLENDER_HISTORY_FILE instead of the literal file-name
  (simplifies looking up usage).
- Use usernames in tags, as noted in code-style.
2023-01-31 14:22:23 +11:00
1c90f8209d Cycles: fix rendering with Nishita Sky Texture on Intel Arc GPUs
Speckles and missing lights were experienced in scenes with Nishita Sky
Texture and a Sun Size smaller than 1.5°, such as in Lone Monk and Attic
scenes.
Increasing the precision of cosf fixes it.
2023-01-24 09:58:22 +01:00
a84a8a528d Cycles: remove SSE3 and AVX kernel optimization levels
While keeping SSE2, SSE4.1 and AVX2. This does not affect hardware support, it
only slightly reduces performance for some older CPUs.

To reduce maintenance cost and improve compile times.

Differential Revision: https://developer.blender.org/D16978
2023-01-16 17:53:36 +01:00
858fffc2df Cycles: oneAPI: add support for SYCL host task
This functionality is related only to debugging of SYCL implementation
via single-threaded CPU execution and is disabled by default.
Host device has been deprecated in SYCL 2020 spec and we removed it
in 305b92e05f.
Since this is still very useful for debugging, we're restoring a
similar functionality here through SYCL 2020 Host Task.
2023-01-03 20:47:24 +01:00
Hallam Roberts
a501a2dbff Images: add mirror extension type
This adds a new mirror image extension type for shaders and
geometry nodes (next to the existing repeat, extend and clip
options).

See D16432 for a more detailed explanation of `wrap_mirror`.

This also adds a new sampler flag `GPU_SAMPLER_MIRROR_REPEAT`.
It acts as a modifier to `GPU_SAMPLER_REPEAT`, so any `REPEAT`
flag must be set for the `MIRROR` flag to have an effect.

Differential Revision: https://developer.blender.org/D16432
2022-12-14 19:27:29 +01:00
222b64fcdc Fix Cycles CUDA crash when building kernels without optimizations (for debug)
In this case the blocksize may not the one we requested, which was assumed to be
the case. Instead get the effective block size from the compiler as was already
done for Metal and OneAPI.
2022-11-30 21:46:17 +01:00
b0e2e45496 Cycles: Enable MetalRT pointclouds & other fixes
Code authored by Marco Giordano.

This fixes pointcloud rendering on MetalRT and some other subtle MetalRT bugs:
- Incorrect kernel hashing
- Missing specialisation constants
- Incorrect visibility filtering
- Missing null pointer check

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16499
2022-11-14 16:39:18 +00:00
e6b38deb9d Cycles: Add basic support for using OSL with OptiX
This patch  generalizes the OSL support in Cycles to include GPU
device types and adds an implementation for that in the OptiX
device. There are some caveats still, including simplified texturing
due to lack of OIIO on the GPU and a few missing OSL intrinsics.

Note that this is incomplete and missing an update to the OSL
library before being enabled! The implementation is already
committed now to simplify further development.

Maniphest Tasks: T101222

Differential Revision: https://developer.blender.org/D15902
2022-11-09 15:30:21 +01:00
Brecht Van Lommel
e1b3d91127 Refactor: replace Cycles sse/avx types by vectorized float4/int4/float8/int8
The distinction existed for legacy reasons, to easily port of Embree
intersection code without affecting the main vector types. However we are now
using SIMD for these types as well, so no good reason to keep the distinction.

Also more consistently pass these vector types by value in inline functions.
Previously it was partially changed for functions used by Metal to avoid having
to add address space qualifiers, simple to do it everywhere.

Also removes function declarations for vector math headers, serves no real
purpose.

Differential Revision: https://developer.blender.org/D16146
2022-11-08 12:28:40 +01:00
305b92e05f Cycles: oneAPI: remove use of SYCL host device
Host device is deprecated in SYCL 2020 spec, cpu device or standard C++
should be used instead.
2022-10-21 15:36:48 +02:00
e2a93e9c7c Fix T94136: Cycles: No Hair Shadows with Transparent BSDF 2022-10-20 04:47:21 +02:00
Morteza Mostajab
e6902d19a0 Cycles: Allow Intel GPUs under Metal
Known Issues:
- Command buffer failures when using binary archives (binary archives is disabled for Intel GPUs as a workaround)
- Wrong texture sampler being applied (to be addressed in the future)

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D16253
2022-10-19 17:09:38 +01:00
5bfce9a822 Cycles: oneAPI: preload kernels only when not using prebuilt binaries
sycl::build triggers compilation even if prebuilt binaries are
available, we'll have to find a better way in this case.
2022-10-19 16:42:10 +02:00
2943997d2a Cycles: oneAPI: include sycl/sycl.hpp instead of CL/sycl.hpp
Since SYCL 2020 API, sycl/sycl.hpp is the way.
2022-10-19 16:42:10 +02:00
58324f0c86 Cycles: oneAPI: Make test kernel more representative
Test kernel will now test functionalities related to kernel execution
with USM memory allocations instead of with SYCL buffers and accessors
as these aren't currently used in the backend.
2022-10-14 11:22:11 +02:00
82a5790d2a Cycles: oneAPI: Trigger compilation of used kernels only
JIT compilation of oneAPI kernels now happens during load stage
and proper message gets shown in the GUI during compilation.
Also, this implementation skips kernels that aren't needed for
the used scene, reducing overall (re)compilation time.
2022-10-10 16:38:11 +02:00
7eeeaec6da Cycles: use direct linking for oneAPI backend
This is a minimal set of changes, allowing a lot of cleanup that can
happen afterward as it allows sycl method and objects to be used outside
of kernel.cpp.

Reviewed By: brecht, sergey

Differential Revision: https://developer.blender.org/D15397
2022-10-07 09:50:05 +02:00
Michael Jones
2b88ee50fb Cycles: Tweak inlining policy on Metal
This patch optimises the Metal inlining policy. It gives a small speedup (2-3% on M1 Max) with no notable compilation slowdown vs what is already in master. Previously noted compilation slowdowns (as reported in T100102) were caused by forcing inlining for `ccl_device`, but we get better rendering perf by relying on compiler heuristics in these cases.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16081
2022-09-27 17:01:28 +01:00
Sebastian Herhoz
75a6d3abf7 Cycles: add Path Guiding on CPU through Intel OpenPGL
This adds path guiding features into Cycles by integrating Intel's Open Path
Guiding Library. It can be enabled in the Sampling > Path Guiding panel in the
render properties.

This feature helps reduce noise in scenes where finding a path to light is
difficult for regular path tracing.

The current implementation supports guiding directional sampling decisions on
surfaces, when the material contains a least one diffuse component, and in
volumes with isotropic and anisotropic Henyey-Greenstein phase functions.

On surfaces, the guided sampling decision is proportional to the product of
the incident radiance and the normal-oriented cosine lobe and in volumes it
is proportional to the product of the incident radiance and the phase function.

The incident radiance field of a scene is learned and updated during rendering
after each per-frame rendering iteration/progression.

At the moment, path guiding is only supported by the CPU backend. Support for
GPU backends will be added in future versions of OpenPGL.

Ref T92571

Differential Revision: https://developer.blender.org/D15286
2022-09-27 15:56:32 +02:00
125ac1f914 Cycles: increase min-supported driver version for Intel GPUs
Windows drivers 101.3430 fix an important GUI-related crash and it's
best to prevent users from running into it.
Linux drivers weren't affected but still had relevant gpu binary
compatibility fixes, so it makes sense to keep the min-supported version
aligned across OSes.
2022-09-26 07:41:47 -07:00
Werner, Stefan
0c824837ab Cycles: Cleanup in oneAPI math includes and definitions
Now explicitly including math.h first before #defining funcitons.
This avoids undefined behavior and improves compatibility with
different SYCL compilers and backends.
2022-09-22 11:33:57 +02:00
6d08ba8a50 Fix T100824: Cycles GPU render broken on macOS 13 Beta and Apple silicon
The recent revert of Apple silicon inlining changes to avoid long compile times
worked on macOS 12, but in macOS 13 Beta it results in render errors. This may
be a compiler bug and perhaps get fixed in time, but try to be on the safe side
and ensure Blender 3.3.0 works regardless.

This brings part of the inlining back, which brings improved performance but
also longer compiler times again. Compile time is around 2min now, where the
previous full inlining was about 5-7min.

Patch by Michael Jones.

Differential Revision: https://developer.blender.org/D15897
2022-09-06 19:11:52 +02:00
6c6a53fad3 Cleanup: spelling in comments, formatting, move comments into headers 2022-09-06 16:25:20 +10:00
cf57624764 Cleanup: refactoring of kernel film function names and organization 2022-09-02 17:13:28 +02:00
3e73afb536 Merge branch 'blender-v3.3-release' 2022-08-31 15:34:44 +02:00
b1231e616a Cycles: Enforce Windows driver version requirements for sycl
sycl/L0 runtime reports compute-runtime version since Intel graphics
driver 101.3268 on Windows, when querying driver version from sycl.
Prior to this driver, it was 0. Now we can bump minimum requirement to
this one and filter-out devices returning 0.

Maniphest Tasks: T100648
2022-08-31 15:33:16 +02:00
658ff994c5 Merge branch 'blender-v3.3-release' 2022-08-29 19:21:49 +02:00
805d1063a0 Cycles: Remove "return" and "assert" from oneAPI kernel code 2022-08-29 19:18:50 +02:00
48e1a66af0 Merge branch 'blender-v3.3-release' 2022-08-29 18:21:56 +02:00
1cd8ca49f9 Cycles: Increased minimum supported driver for Windows in oneAPI 2022-08-29 18:10:56 +02:00
d4764a385a Merge branch 'blender-v3.3-release' 2022-08-25 11:50:55 +02:00
9c2bc57cbd Fix Cycles oneAPI for a newer DPC++ compiler version 2022-08-25 11:50:22 +02:00
9961aae1e6 Merge branch 'blender-v3.3-release' 2022-08-18 20:31:34 +02:00
e11c899e71 Cycles: disable Metal inlining optimization on Apple GPUs
This gave a 1.1x speedup, however also leads to very long compile times
that make it seems like Blender has stopped working.

This can be brought back in the future behind an option that users can
explicitly enabled.

Fix T100102

Ref D14923, D14763, T92212
2022-08-18 20:01:29 +02:00
3aeacb9ab3 Merge branch 'blender-v3.3-release' 2022-08-15 13:53:42 +02:00
c2c019dda8 Fix Cycles MetalRT compile error 2022-08-13 19:55:38 +02:00
1988665c3c Cleanup: make vector types make/print functions consistent between CPU and GPU
Now all the same ones are available on CPU and GPU, which was previously not
possible due to lack of operator overloadng in OpenCL. Print functions are
no-ops on some GPUs.

Ref D15535
2022-08-09 16:07:23 +02:00
47f433c776 Merge branch 'blender-v3.3-release' 2022-08-08 11:00:10 -03:00
1382514bf2 Fix: Error in oneAPI image code for texture access with clip extension 2022-08-08 10:47:11 +02:00
fafd1ab9d3 Merge branch 'blender-v3.3-release' 2022-08-05 19:49:12 +02:00
fa514564b0 Fix T99201: Cycles render difference with 3D hair curves between OptiX and Emrbee
It should consistently use the Cycles pirmitive ID for self intersection detection,
not the one from the OptiX or Embree acceleration structure.

Differential Revision: https://developer.blender.org/D15632
2022-08-05 15:03:47 +02:00
19a7a013ce Merge branch 'blender-v3.3-release' 2022-08-01 14:37:16 +02:00
76169472d3 Cycles: Resolve recent performance regression in oneAPI implementation for Intel® Arc™ GPUs
Recently, performance with oneAPI have regressed due some recent
changes in Blender itself. This commit's changes is resolving this
and also improve compilation time for oneAPI backend first
execution (or Blender compilation time in case of AoT).

Regression have appeared after 5152c7c152 and not related to the
changes itself, but increase of kernels complexity introduced with
it. Changes in this commit is marking some Blender functions as
noinlined for oneAPI backend, which helps GPU compiler to deal with
this complexity without any negative side-effects on performance.
2022-08-01 12:45:34 +02:00
79ab76e156 Cleanup: simplifications and consistency for vector types
* OneAPI: remove separate float3 definition
* OneAPI: disable operator[] to match other GPUs
* OneAPI: make int3 compact to match other GPUs
* Use #pragma once
* Add __KERNEL_NATIVE_VECTOR_TYPES__ to simplify checks
* Remove unused vector3
2022-07-28 21:27:13 +02:00
38af5b0501 Cycles: switch Cycles triangle barycentric convention to match Embree/OptiX
Simplifies intersection code a little and slightly improves precision regarding
self intersection.

The parametric texture coordinate in shader nodes is still the same as before
for compatibility.
2022-07-27 21:03:33 +02:00