Commit Graph

23 Commits

Author SHA1 Message Date
ecdfb465cc Cycles: Fix usage of memory barriers in split kernel
On user level this fixes dead-lock of OpenCL render on Intel Iris GPUs.
Note that this patch does not include change in the logic which allows
or disallows OpenCL platforms to be used, that will happen after the
kernel fix is known to be fine for the currently officially supported
platforms.

The dead-lock was caused by wrong usage of memory barriers: as per the
OpenCL specification the barrier is to be executed by the entire work
group. This means, that the following code is invalid:

  void foo() {
    if (some_condition) {
      return;
    }
    barrier(CLK_LOCAL_MEM_FENCE);
  }

  void bar() {
    foo();
  }

The Cycles code was mentioning this as an invalid code on CPU, while in
fact this is invalid as per specification. From the implementation side
this change removes the ifdefs around the CPU-only barrier logic, and
is implementing similar logic in the shader setup kernel.

Tested on NUC8i7HVK NUC.

The root cause of the dead-lock was identified by Max Dmitrichenko.

There is no measurable difference in performance of currently supported
OpenCL platforms.

Differential Revision: https://developer.blender.org/D9039
2020-09-30 16:10:35 +02:00
Lukas Stockner
3437c9c3bf Cycles: perform clamping per light contribution instead of whole path
With upcoming light group passes, for them to sum up correctly to the combined
pass the clamping must be more fine grained.

This also has the advantage that if one light is particularly noisy, it does
not diminish the contribution from other lights which do not need as much
clamping.

Clamp values on existing scenes will need to be tweaked to get similar results,
there is no automatic conversion possible which would give the same results as
before.

Implemented by Lukas, with tweaks by Brecht.

Part of D4837
2019-12-12 13:04:43 +01:00
c47d669f24 Cleanup: comments (long lines) in cycles 2019-05-01 21:41:07 +10:00
e12c08e8d1 ClangFormat: apply to source, most of intern
Apply clang format as proposed in T53211.

For details on usage and instructions for migrating branches
without conflicts, see:

https://wiki.blender.org/wiki/Tools/ClangFormat
2019-04-17 06:21:24 +02:00
Stefan Werner
e58c6cf0c6 Cycles: Added Cryptomatte output.
This allows for extra output passes that encode automatic object and material masks
for the entire scene. It is an implementation of the Cryptomatte standard as
introduced by Psyop. A good future extension would be to add a manifest to the
export and to do plenty of testing to ensure that it is fully compatible with other
renderers and compositing programs that use Cryptomatte.

Internally, it adds the ability for Cycles to have several passes of the same type
that are distinguished by their name.

Differential Revision: https://developer.blender.org/D3538
2018-10-28 05:37:41 -04:00
8a72be7697 Cycles: reduce closure memory usage for emission/shadow shader data.
With a Titan Xp, reduces path trace local memory from 1092MB to 840MB.
Benchmark performance was within 1% with both RX 480 and Titan Xp.

Original patch was implemented by Sergey.

Differential Revision: https://developer.blender.org/D2249
2017-11-05 20:48:33 +01:00
5bb677e592 Code refactor: zero render buffers outside of kernel.
This was originally done with the first sample in the kernel for better
performance, but it doesn't work anymore with atomics. Any benefit was
very minor anyway, too small to measure it seems.
2017-10-04 21:11:14 +02:00
e3e16cecc4 Code refactor: remove rng_state buffer and compute hash on the fly.
A little faster on some benchmark scenes, a little slower on others, seems
about performance neutral on average and saves a little memory.
2017-10-04 21:11:14 +02:00
5b7d6ea54b Code refactor: add WorkTile struct for passing work to kernel.
This makes sharing some code between mega/split in following commits a bit
easier, and also paves the way for rendering multiple tiles later.
2017-10-04 21:11:14 +02:00
07ec0effb6 Code cleanup: simplify kernel side work stealing code. 2017-09-21 22:29:18 +02:00
37d9e65ddf Code cleanup: abstract shadow catcher logic more into accumulation code. 2017-09-13 15:24:14 +02:00
049932c4c3 Fix panorama render crash with split kernel, due to incorrect buffer pointer.
Also some refactoring to clarify variable usage scope.
2017-08-22 00:41:07 +02:00
cfa8b762e2 Code cleanup: move rng into path state.
Also pass by value and don't write back now that it is just a hash for seeding
and no longer an LCG state. Together this makes CUDA a tiny bit faster in my
tests, but mainly simplifies code.
2017-08-19 18:14:16 +02:00
dc7fcebb33 Code cleanup: make L_transparent part of PathRadiance. 2017-08-13 01:19:07 +02:00
7542282c06 Code cleanup: make DebugData part of PathRadiance. 2017-08-13 01:19:07 +02:00
601f94a3c2 Code cleanup: remove unused Cycles random number code. 2017-08-12 20:40:38 +02:00
8e655446d1 Fix T51537: Light passes are summed twice for split kernel since denoise commit
Denoise commit introduced kernel_write_result() which saves light passes, so
no need to call both kernel_write_result() and kernel_write_light_passes() from
the split kernel.

Weirdly enough. kernel_write_result() does not take care about debug passes.
2017-05-19 12:14:03 +02:00
43b374e8c5 Cycles: Implement denoising option for reducing noise in the rendered image
This commit contains the first part of the new Cycles denoising option,
which filters the resulting image using information gathered during rendering
to get rid of noise while preserving visual features as well as possible.

To use the option, enable it in the render layer options. The default settings
fit a wide range of scenes, but the user can tweak individual settings to
control the tradeoff between a noise-free image, image details, and calculation
time.

Note that the denoiser may still change in the future and that some features
are not implemented yet. The most important missing feature is animation
denoising, which uses information from multiple frames at once to produce a
flicker-free and smoother result. These features will be added in the future.

Finally, thanks to all the people who supported this project:

- Google (through the GSoC) and Theory Studios for sponsoring the development
- The authors of the papers I used for implementing the denoiser (more details
  on them will be included in the technical docs)
- The other Cycles devs for feedback on the code, especially Sergey for
  mentoring the GSoC project and Brecht for the code review!
- And of course the users who helped with testing, reported bugs and things
  that could and/or should work better!
2017-05-07 14:40:58 +02:00
Hristo Gueorguiev
e07ffcbd1c Cycles: Add OpenCL support for shadow catcher feature
The title says it all actually.
2017-03-27 10:46:59 +02:00
Hristo Gueorguiev
8ada7f7397 Cycles: Remove ccl_addr_space from RNG passed to functions
Simplifies code quite a bit, making it shorter and easier to extend.
Currently no functional changes for users, but is required for the
upcoming work of shadow catcher support with OpenCL.
2017-03-27 10:46:28 +02:00
1cad64900e Cycles: Define ccl_local variables in kernel functions
Declaring ccl_local in a device function is not supported
by certain compilers.
2017-03-16 11:27:17 +01:00
76acaefdd7 Cycles: Cleanup, wipe obviously outdated parts of split kernel comments 2017-03-13 17:16:16 +01:00
Hristo Gueorguiev
57e26627c4 Cycles: SSS and Volume rendering in split kernel
Decoupled ray marching is not supported yet.

Transparent shadows are always enabled for volume rendering.

Changes in kernel/bvh and kernel/geom are from Sergey.
This simiplifies code significantly, and prepares it for
record-all transparent shadow function in split kernel.
2017-03-09 17:09:37 +01:00