Commit Graph

182 Commits

Author SHA1 Message Date
91c44fe885 Cycles: disable NanoVDB for AMD OpenCL
It is causing issue with AMD OpenCL drivers, due to a potential driver bug.

Ref T84461
2021-03-30 00:00:17 +02:00
f4f8b6dde3 Cycles: Change device-only memory to actually only allocate on the device
This patch changes the `MEM_DEVICE_ONLY` type to only allocate on the device and fail if
that is not possible anymore because out-of-memory (since OptiX acceleration structures may
not be allocated in host memory). It also fixes high peak memory usage during OptiX
acceleration structure building.

Reviewed By: brecht

Maniphest Tasks: T85985

Differential Revision: https://developer.blender.org/D10535
2021-03-11 14:12:35 +01:00
3732508c64 Fix T84745: build error with TBB 2021
task_group::is_canceling() was removed.
2021-01-15 17:29:36 +01:00
Lukas Stockner
688e5c6d38 Fix T82351: Cycles: Tile stealing glitches with adaptive sampling
In my testing this works, but it requires me to remove the min(start_sample...) part in the
adaptive sampling kernel, and I assume there's a reason why it was there?

Reviewed By: brecht

Maniphest Tasks: T82351

Differential Revision: https://developer.blender.org/D9445
2021-01-11 21:04:49 +01:00
bfb6fce659 Cycles: Add CPU+GPU rendering support with OptiX
Adds support for building multiple BVH types in order to support using both CPU and OptiX
devices for rendering simultaneously. Primitive packing for Embree and OptiX is now
standalone, so it only needs to be run once and can be shared between the two. Additionally,
BVH building was made a device call, so that each device backend can decide how to
perform the building. The multi-device for instance creates a special multi-BVH that holds
references to several sub-BVHs, one for each sub-device.

Reviewed By: brecht, kevindietrich

Differential Revision: https://developer.blender.org/D9718
2020-12-11 13:24:29 +01:00
517ff40b12 Cycles: Implement tile stealing to improve CPU+GPU rendering performance
While Cycles already supports using both CPU and GPU at the same time, there
currently is a large problem with it: Since the CPU grabs one tile per thread,
at the end of the render the GPU runs out of new work but the CPU still needs
quite some time to finish its current times.

Having smaller tiles helps somewhat, but especially OpenCL rendering tends to
lose performance with smaller tiles.

Therefore, this commit adds support for tile stealing: When a GPU device runs
out of new tiles, it can signal the CPU to release one of its tiles.
This way, at the end of the render, the GPU quickly finishes the remaining
tiles instead of having to wait for the CPU.

Thanks to AMD for sponsoring this work!

Differential Revision: https://developer.blender.org/D9324
2020-10-31 01:57:39 +01:00
3e3e42cbcb Cycles: Cleanup, mark overridden virtual methods as such
Solves strict Clang warnings reported on Linux.
2020-09-04 12:37:25 +02:00
Stefan Werner
009971ba7a Cycles: Separate Embree device for each CPU Device.
Before, Cycles was using a shared Embree device across all instances.
This could result in crashes when viewport rendering and material
preview were using Cycles simultaneously.

Fixes issue T80042

Maniphest Tasks: T80042

Differential Revision: https://developer.blender.org/D8772
2020-09-01 21:00:55 +02:00
5338b36fcc Cleanup: spelling 2020-07-14 15:19:52 +10:00
6e74a8b69f Fix T78881: Cycles OpenImageDenoise not using albedo and normal correctly
Properly normalize buffers now. Also expose option to not use albedo and normal
just like OptiX.
2020-07-13 19:38:49 +02:00
3dc0178390 Fix T78662: Cycles baking fails if denoising is enabled, after recent changes
This is not supported yet.
2020-07-10 20:08:46 +02:00
6fbacd6048 Fix build error building without OpenImageDenoise 2020-07-10 19:56:53 +02:00
6eeb32706a Cycles: support OpenImageDenoise in final renders
Performance is not great currently due to the API not seeming to support
efficient denoising of multiple tiles at the same time. So in many cases
only one or a few threads will actually be denoising at the same time.

In renders with many samples this is not a big problem, but for faster
renders it's a signficant overhead.

We should try to optimize this still, possibly by batching denoising of
a bigger neighborhood of multiple tiles at once.
2020-07-10 17:10:05 +02:00
93791381fe Cleanup: reduce hardcoded numbers in denoising neighbor tiles code 2020-07-10 17:10:05 +02:00
669befdfbe Cycles: add Intel OpenImageDenoise support for viewport denoising
Compared to Optix denoise, this is usually slower since there is no GPU
acceleration. Some optimizations may still be possible, in avoid copies
to the GPU and/or denoising less often.

The main thing is that this adds viewport denoising support for computers
without an NVIDIA GPU (as long as the CPU supports SSE 4.1, which is nearly
all of them).

Ref T76259
2020-06-24 15:17:36 +02:00
0a3bde6300 Cycles: add denoising settings to the render properties
Enabling render and viewport denoising is now both done from the render
properties. View layers still can individually be enabled/disabled for
denoising and have their own denoising parameters.

Note that the denoising engine also affects how denoising data passes are
output even if no denoising happens on the render itself, to make the passes
compatible with the engine.

This includes internal refactoring for how denoising parameters are passed
along, trying to avoid code duplication and unclear naming.

Ref T76259
2020-06-24 15:17:36 +02:00
d1ef5146d7 Cycles: remove SIMD BVH optimizations, to be replaced by Embree
Ref T73778

Depends on D8011

Maniphest Tasks: T73778

Differential Revision: https://developer.blender.org/D8012
2020-06-22 13:28:01 +02:00
54e3487c9e Cleanup: remove task pool stop() and finished() 2020-06-22 13:06:47 +02:00
b10b7cdb43 Cleanup: use lambdas instead of functors for task pools, remove threadid 2020-06-22 13:06:47 +02:00
ace3268482 Cleanup: minor refactoring around DeviceTask 2020-06-22 13:06:47 +02:00
d9773edaa3 Cycles: code refactor to bake using regular render session and tiles
There should be no user visible change from this, except that tile size
now affects performance. The goal here is to simplify bake denoising in
D3099, letting it reuse more denoising tiles and pass code.

A lot of code is now shared with regular rendering, with the two main
differences being that we read some render result passes from the bake API
when starting to render a tile, and call the bake kernel instead of the
path trace kernel.

With this kind of design where Cycles asks for tiles from the bake API,
it should eventually be easier to reduce memory usage, show tiles as
they are baked, or bake multiple passes at once, though there's still
quite some work needed for that.

Reviewers: #cycles

Subscribers: monio, wmatyjewicz, lukasstockner97, michaelknubben

Differential Revision: https://developer.blender.org/D3108
2020-05-15 20:25:24 +02:00
53981c7fb6 Cleanup: refactor adaptive sampling to more easily change some parameters
No functional changes yet, this is work towards making CPU and GPU results
match more closely.
2020-04-07 20:29:48 +02:00
26bea849cf Cleanup: add device_texture for images, distinct from other global memory
There was too much image texture specific stuff in device_memory, and too
much code duplication between devices.
2020-03-12 17:28:55 +01:00
f01bc597a8 Cleanup: stop encoding image data type in slot index
This is legacy code from when we had a fixed number of textures.
2020-03-11 17:07:17 +01:00
dcdcc23488 Fix T74504: Cycles wrong progress bar with CPU adaptive sampling 2020-03-06 23:46:58 +01:00
c60366c01f Cycles: cleanup warning 2020-03-06 15:27:50 +01:00
c8ac760c59 Cleanup: tweak Cycles #includes in preparation for clang-format sorting 2020-03-06 14:44:42 +01:00
Stefan Werner
51e898324d Adaptive Sampling for Cycles.
This feature takes some inspiration from
"RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and
"A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination"

The basic principle is as follows:
While samples are being added to a pixel, the adaptive sampler writes half
of the samples to a separate buffer. This gives it two separate estimates
of the same pixel, and by comparing their difference it estimates convergence.
Once convergence drops below a given threshold, the pixel is considered done.

When a pixel has not converged yet and needs more samples than the minimum,
its immediate neighbors are also set to take more samples. This is done in order
to more reliably detect sharp features such as caustics. A 3x3 box filter that
is run periodically over the tile buffer is used for that purpose.

After a tile has finished rendering, the values of all passes are scaled as if
they were rendered with the full number of samples. This way, any code operating
on these buffers, for example the denoiser, does not need to be changed for
per-pixel sample counts.

Reviewed By: brecht, #cycles

Differential Revision: https://developer.blender.org/D4686
2020-03-05 12:21:38 +01:00
af54bbd61c Cycles: Rework tile scheduling for denoising
This fixes denoising being delayed until after all rendering has finished. Instead, tile-based
denoising is now part of the "RENDER" task again, so that it is all in one task and does not
cause issues with dedicated task pools where tasks are serialized.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D6940
2020-02-28 16:12:29 +01:00
Ray molenkamp
9339dc6dd1 Fix T70685: Cycles crash using WITH_CYCLES_NATIVE_ONLY on Windows
MSVC does not have -march=native, so the kernel gets built without AVX2 and
BVH8 support. The code assumed it to be available and crashed

Differential Revision: https://developer.blender.org/D6082
2020-02-14 13:55:11 +01:00
38589de10c Cycles: Add support for denoising in the viewport
The OptiX denoiser can be a great help when rendering in the viewport, since it is really fast
and needs few samples to produce convincing results. This patch therefore adds support for
using any Cycles denoiser in the viewport also (but only the OptiX one is selectable because
the NLM one is too slow to be usable currently). It also adds support for denoising on a
different device than rendering (so one can e.g. render with the CPU but denoise with OptiX).

Reviewed By: #cycles, brecht

Differential Revision: https://developer.blender.org/D6554
2020-02-11 18:03:43 +01:00
1d1ef2e797 Revert part of "GPencil: Invert Paste operator and make Paste to Active default"
This commit accidentally undid a bunch of previous commits. Only the intended
changes are left now.
2019-09-23 11:09:00 +02:00
51d9f56f87 GPencil: Invert Paste operator and make Paste to Active default
Before there were two options: Paste to original layer called "Paste" and Paste to active   layer called "Paste & Merge"

Now, by default the paste is in active layer and the "Paste & Merge" has been renamed "Paste".

For old "Paste", now is called "Paste by Layer" and it's not the default value anymore.

Note: Minor edits to add icons not present in Differential revision.

Differential Revision: https://developer.blender.org/D5591
2019-08-26 15:49:16 +02:00
b9f61eb874 Cycles: Fix Architecture logging on x64.
x64 builds with WITH_CYCLES_OPTIMIZED_KERNEL_SSE2 not defined
since SSE2 is the lower bar for x64 cpus. Turning the architecture
logging related if into the last if in the architecture detection
chain, which will never execute unless you turn off all kernels
in de debug flags.

Reviewers: brecht

Differential Revision: https://developer.blender.org/D5579
2019-08-26 07:22:44 -06:00
7ad21c3876 Fix T66604: Cycles bake crash on specific scene with volume
The issue was caused by un-initialized local storage for volume
intersection hits which are supposed to be stored in per-thread
KernelGlobals.

Fix is to make thread_shader() be the same as thread_render() in
respect of KernelGlobals.

Reviewers: brecht

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D5230
2019-07-11 15:44:09 +02:00
fced0f0437 Cycles: Don't advertise BVH8 being supported on 32bit platforms
The kernel does not use AVX2 vectorization, and trying to use BVH8 was
leading to an empty scenes.

Fixes T64624: Ctest : Win32 + AVX2 fails virtually all cycles tests
2019-05-16 11:51:25 +02:00
b63ffa8919 Fix Cycles build error after recent changes
We need to do aligned alloc of the services instead of globals now since the
concurrent map moved there.
2019-05-14 15:06:23 +02:00
a5c89574a3 Fix Cycles assert on exit after recent changes 2019-05-03 18:04:47 +02:00
fadb6f3466 Cleanup: refactor Cycles OSL texture handling
This adds our own OSL texture handle, that has info for OIIO textures or our
own custom texture types. A filename to handle hash map is used for lookups.
This is efficient because it happens at OSL compile time, because the optimizer
can figure out constant strings and replace them with texture handles.
2019-05-03 15:36:20 +02:00
e12c08e8d1 ClangFormat: apply to source, most of intern
Apply clang format as proposed in T53211.

For details on usage and instructions for migrating branches
without conflicts, see:

https://wiki.blender.org/wiki/Tools/ClangFormat
2019-04-17 06:21:24 +02:00
e691929686 Merge branch 'blender2.7' 2019-03-17 12:54:19 +01:00
e17f7af0ce Cleanup: remove Cycles advanced shading features toggle.
It's effectively always enabled, only not on some unsupported OpenCL devices.
For testing those it's not useful to disable these features. This is replaced
by the more fine grained feature toggles that we have now.
2019-03-17 01:58:39 +01:00
ebcea3029d Merge branch 'blender2.7' 2019-03-06 13:45:21 +01:00
f08191a459 Fix Cycles build error on non-x86 processors. 2019-03-06 13:37:06 +01:00
e21ae0bb26 Merge branch 'blender2.7' 2019-02-06 15:22:53 +01:00
fccf506ed7 Cycles: animation denoising support in the kernel.
This is the internal implementation, not available from the API or
interface yet. The algorithm takes into account past and future frames,
both to get more coherent animation and reduce noise.

Ref D3889.
2019-02-06 15:18:42 +01:00
405cacd4cd Cycles: prefilter feature passes separate from denoising.
Prefiltering of feature passes will happen during rendering, which can
then be used for denoising immediately or written as a render pass for
later (animation) denoising.

The number of denoising data passes written is reduced because of this,
leaving out the feature variance passes. The passes are now Normal,
Albedo, Depth, Shadowing, Variance and Intensity.

Ref D3889.
2019-02-06 15:18:29 +01:00
Stefan Werner
5e121c8eab Cycles: Fixed uninitialized memory
Cryptomatte on CPU with accurate mode was hitting uninitialized variables.
This is now explicitly initializing them to NULL.
2019-01-18 15:17:21 +01:00
a8b8da5567 Fix T58183: crash with CPU + GPU rendering after profiling changes.
Multi-device was not passing along profiler to the CPU.
2018-11-29 23:43:27 +01:00
7fa6f72084 Cycles: Add sample-based runtime profiler that measures time spent in various parts of the CPU kernel
This commit adds a sample-based profiler that runs during CPU rendering and collects statistics on time spent in different parts of the kernel (ray intersection, shader evaluation etc.) as well as time spent per material and object.

The results are currently not exposed in the user interface or per Python yet, to see the stats on the console pass the "--cycles-print-stats" argument to Cycles (e.g. "./blender -- --cycles-print-stats").

Unfortunately, there is no clear way to extend this functionality to CUDA or OpenCL, so it is CPU-only for now.

Reviewers: brecht, sergey, swerner

Reviewed By: brecht, swerner

Differential Revision: https://developer.blender.org/D3892
2018-11-29 02:45:24 +01:00