The main goals of this change is faster starting when using foreground
rendering.
This patch will build kernels in parallel to the update process of
the scene. When these optimized kernels are not available (yet) an AO
kernel will be used.
These AO kernels are fast to compile (3-7 seconds) and can be
reused by all scenes. When the final kernels become available we
will switch to these kernels.
In background mode the AO kernels will not be used.
Some kernels are being used during Scene update (displace, background
light). When these kernels are being used the process can halt until
these become available.
Reviewed By: brecht, #cycles
Maniphest Tasks: T61752
Differential Revision: https://developer.blender.org/D4428
Part of the cleanup of the OpenCL codebase.
Single program is not effective when using OpenCL, it is slower
to compile and slower during rendering (when used in for example
`barbershop` or `victor`).
Reviewers: brecht, #cycles
Maniphest Tasks: T62267
Differential Revision: https://developer.blender.org/D4481
Float2 are now a new type for attributes in Cycles. Before, the choices
for attribute storage were float and float3, the latter padded to
float4. This meant that UV maps were inflated to twice the size
necessary.
Reviewers: brecht, sergey
Reviewed By: brecht
Subscribers: #cycles
Tags: #cycles
Differential Revision: https://developer.blender.org/D4409
Using OpenCL MegaKernel has been slow and therefore not usefull.
This patch will remove the mega kernel from the OpenCL codebase
and the OpenCLDeviceBase class.
T61736: removal of mega kernel
T61703: baking does not work with mega kernel
Tags: #cycles
Differential Revision: https://developer.blender.org/D4383
This patch implements a workaround to get the multithreaded compilation from D2231 working.
So far, it only works for Blender, not for Cycles Standalone. Also, I have only tested the Linux codepath in the helper function.
Depends on D2231.
Patch by lukasstockner97, jbakker, brecht
job | scene_name | compilation_time
----------+-----------------+------------------
Baseline | empty | 22.73
D2264 | empty | 13.94
Baseline | bmw | 56.44
D2264 | bmw | 41.32
Baseline | fishycat | 59.50
D2264 | fishycat | 45.19
Baseline | barbershop | 212.28
D2264 | barbershop | 169.81
Baseline | victor | 67.51
D2264 | victor | 53.60
Baseline | classroom | 51.46
D2264 | classroom | 39.02
Baseline | koro | 62.48
D2264 | koro | 49.03
Baseline | pavillion | 54.37
D2264 | pavillion | 38.82
Baseline | splash279 | 47.43
D2264 | splash279 | 37.94
Baseline | volume_emission | 145.22
D2264 | volume_emission | 121.10
This patch reduced compilation time as the split kernels and base
kernels are compiled in parallel. In cycles debug mode (256) you can set
unmark the opencl single program file, what reduces the compilation time
even further (bmw 17 seconds, barbershop 53 seconds).
Reviewers: brecht, dingto, sergey, juicyfruit, lukasstockner97
Reviewed By: brecht
Subscribers: Loner, jbakker, candreacchio, 3dLuver, LazyDodo, bliblubli
Differential Revision: https://developer.blender.org/D2264
This patch implements a workaround to get the multithreaded compilation from D2231 working.
So far, it only works for Blender, not for Cycles Standalone. Also, I have only tested the Linux codepath in the helper function.
Depends on D2231.
Reviewers: brecht, dingto, sergey, juicyfruit, lukasstockner97
Reviewed By: brecht
Subscribers: Loner, jbakker, candreacchio, 3dLuver, LazyDodo, bliblubli
Differential Revision: https://developer.blender.org/D2264
For OIIO 2.x we must use unique_ptr. This also required updating the
guarded allocator for std::move to work. Since C++11 construct/destroy
have a default implementation that also works this case, so we just
leave it out.
This adds a cycles.denoise_animation operator, which denoises an animation
sequence or individual file. Renders must be saved as multilayer EXR files
with denoising data passes.
By default file path and frame range come from the current scene, and EXR
files are denoised in-place. Alternatively, a different input and/or output
file path can be provided.
Denoising settings come from the current view layer. Renders can be denoised
again with different settings, as the original noisy image is preserved along
with other passes and metadata.
There is no user interface yet for this feature, that comes later.
Code by Lukas with modifications by Brecht. This feature was originally
developed for Tangent Animation, thanks for the support!
Differential Revision: https://developer.blender.org/D3889
This adds a cycles.denoise_animation operator, which denoises an animation
sequence or individual file. Renders must be saved as multilayer EXR files
with denoising data passes.
By default file path and frame range come from the current scene, and EXR
files are denoised in-place. Alternatively, a different input and/or output
file path can be provided.
Denoising settings come from the current view layer. Renders can be denoised
again with different settings, as the original noisy image is preserved along
with other passes and metadata.
There is no user interface yet for this feature, that comes later.
Code by Lukas with modifications by Brecht. This feature was originally
developed for Tangent Animation, thanks for the support!
BF-admins agree to remove header information that isn't useful,
to reduce noise.
- BEGIN/END license blocks
Developers should add non license comments as separate comment blocks.
No need for separator text.
- Contributors
This is often invalid, outdated or misleading
especially when splitting files.
It's more useful to git-blame to find out who has developed the code.
See P901 for script to perform these edits.
BF-admins agree to remove header information that isn't useful,
to reduce noise.
- BEGIN/END license blocks
Developers should add non license comments as separate comment blocks.
No need for separator text.
- Contributors
This is often invalid, outdated or misleading
especially when splitting files.
It's more useful to git-blame to find out who has developed the code.
See P901 for script to perform these edits.
This change brings back old original logic which was checking
whether worker threads do fit into an active CPU group. But
it does it a bit smarter now and is also checking affinity
within that group. This way Cycles will use all threads on a
Threadripper2 CPU if it's set to automatic number of threads,
but on another hand will not change affinity if user requested
16 threads and changed Blender affinity.
Tweaked scheduling so it survives this situation by scattering
"extra" threads uniformly over all the NUMA nodes.
There are still tweaks possible to make some specific hardware
configurations work better.
The issue was introduced by a Threadripper2 commit back in
ce927e15e0. This boils down to threads inheriting affinity
from the parent thread. It is a question how this slipped
through the review (we definitely run benchmark round).
Quick fix could have been to always set CPU group affinity
in Cycles, and it would work for Windows. On other platforms
we did not have CPU groups API finished.
Ended up making Cycles aware of NUMA topology, so now we
bound threads to a specific NUMA node. This required adding
an external dependency to Cycles, but made some code there
shorter.
There are some changes in API of OpenImageIO, but those are quite
simple to keep working with older and newer library versions.
Reviewers: brecht
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D4064