In that case it can now fall back to CPU memory, at the cost of reduced
performance. For scenes that fit in GPU memory, this commit should not
cause any noticeable slowdowns.
We don't use all physical system RAM, since that can cause OS instability.
We leave at least half of system RAM or 4GB to other software, whichever
is smaller.
For image textures in host memory, performance was maybe 20-30% slower
in our tests (although this is highly hardware and scene dependent). Once
other type of data doesn't fit on the GPU, performance can be e.g. 10x
slower, and at that point it's probably better to just render on the CPU.
Differential Revision: https://developer.blender.org/D2056
SVM nodes need to read all data to get the right offset for the following node.
This is quite weak, a more generic solution would be good in the future.
This allows a duplicator (as known as dupli parent) to be in a visible
collection so its duplicated objects are visible, however while being
invisible for the final render.
An object that is a particle emitter is also considered a duplicator.
Many thanks for the reviewers for the extense feedback.
Reviewers: sergey, campbellbarton
Differential Revision: https://developer.blender.org/D2966
We can not store pointers to elements of collection property in the
case we modify that collection. This is like storing pointers to
elements of array before calling realloc().
Our own implementation was behaving different comparing to OSL and GPU,
namely on the border pixels OSL and CUDA was doing interpolation with
black, but we were clamping coordinate.
This partially fixes issue reported in T53452.
Similar change should also be done for 3D interpolation perhaps, but this
is to be investigated separately.
Previously, the NLM kernels would be launched once per offset with one thread per pixel.
However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown.
Therefore, the kernels are now launched in a single call that handles all offsets at once.
This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory.
On the other hand, of course, the smaller tiles significantly reduce the size of the memory.
The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum.
I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere.
To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now.
Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.
The RenderResult struct still has a listbase of RenderLayer, but that's ok
since this is strictly for rendering.
* Subversion bump (to 2.80.2)
* DNA low level doversion (renames) - only for .blend created since 2.80 started
Note: We can't use DNA_struct_elem_find or get file version in init_structDNA,
so we are manually iterating over the array of the SDNA elements instead.
Note 2: This doversion change with renames can be reverted in a few months. But
so far it's required for 2.8 files created between October 2016 and now.
Reviewers: campbellbarton, sergey
Differential Revision: https://developer.blender.org/D2927
This patch moves all the functionality previously in SceneRenderLayer to SceneLayer.
If we want to rename some of these structs now would be a good time to do it, before they are in SceneLayer.
Everything should be working, though I will test things further tomorrow. Once this is committed depsgraph can get
rid of the workaround added in rna_Main_meshes_new_from_object and finish whatever this patch was preventing from being finished.
This patch also adds a few placeholders for the overrides (samples, ...). These are obviously not working, so some unittests that rely on 'lay', and 'zmask' will fail.
This patch does not addressed the change of moving samples to ViewRender (I have this as a separate patch and needs some separate discussion).
Following next is the individual note of the individual parts that were committed.
Note 1: It is up to Cycles to still get rid of exclude_layer internally.
Note 2: Cycles still need to handle its own doversion for the use_layer_samples cases and
(1) Remove the override as it is
(2) Add a new override (scene.cycles.samples) if scene.cycles.use_layer_samples != IGNORE
Respecting the expected behaviour when scene.cycles.use_layer_samples == BOUNDED.
Note 3: Cycles still need to implement the per-object holdout
(similar to how we do shadow catcher).
Note 4: There are parts of the old (Blender Internal) rendering pipeline that is still
using lay, e.g., in shi->lay.
Honestly it will be easier to purge the entire Blender Internal code away instead of taking things from it bit by bit.
Reviewers: sergey, campbellbarton, brecht
Differential Revision: https://developer.blender.org/D2919
There are parts of the old (Blender Internal) rendering pipeline that is still
using lay, e.g., in shi->lay.
Honestly it will be easier to purge the entire Blender Internal code away instead
of taking things from it bit by bit.
Note: Cycles still need to handle its own doversion for theses cases and
(1) Remove the override as it is
(2) Add a new override (scene.cycles.samples) if scene.cycles.use_layer_samples != IGNORE
Respecting the expected behaviour when scene.cycles.use_layer_samples == BOUNDED.
This is tricky since we may want granular polling depending on the setting.
Or an option to pick whether we want the context or the scene to drive the
panels to prevent too many panels when mixing Eevee and Cycles for example.