The idea is to have the main thread and job threads scheduled
on the CPU dies that have direct access to memory (those are
NUMA nodes 0 and 2).
We also do this for the new EPYC CPUs: their NUMA nodes 1 and 3
do have memory access, but only to the higher-range DDR slots. By
preferring nodes 0 and 2 on EPYC, users with partially filled
DDR slots still get fast memory access.
One thing which is not really solved yet is localization of
memory allocation: we do not guarantee that memory is allocated
on the DDR slot closest to the NUMA node, and hope that the
memory manager of the OS acts in our favor.
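A minimal sketch of the scheduling idea, using plain libnuma rather
than the actual wrapper code; the alternation policy between nodes 0
and 2 is an assumption for illustration (and numa_set_preferred is
only a hint, which is why allocation locality remains unsolved):

    /* Sketch only: pin a thread to one of the dies with direct
     * memory access and hint the allocator toward the same node. */
    #include <numa.h>

    static void schedule_on_fast_node(int thread_index)
    {
      if (numa_available() < 0) {
        return;  /* No NUMA support on this system, nothing to do. */
      }
      /* Alternate between the memory-attached dies (nodes 0 and 2). */
      const int node = (thread_index % 2 == 0) ? 0 : 2;
      numa_run_on_node(node);    /* Restrict execution to this node's CPUs. */
      numa_set_preferred(node);  /* Prefer allocations from this node. */
    }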
Second part of the fix: do not attempt to compute normals for
degenerate geometry at all. It is just a waste of time, and risks
weird invalid values causing issues later.
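For illustration, a minimal sketch of the guard, with hypothetical
names and epsilon (not the actual code):

    /* Sketch: refuse to compute a normal for a degenerate triangle
     * instead of normalizing a near-zero cross product into garbage. */
    #include <cmath>

    struct float3 {
      float x, y, z;
    };

    static float3 cross(const float3 &a, const float3 &b)
    {
      return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
    }

    static bool triangle_normal(const float3 &v0, const float3 &v1, const float3 &v2, float3 *r_no)
    {
      const float3 e1 = {v1.x - v0.x, v1.y - v0.y, v1.z - v0.z};
      const float3 e2 = {v2.x - v0.x, v2.y - v0.y, v2.z - v0.z};
      const float3 n = cross(e1, e2);
      const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
      if (len < 1e-8f) {
        return false;  /* Degenerate geometry: skip, leave *r_no untouched. */
      }
      *r_no = {n.x / len, n.y / len, n.z / len};
      return true;
    }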
The goal is to address a performance regression when going from a
few threads to tens of threads. On systems with more than 32 CPU
threads the threaded loop was actually harmful.
The following tweaks are now in place (a sketch follows the list):
- The chunk size is adaptive to the number of threads, which
minimizes scheduling overhead.
- The number of tasks is adaptive to the list size and chunk
size.
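A minimal sketch of the adaptive chunking; the minimum chunk size
and the chunks-per-thread factor are assumed values, not the actual
constants:

    /* Sketch: pick a chunk size that gives every thread several chunks
     * for load balancing, but never so small that scheduling overhead
     * dominates; the task count then follows from list and chunk size. */
    #include <algorithm>
    #include <cstddef>

    static size_t chunk_size_for(size_t list_size, size_t num_threads)
    {
      const size_t min_chunk_size = 32;    /* Assumed floor. */
      const size_t chunks_per_thread = 4;  /* Assumed oversubscription. */
      const size_t chunk = list_size / (num_threads * chunks_per_thread);
      return std::max(chunk, min_chunk_size);
    }

    static size_t num_tasks_for(size_t list_size, size_t chunk_size)
    {
      return (list_size + chunk_size - 1) / chunk_size;  /* Round up. */
    }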
Performance comparison on a production shot:
Number of threads   DEG time before   DEG time after
               44             0.09             0.02
               32             0.055            0.025
               16             0.025            0.025
                8             0.035            0.033
Note that this is turned off by default and must be enabled at build time with the CMake WITH_CYCLES_EMBREE flag.
Embree must be built as a static library with ray masking turned on, the `make deps` scripts have been updated accordingly.
There, Embree is off by default too and must be enabled with the WITH_EMBREE flag.
Using Embree allows for much faster rendering of deformation motion blur while reducing the memory footprint.
TODO: GPU implementation, deduplication of data, leveraging more of Embree's features (e.g. the tessellation cache).
Differential Revision: https://developer.blender.org/D3682
The code this was taken from assumes a 'size_t' result,
which isn't the case here.
In practice the bucket distribution wasn't bad;
even so, the step was effectively a no-op, so best to fix it.
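For illustration only, a hypothetical example of this class of bug
(not the actual code): a bit-folding step written for a 64-bit
size_t silently becomes a no-op when the hash is only 32 bits wide:

    #include <cstdint>

    /* With a 64-bit hash, folding in the high bits improves bucket spread. */
    static uint64_t bucket_64(uint64_t h, uint64_t nbuckets)
    {
      return (h ^ (h >> 32)) % nbuckets;
    }

    /* With a 32-bit hash, (uint64_t)h >> 32 is always 0, so the same
     * fold contributes nothing: a no-op, though the distribution may
     * still happen to be acceptable in practice. */
    static uint32_t bucket_32_broken(uint32_t h, uint32_t nbuckets)
    {
      return (h ^ (uint32_t)((uint64_t)h >> 32)) % nbuckets;
    }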
Looks like we need to explicitly set the i18n language to its default
value on some OSes... unless that 'need to create new translated-name
IDs in versioning code for startup file' situation is really seldom.
Anyway, hopefully that will fix the crash.
This is to solve an issue where a blend file could be compressed
unbeknownst to the artist. This happened in the following situation:
- Artist edits an uncompressed blend file.
- Some script saves a compressed blend file to a separate location.
- When the artist saved the file (s)he was editing (File > Save, or Ctrl+S),
it was silently compressed.
Cherry pick from: cd3b313d5f
This is a bug experienced by animators in the Blender Studio that developers
have been trying to fix for a /long/ time.
What happens is that partial file writing extracts the needed datablocks from
the main list of datablocks into a smaller one. Afterwards they are added back
to the main list, but in some cases not exactly in the same order.
There is file path remapping code that depends on the datablocks being in
exactly the same order as before, and when this was not the case file paths
would get swapped between datablocks.
The reason datablocks are not restored in the same order is because the sorting
of datablocks by name is a) case insensitive and b) undefined if there are
multiple datablocks with the same name from different libraries. This should
be made well defined, but the fix in this commit is simpler.
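As a hypothetical sketch of what a well-defined ordering could look
like (not the actual fix, which the commit notes is simpler):

    /* Sketch: break ties in the case-insensitive name sort, first on
     * the exact name, then on the originating library, so the order is
     * fully determined. */
    #include <cstring>
    #include <functional>
    #include <strings.h>  /* strcasecmp (POSIX). */

    struct ID {
      const char *name;
      const void *lib;  /* Originating library; null for local datablocks. */
    };

    static bool id_order_less(const ID &a, const ID &b)
    {
      const int ci = strcasecmp(a.name, b.name);
      if (ci != 0) {
        return ci < 0;  /* Primary key: case-insensitive name. */
      }
      const int cs = std::strcmp(a.name, b.name);
      if (cs != 0) {
        return cs < 0;  /* Tie-break: exact name. */
      }
      return std::less<const void *>()(a.lib, b.lib);  /* Tie-break: library. */
    }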
The way animators ran into this bug is that they use the Copy Attributes addon
a lot, which has Copy Selection to Buffer as the first item in its menu. In
some cases this would be clicked accidentally when the menu is near the edge
of the window, breaking the library paths, which would only be noticed much
later on file save and reload.
The way this bug was finally tracked down is that it was suspected that the
undo system was the cause, and so Bastien added library validation for undo.
When Hjalti then did undo and noticed the error, he remembered accidentally
clicking Copy Selection to Buffer just before, and we could finally reproduce
the bug.
There are serious suspicions that weird corruptions faced by studio
artists may happen in undo/redo code, so let's see whether that's the
case.
With this, when the --debug-io argument is passed on startup, the whole
library data is checked at every undo. This makes undo slower (two to
three times slower), but it could help us better spot what happens...
At least on Windows, we do not re-run datatoc when the .glsl files change.
Testing is simple: change edit_mesh_overlay_common_lib.glsl
(remove lines, write plain text, ...), then rebuild and enter Edit Mode
with the default cube.
I also had to remove the entry in gpu/CMakeLists.txt for
gpu_shader_material.glsl, since it was both tracked directly and run
through data_to_c_simple (otherwise CMake raises an error about
duplicated entries).
We probably want to do the same for the other datatoc functions.
Reviewers: LazyDodo, brecht
Differential Revision: https://developer.blender.org/D3803