Partially revert 41216d5ad4
Some of this code had comments to be left as is for readability,
or comment the code should be kept.
Other functions were only for debugging.
Terms get/set don't make much sense when casting values.
Name macros so the conversion is obvious,
use common prefix for easier completion.
- GET_INT_FROM_POINTER -> POINTER_AS_INT
- SET_INT_IN_POINTER -> POINTER_FROM_INT
- GET_UINT_FROM_POINTER -> POINTER_AS_UINT
- SET_UINT_IN_POINTER -> POINTER_FROM_UINT
This merges changes in internals, runtime-only of existing custom
normals code, which make sense as of themselves, and will make diff of
soc branch easier/lighter to review.
In the details, it mostly changes two things:
* Now, smooth fans (aka MLoopNorSpaceArray) can store either loop
indices, or pointers to BMLoop themselves. This makes sense since in
BMesh, it's relatively easy to get index from a BMElement, but nearly
impracticable to go the other way around.
* First change enforces another, now we cannot rely anymore on `loops`
being NULL in MLoopNorSpace to detect single-loop fans, so we instead
store that info in a new flag.
Again, these are expected to be totally non-functional changes.
When you were using autosmooth to generate some custom normals, and
created empty custom loop normal data, you would go back to an 'all
smooth' shading, cancelling some sharp edges generated by the mesh's
smooth threshold.
Now we will first tag such edges as sharp, such that shading remains the
same. This is not crucial in current master, but it is for clnors
editing gsoc branch!
Previously quads always split along first-third vertices.
This is still the default, to avoid flickering with animated deformation
however concave quads that would create two opposing triangles now use
second-fourth split.
Reported as T53999 although this issue has been known limitation
for a long time.
Solves these security issues from T52924:
CVE-2017-12081
CVE-2017-12082
CVE-2017-12086
CVE-2017-12099
CVE-2017-12100
CVE-2017-12101
CVE-2017-12105
While the specific overflow issue may be fixed, loading the repro .blend
files may still crash because they are incomplete and corrupt. The way
they crash may be impossible to exploit, but this is difficult to prove.
Differential Revision: https://developer.blender.org/D3002
Now all the fine-tuning is happening using parallel range settings structure,
which avoid passing long lists of arguments, allows extend fine-tuning further,
avoid having lots of various functions which basically does the same thing.
We tried to do as much as possible in a single threaded callback, which
lead to using some nasty tricks like fake atomic-based spinlocks to
perform some operations (like float addition, which has no atomic
intrinsics).
While OK with 'standard' low number of working threads (8-16), because
collision were rather rare and implied memory barrier not *that* much
overhead, this performed poorly with more powerful systems reaching the
100 of threads and beyond (like workstations or render farm hardware).
There, both memory barrier overhead and more frequent collisions would
have significant impact on performances.
This was addressed by splitting further the process, we now have three
loops, one over polys, loops and vertices, and we added an intermediate
storage for weighted loop normals. This allows to avoid completely any
atomic operation in body of threaded loops, which should fix scalability
issues. This costs us slightly higher temp memory usage (something like
50Mb per million of polygons on average), but looks like acceptable
tradeoff.
Further more, tests showed that we could gain an additional ~7% of speed
in computing normals of heavy meshes, by also parallelizing the last two
loops (might be 1 or 2% on overall mesh update at best...).
Note that further tweaking in this code should be possible once Sergey
adds the 'minimum batch size' option to threaded foreach API, since very
light loops like the one on loops (mere v3 addition) require much bigger
batches than heavier code (like the one on polys) to keep optimal
performances.