Goals:
* Better high level control over where devirtualization occurs. There is always
a trade-off between performance and compile-time/binary-size.
* Simplify using array devirtualization.
* Better performance for cases where devirtualization wasn't used before.
Many geometry nodes accept fields as inputs. Internally, that means that the
execution functions have to accept so called "virtual arrays" as inputs. Those
can be e.g. actual arrays, just single values, or lazily computed arrays.
Due to these different possible virtual arrays implementations, access to
individual elements is slower than it would be if everything was just a normal
array (access does through a virtual function call). For more complex execution
functions, this overhead does not matter, but for small functions (like a simple
addition) it very much does. The virtual function call also prevents the compiler
from doing some optimizations (e.g. loop unrolling and inserting simd instructions).
The solution is to "devirtualize" the virtual arrays for small functions where the
overhead is measurable. Essentially, the function is generated many times with
different array types as input. Then there is a run-time dispatch that calls the
best implementation. We have been doing devirtualization in e.g. math nodes
for a long time already. This patch just generalizes the concept and makes it
easier to control. It also makes it easier to investigate the different trade-offs
when it comes to devirtualization.
Nodes that we've optimized using devirtualization before didn't get a speedup.
However, a couple of nodes are using devirtualization now, that didn't before.
Those got a 2-4x speedup in common cases.
* Map Range
* Random Value
* Switch
* Combine XYZ
Differential Revision: https://developer.blender.org/D14628
- Missing star prefix.
- Unnecessary indentation.
- Blank line after dot-points
(otherwise doxygen merges with the previous dot-point).
- Use back-slash for doxygen commands.
- Correct spelling.
This implements two optimizations:
* Reduce virtual function call overhead when a non-standard virtual
array is used as input.
* Use a lambda in `type_conversion.cc`.
In my test setup, which creates a float attribute filled with the index,
the running time drops from `4.0 ms` to `2.0 ms`.
Differential Revision: https://developer.blender.org/D14585
This extracts the inner loops into a separate function.
There are two main reasons for this:
* Allows using `__restrict` to indicate that no other parameter
aliases with the output array. This allows for better optimization.
* Makes it easier to search for the generated assembly code,
especially with the `BLI_NOINLINE`.
This simplifies debugging, and can help improve performance
by making it easier for the compiler.
More optimization might still be possible by using `__restrict` in
a few places.
Use a shorter/simpler license convention, stops the header taking so
much space.
Follow the SPDX license specification: https://spdx.org/licenses
- C/C++/objc/objc++
- Python
- Shell Scripts
- CMake, GNUmakefile
While most of the source tree has been included
- `./extern/` was left out.
- `./intern/cycles` & `./intern/atomic` are also excluded because they
use different header conventions.
doc/license/SPDX-license-identifiers.txt has been added to list SPDX all
used identifiers.
See P2788 for the script that automated these edits.
Reviewed By: brecht, mont29, sergey
Ref D14069
Previously, the function names were stored in `std::string` and were often
created dynamically (especially when the function just output a constant).
This resulted in a lot of overhead.
Now the function name is just a `const char *` that should be statically
allocated. This is good enough for the majority of cases. If a multi-function
needs a more dynamic name, it can override the `MultiFunction::debug_name`
method.
In my test file with >400,000 simple math nodes, the execution time improves from
3s to 1s.
Since fields were committed to master, socket inspection did
not work correctly for all socket types anymore. Now the same
functionality as before is back. Furthermore, fields that depend
on some input will now show the inputs in the socket inspection.
I added support for evaluating constant fields more immediately.
This has the benefit that the same constant field is not evaluated
more than once. It also helps with making the field independent
of the multi-functions that it uses. We might still want to change
the ownership handling for the multi-functions of nodes a bit,
but that can be done separately.
Differential Revision: https://developer.blender.org/D12444
This implements the initial core framework for fields and anonymous
attributes (also see T91274).
The new functionality is hidden behind the "Geometry Nodes Fields"
feature flag. When enabled in the user preferences, the following
new nodes become available: `Position`, `Index`, `Normal`,
`Set Position` and `Attribute Capture`.
Socket inspection has not been updated to work with fields yet.
Besides these changes at the user level, this patch contains the
ground work for:
* building and evaluating fields at run-time (`FN_fields.hh`) and
* creating and accessing anonymous attributes on geometry
(`BKE_anonymous_attribute.h`).
For evaluating fields we use a new so called multi-function procedure
(`FN_multi_function_procedure.hh`). It allows composing multi-functions
in arbitrary ways and supports efficient evaluation as is required by
fields. See `FN_multi_function_procedure.hh` for more details on how
this evaluation mechanism can be used.
A new `AttributeIDRef` has been added which allows handling named
and anonymous attributes in the same way in many places.
Hans and I worked on this patch together.
Differential Revision: https://developer.blender.org/D12414
* Reduce code duplication.
* Give methods more standardized names (e.g. `move_to_initialized` -> `move_assign`).
* Support wrapping arbitrary C++ types, even those that e.g. are not copyable.
In some multi-functions (such as a simple add function), the virtual method
call overhead to access array elements adds significant overhead. For these
simple functions it makes sense to generate optimized versions for different
types of virtual arrays. This is done by giving the compiler all the information
it needs to devirtualize virtual arrays.
In my benchmark this speeds up processing a lot of data with small function 2-3x.
This devirtualization should not be done for larger functions, because it increases
compile time and binary size, while providing a negilible performance benefit.
Previously, the signature of a `MultiFunction` was always embedded into the function.
There are two issues with that. First, `MFSignature` is relatively large, because it contains
multiple strings and vectors. Secondly, constructing it can add overhead that should not
be necessary, because often the same signature can be reused.
The solution is to only keep a pointer to a signature in `MultiFunction` that is set during
construction. Child classes are responsible for making sure that the signature lives
long enough. In most cases, the signature is either embedded into the child class or
it is allocated statically (and is only created once).
When a function is executed for many elements (e.g. per point) it is often the case
that some parameters are different for every element and other parameters are
the same (there are some more less common cases). To simplify writing such
functions one can use a "virtual array". This is a data structure that has a value
for every index, but might not be stored as an actual array internally. Instead, it
might be just a single value or is computed on the fly. There are various tradeoffs
involved when using this data structure which are mentioned in `BLI_virtual_array.hh`.
It is called "virtual", because it uses inheritance and virtual methods.
Furthermore, there is a new virtual vector array data structure, which is an array
of vectors. Both these types have corresponding generic variants, which can be used
when the data type is not known at compile time. This is typically the case when
building a somewhat generic execution system. The function system used these virtual
data structures before, but now they are more versatile.
I've done this refactor in preparation for the attribute processor and other features of
geometry nodes. I moved the typed virtual arrays to blenlib, so that they can be used
independent of the function system.
One open question for me is whether all the generic data structures (and `CPPType`)
should be moved to blenlib as well. They are well isolated and don't really contain
any business logic. That can be done later if necessary.
This replaces header include guards with `#pragma once`.
A couple of include guards are not removed yet (e.g. `__RNA_TYPES_H__`),
because they are used in other places.
This patch has been generated by P1561 followed by `make format`.
Differential Revision: https://developer.blender.org/D8466
This updates the usage of integer types in code I wrote according to our new style guides.
Major changes:
* Use signed instead of unsigned integers in many places.
* C++ containers in blenlib use `int64_t` for size and indices now (instead of `uint`).
* Hash values for C++ containers are 64 bit wide now (instead of 32 bit).
I do hope that I broke no builds, but it is quite likely that some compiler reports
slightly different errors. Please let me know when there are any errors. If the fix
is small, feel free to commit it yourself.
I compiled successfully on linux with gcc and on windows.