From f51574535b764595130097b8880b664d4159e512 Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sat, 17 Feb 2024 21:37:05 +0100 Subject: [PATCH 01/14] initial pages --- docs/features/core/data_structures/any.md | 1 + docs/features/core/data_structures/bits.md | 3 +++ docs/features/core/data_structures/containers.md | 3 +++ docs/features/core/data_structures/cpp_type.md | 1 + docs/features/core/data_structures/disjoint_set.md | 1 + docs/features/core/data_structures/functions.md | 1 + docs/features/core/data_structures/index.md | 3 +++ docs/features/core/data_structures/index_mask.md | 1 + docs/features/core/data_structures/index_range.md | 1 + docs/features/core/data_structures/spans.md | 1 + docs/features/core/data_structures/strings.md | 3 +++ docs/features/core/data_structures/virtual_array.md | 1 + 12 files changed, 20 insertions(+) create mode 100644 docs/features/core/data_structures/any.md create mode 100644 docs/features/core/data_structures/bits.md create mode 100644 docs/features/core/data_structures/containers.md create mode 100644 docs/features/core/data_structures/cpp_type.md create mode 100644 docs/features/core/data_structures/disjoint_set.md create mode 100644 docs/features/core/data_structures/functions.md create mode 100644 docs/features/core/data_structures/index.md create mode 100644 docs/features/core/data_structures/index_mask.md create mode 100644 docs/features/core/data_structures/index_range.md create mode 100644 docs/features/core/data_structures/spans.md create mode 100644 docs/features/core/data_structures/strings.md create mode 100644 docs/features/core/data_structures/virtual_array.md diff --git a/docs/features/core/data_structures/any.md b/docs/features/core/data_structures/any.md new file mode 100644 index 00000000..868e9bfd --- /dev/null +++ b/docs/features/core/data_structures/any.md @@ -0,0 +1 @@ +# Any diff --git a/docs/features/core/data_structures/bits.md b/docs/features/core/data_structures/bits.md new file mode 100644 index 
00000000..37f2775c --- /dev/null +++ b/docs/features/core/data_structures/bits.md @@ -0,0 +1,3 @@ +# Bits + +`BitVector` diff --git a/docs/features/core/data_structures/containers.md b/docs/features/core/data_structures/containers.md new file mode 100644 index 00000000..d17303fc --- /dev/null +++ b/docs/features/core/data_structures/containers.md @@ -0,0 +1,3 @@ +# Containers + +`Array/Vector/Stack/Set/Map/VectorSet` diff --git a/docs/features/core/data_structures/cpp_type.md b/docs/features/core/data_structures/cpp_type.md new file mode 100644 index 00000000..275e5fb7 --- /dev/null +++ b/docs/features/core/data_structures/cpp_type.md @@ -0,0 +1 @@ +# CPP Type diff --git a/docs/features/core/data_structures/disjoint_set.md b/docs/features/core/data_structures/disjoint_set.md new file mode 100644 index 00000000..9237480f --- /dev/null +++ b/docs/features/core/data_structures/disjoint_set.md @@ -0,0 +1 @@ +# Disjoint Set diff --git a/docs/features/core/data_structures/functions.md b/docs/features/core/data_structures/functions.md new file mode 100644 index 00000000..0c5faf50 --- /dev/null +++ b/docs/features/core/data_structures/functions.md @@ -0,0 +1 @@ +# Functions diff --git a/docs/features/core/data_structures/index.md b/docs/features/core/data_structures/index.md new file mode 100644 index 00000000..7995636b --- /dev/null +++ b/docs/features/core/data_structures/index.md @@ -0,0 +1,3 @@ +# Data Structures + +Core low-level data structures used throughout Blender. The documentation here is meant to give a bird's-eye view of the available data structures and when to use them. More details for all available methods can be found in the source code. 
diff --git a/docs/features/core/data_structures/index_mask.md b/docs/features/core/data_structures/index_mask.md new file mode 100644 index 00000000..b66aaedd --- /dev/null +++ b/docs/features/core/data_structures/index_mask.md @@ -0,0 +1 @@ +# Index Mask diff --git a/docs/features/core/data_structures/index_range.md b/docs/features/core/data_structures/index_range.md new file mode 100644 index 00000000..9aa3d59d --- /dev/null +++ b/docs/features/core/data_structures/index_range.md @@ -0,0 +1 @@ +# Index Range diff --git a/docs/features/core/data_structures/spans.md b/docs/features/core/data_structures/spans.md new file mode 100644 index 00000000..efb4f458 --- /dev/null +++ b/docs/features/core/data_structures/spans.md @@ -0,0 +1 @@ +# Span diff --git a/docs/features/core/data_structures/strings.md b/docs/features/core/data_structures/strings.md new file mode 100644 index 00000000..627838de --- /dev/null +++ b/docs/features/core/data_structures/strings.md @@ -0,0 +1,3 @@ +# Strings + +`std::string`, `StringRef`, `StringRefNull` diff --git a/docs/features/core/data_structures/virtual_array.md b/docs/features/core/data_structures/virtual_array.md new file mode 100644 index 00000000..6b9ab6d1 --- /dev/null +++ b/docs/features/core/data_structures/virtual_array.md @@ -0,0 +1 @@ +# Virtual Arrays -- 2.30.2 From 90a34b159bb97b9aff5b44c5399ebf61d17284da Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sun, 18 Feb 2024 13:11:14 +0100 Subject: [PATCH 02/14] add container docs --- .../core/data_structures/containers.md | 217 +++++++++++++++++- 1 file changed, 216 insertions(+), 1 deletion(-) diff --git a/docs/features/core/data_structures/containers.md b/docs/features/core/data_structures/containers.md index d17303fc..75d1b59e 100644 --- a/docs/features/core/data_structures/containers.md +++ b/docs/features/core/data_structures/containers.md @@ -1,3 +1,218 @@ # Containers -`Array/Vector/Stack/Set/Map/VectorSet` 
+[Container](https://en.wikipedia.org/wiki/Container_(abstract_data_type)) data structures allow storing many elements of the same type. Different structures in this category allow for different access patterns. + +## Vector + +The `blender::Vector` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_vector.hh)) is the most important data structure. It stores values of the given type in a dynamically growing continuous buffer. + +```cpp +/* Create an empty vector. */ +Vector<int> values; + +/* Add an element to the end of the vector. */ +values.append(5); + +/* Access an element at the given index. */ +int value = values[0]; +``` + +## Array + +`blender::Array` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_array.hh)) is very similar to `Vector`. The main difference is that it is not dynamically growing. Instead, its size is usually set only once and stays the same for the rest of its lifetime. It has a slightly lower memory footprint than `Vector`. Using an `Array` instead of `Vector` also indicates that the size is not expected to change. + +Note that this is different from `std::array` for which the size has to be known at compile time. If the size is actually known at compile time, `std::array` should be used instead. + +## Stack + +`blender::Stack` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_stack.hh)) implements a first-in-last-out data structure. It does not allow accessing any element that is not at the top of the stack currently. + +```cpp +/* Construct an empty stack. */ +Stack<int> stack; + +/* Add two elements to the stack. */ +stack.push(5); +stack.push(3); + +/* Look at the element at the top of the stack without removing it. */ +int value = stack.peek(); + +/* Remove the top element of the stack and return it. 
*/ +int value = stack.pop(); +``` + +A `Vector` can also be used as a `Stack` using the `Vector::append`, `Vector::last` and `Vector::pop_last` methods. This is beneficial if one also needs the ability to iterate over all elements that are currently in the stack. If that's not required, it's better to use `Stack` directly because its methods are more purpose-designed and it allows pushing in O(1) (not just amortized, as it does not require reallocating already pushed elements). + +## Set + +A `blender::Set<T>` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_set.hh)) stores values of type `T` while making sure that there are no duplicates. The elements added to a `Set` have no particular order. + +```cpp +/* Construct an empty set. */ +Set<int> values; + +/* Add two elements. */ +values.add(5); +values.add(3); + +/* Adding the same element again has no effect. */ +values.add(5); + +/* Add value that is known not to be in the set already. */ +/* Makes intent more obvious, allows for better asserts and performance. */ +values.add_new(10); + +/* Remove an element. */ +values.remove(5); + +/* Check if an element is contained in the set. */ +bool is_contained = values.contains(3); +``` + +Using `Set` with a custom type requires an equality operator and a [hash](#hashing) function. + +While a `Vector` could also be used to mimic the behavior of a `Set`, it's generally much less efficient at that task. A `Set` uses a hash table internally which allows it to check for duplicates in constant time instead of having to compare the value with every previously added element. + +## Map + +A `blender::Map` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_map.hh)) stores values that can be looked up by a key. Every key can exist at most once. The key-value-pairs have no particular order. + +```cpp +/* Construct an empty map. */ +Map<int, std::string> values; + +/* Add key-value-pair. 
*/ +values.add(2, "two"); + +/* Adding the same key again does *not* affect the map. */ +values.add(2, "TWO"); + +/* Add a new value for the existing key. */ +values.add_overwrite(2, "Two"); + +/* Add a value that is known not to be in the map already. */ +/* Makes intent more obvious, allows for better asserts and performance. */ +values.add_new(3, "three"); + +/* Check if a key exists in the map. */ +bool exists = values.contains(2); + +/* Remove a key-value-pair if it exists. */ +values.remove(2); + +/* Get the value for a given key. */ +/* This fails if the key does not exist. */ +std::string &value = values.lookup(3); + +/* Get the value for a given key and return a default value if it's not in the map. */ +std::string value = values.lookup_default(4, "unknown"); + +/* Look up the value or add a new value if it doesn't exist yet. */ +/* This can be used to implement maps with default-values. */ +/* Always adds an 'a' to the string corresponding to the key 4. */ +values.lookup_or_add(4, "") += "a"; + +/* Iterate over all keys. */ +for (const int key : values.keys()) { /* ... */ } + +/* Iterate over all values. */ +for (const std::string &value : values.values()) { /* ... */ } + +/* Iterate over all key-value-pairs. */ +for (const MapItem<int, std::string> item : values.items()) { + do_something(item.key, item.value); +} +``` + +Using `Map` with a custom type as key requires an equality operator and a [hash](#hashing) function. + +## Vector Set + +A `blender::VectorSet` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_vector_set.hh)) is a combination of a `Vector` and a `Set`. It can't contain duplicate values like a `Set` but the values stored in it are ordered based on insertion order (until elements are removed). Just like in a `Vector`, the values are also stored in a continuous array which makes it easy to pass them to other functions that expect an array. + +```cpp +/* Construct empty vector-set. 
*/ +VectorSet<int> values; + +/* Add elements. Duplicate values are ignored. */ +values.add(5); +values.add(5); +values.add(6); + +/* Check if value is contained. */ +bool is_contained = values.contains(5); + +/* Get index of value. */ +int index_5 = values.index_of(5); /* = 0 */ +int index_6 = values.index_of(6); /* = 1 */ + +/* Access value by index. */ +int value = values[1]; /* = 6 */ +``` + +Using `VectorSet` with a custom type requires an equality operator and a [hash](#hashing) function. + +## Common Concepts + +These are some concepts that apply to many of the container types. + +### Inline Buffers + +Most container types mentioned above (except `VectorSet` currently) have an inline buffer. For as long as the elements added to the container fit into the inline buffer, no additional allocation is made. This is important because allocations can be a performance bottleneck. + +Inline buffers are enabled by default in supported containers. It's generally recommended to use the default value, but there are cases when the inline buffer size should be set manually. +* When building a compact type which has a container that is usually empty, the inline buffer size could be set to 0. Wrapping the container in a `std::unique_ptr` is also worth considering in this case. +* When working in hot code that requires e.g. a `Vector`, the inline buffer size can be increased to make better use of stack memory and to avoid allocations in the majority of cases. + +The inline buffer size is typically the first template parameter after the type. + +```cpp +/* Construct vector that can hold up to 32 ints without extra memory. */ +Vector<int, 32> vec; + +/* Same as above, but for other container types. */ +Array<int, 32> array; +Stack<int, 32> stack; +Set<int, 32> set; +Map<int, std::string, 32> map; +``` + +Using a larger inline buffer also increases the size of the type: `sizeof(Vector<int, 4>) < sizeof(Vector<int, 32>)`. 
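The effect of an inline buffer can be sketched in standard C++. The block below is a simplified illustration only: `InlineBufferSketch` is a hypothetical type invented for this example, not how `blender::Vector` is implemented, and it assumes `T` is default-constructible and copyable. It shows elements staying inside the object until the inline capacity is exceeded, and the object size growing with that capacity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical illustration type, not part of Blender. */
template<typename T, std::size_t InlineCapacity> class InlineBufferSketch {
 private:
  /* Storage that lives inside the object itself. */
  std::array<T, InlineCapacity> inline_buffer_{};
  /* Separately allocated storage, only used once the inline buffer is full. */
  std::vector<T> heap_buffer_;
  std::size_t size_ = 0;
  bool on_heap_ = false;

 public:
  void append(const T &value)
  {
    if (!on_heap_ && size_ < InlineCapacity) {
      /* Still fits into the object: no allocation happens here. */
      inline_buffer_[size_] = value;
    }
    else {
      if (!on_heap_) {
        /* Move to the heap once, copying the existing elements so they stay contiguous. */
        heap_buffer_.assign(inline_buffer_.begin(), inline_buffer_.begin() + size_);
        on_heap_ = true;
      }
      heap_buffer_.push_back(value);
    }
    size_++;
  }

  const T *data() const
  {
    return on_heap_ ? heap_buffer_.data() : inline_buffer_.data();
  }

  std::size_t size() const
  {
    return size_;
  }

  bool uses_heap() const
  {
    return on_heap_;
  }
};

/* A larger inline capacity makes the object itself larger. */
static_assert(sizeof(InlineBufferSketch<int, 4>) < sizeof(InlineBufferSketch<int, 32>),
              "inline capacity increases the size of the type");
```

With an inline capacity of 4, the first four `append` calls in this sketch do not touch the heap; the fifth one copies the elements into a heap allocation.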
+ +### Hashing + +Using custom types in a data structure that uses a hash table (like `Set`, `Map` and `VectorSet`) requires the implementation of the equality operator (`operator==`) and a `hash` function. + +```cpp +struct MyType { + int x, y; + + /* Return true when both values are considered equal. */ + friend bool operator==(const MyType &a, const MyType &b) { + return /* ... */; + } + + uint64_t hash() const { + return get_default_hash(this->x, this->y); + } +}; +``` + +A potentially more convenient way to implement the equality operator is to use `BLI_STRUCT_EQUALITY_OPERATORS_2`. + +The `hash` function has to return the same value for two instances of the type that are considered equal. In theory, even returning a constant value fulfills that requirement and would be correct. However, a hash function that always returns the same value defeats the purpose of a hash table and degrades performance. + +Simply calling `get_default_hash` on the data members that should impact the hash (which are the same ones which should impact equality) is usually good enough. When designing a custom hash function, it's recommended to put as much variation as possible into the lower bits. The containers use those bits first. However, if the low bits have too many collisions, the higher bits are taken into account automatically as well. + +It's also possible to implement a hash function for a type that methods can't be added to (e.g. because it comes from an external library). This can be done by specializing the `blender::DefaultHash` struct. For more information, see the [source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_hash.hh). + +### Method Overloads and `*_as` Methods + +Many core methods on the container data structures have multiple overloads. For example, `Vector::append` has two versions. One that takes a `const T &` and one that takes a `T &&` parameter. 
Those are used to properly handle the cases when the value is copied or moved into the container. + +Additionally, many methods have a variant with the `_as` suffix. Such methods allow passing in the parameter with a different type than what the container actually contains. For example, `Vector<std::string>::append_as` can also be called with a `const char *` parameter, and not just `std::string`. The `std::string` is then constructed in place. This avoids the need to construct it first and then move it into the right place. This specific example is very similar to `std::vector::emplace_back`. + +However, the `*_as` convention is a bit more general. For example, it allows `Set::add_as` or `Set::contains_as` to be called with a type that is not exactly the one which is stored. This avoids the construction of the stored type in many cases. + +Other libraries sometimes support this as well, without the additional `_as` suffix, but that also leads to more complex error messages in the common cases. -- 2.30.2 From a9e41558a4f515a95054d1cf1bad80fca5e12a9c Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sun, 18 Feb 2024 13:24:39 +0100 Subject: [PATCH 03/14] functions --- .../core/data_structures/functions.md | 46 +++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/docs/features/core/data_structures/functions.md b/docs/features/core/data_structures/functions.md index 0c5faf50..9f4bbc12 100644 --- a/docs/features/core/data_structures/functions.md +++ b/docs/features/core/data_structures/functions.md @@ -1 +1,47 @@ # Functions + +There are many ways to store a reference to a function. This document gives an overview of those and recommends when to use which approach. + +1. Pass a function pointer and user data (as void *) separately: + - The only method that is compatible with C interfaces. + - Is cumbersome to work with in many cases, because one has to keep track of two parameters. + - Not type safe at all, because of the void pointer. 
+ - It requires workarounds when one wants to pass a lambda into a function. +2. Using `std::function`: + - It works well with most callables and is easy to use. + - Owns the callable, so it can be returned from a function more safely than other methods. + - Requires that the callable is copyable. + - Requires an allocation when the callable is too large (typically > 16 bytes). +3. Using a template for the callable type: + - Most efficient solution at runtime, because the compiler knows the exact callable at the place + where it is called. + - Works well with all callables. + - Requires the function to be in a header file. + - It's difficult to constrain the signature of the function. +4. Using `blender::FunctionRef` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_function_ref.hh)): + - Second most efficient solution at runtime. + - It's easy to constrain the signature of the callable. + - Does not require the function to be in a header file. + - Works well with all callables. + - It's a non-owning reference, so it *cannot* be stored safely in general. + +The following diagram helps decide which approach to use when building an API where the user has to pass in a function. 
+ +```mermaid +flowchart TD + is_called_from_c["Is called from C code?"] + is_stored_when_function_ends["Is stored when function ends?"] + is_call_performance_bottleneck["Call is performance bottleneck?"] + use_function_pointer["Use function pointer and user data (approach 1)."] + use_std_function["Use `std::function` (approach 2)."] + use_template["Use `template` (approach 3)."] + use_function_ref["Use `blender::FunctionRef` (approach 4)."] + + + is_called_from_c --"yes"--> use_function_pointer + is_called_from_c --"no"--> is_stored_when_function_ends + is_stored_when_function_ends --"yes"--> use_std_function + is_stored_when_function_ends --"no"--> is_call_performance_bottleneck + is_call_performance_bottleneck --"no"--> use_function_ref + is_call_performance_bottleneck --"yes"--> use_template +``` -- 2.30.2 From da0a219b62033a384db57abc2f1ce66038c81e2d Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sun, 18 Feb 2024 13:41:22 +0100 Subject: [PATCH 04/14] any --- docs/features/core/data_structures/any.md | 57 +++++++++++++++++++++++ 1 file changed, 57 insertions(+) diff --git a/docs/features/core/data_structures/any.md b/docs/features/core/data_structures/any.md index 868e9bfd..7f024c0d 100644 --- a/docs/features/core/data_structures/any.md +++ b/docs/features/core/data_structures/any.md @@ -1 +1,58 @@ # Any + +`blender::Any` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_any.hh)) is a type-safe container for single values of any copy-constructible type. It is similar to `std::any` but provides the following two additional features: +- It has adjustable inline buffer capacity and alignment. `std::any` typically has a small inline buffer but its size is not guaranteed. +- It can store additional user-defined type information without increasing the `sizeof` of the `Any` object. + +If any of those features is required, it's beneficial to use `blender::Any`. Otherwise, using `std::any` is fine as well. 
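For comparison, the `std::any` side of this trade-off can be sketched with the standard library alone. The helper names `describe` and `get_float_or` are made up for this example; only `std::any`, `std::any_cast`, `has_value` and `reset` come from the C++17 standard library.

```cpp
#include <any>
#include <cassert>
#include <string>

/* Hypothetical helper: name the type currently stored in the std::any. */
inline std::string describe(const std::any &value)
{
  if (!value.has_value()) {
    return "empty";
  }
  /* The pointer overload of std::any_cast returns nullptr on a type mismatch. */
  if (std::any_cast<int>(&value) != nullptr) {
    return "int";
  }
  if (std::any_cast<float>(&value) != nullptr) {
    return "float";
  }
  if (std::any_cast<std::string>(&value) != nullptr) {
    return "string";
  }
  return "other";
}

/* Hypothetical helper: return the stored float, or the fallback if another type is stored. */
inline float get_float_or(const std::any &value, const float fallback)
{
  if (const float *ptr = std::any_cast<float>(&value)) {
    return *ptr;
  }
  return fallback;
}
```

Assigning a new value to a `std::any` replaces the previous one, and `reset()` leaves it empty, which matches the overall shape of the `Any` usage shown in this document.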
+ +```cpp +/* Construct empty value. */ +Any<> value; + +/* Check if there is any value inside. */ +bool contains_value = value.has_value(); + +/* Assign values of different types. Each assignment overwrites the previous one. */ +value = 4; +value = "four"; +value = 4.0f; + +/* Check if a specific type is stored. */ +bool is_type = value.is<float>(); + +/* Access the value with a specific type. */ +/* This only works if the stored value actually has the given type. */ +float float_value = value.get<float>(); + +/* Get a pointer to the value without knowing the type. */ +void *value_ptr = value.get(); + +/* Remove the value if there is any. */ +value.reset(); +``` + +## Store Additional Type Information + +One of the features of `blender::Any` is that it can store additional information for each type that is stored. This can be done with fairly low overhead, because `Any` does type erasure and has to store some type-specific information anyway. + +In the example below, the `blender::Any` knows the size of the stored type. + +```cpp +/* Struct that contains all the information that should be stored for the type. */ +struct ExtraSizeInfo { + size_t size; + + /* Function that constructs the extra info for a given type. */ + template<typename T> static constexpr ExtraSizeInfo get() + { + return {sizeof(T)}; + } +}; + +/* Construct the Any with the integer 5 stored in it. */ +Any<ExtraSizeInfo> value = 5; + +/* Access the size of the stored type. 
*/ +int stored_size = value.extra_info().size; /* = 4 */ +``` -- 2.30.2 From f03613581c3f574c35f1849825083278cc8977ec Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sun, 18 Feb 2024 14:28:32 +0100 Subject: [PATCH 05/14] bits --- docs/features/core/data_structures/bits.md | 106 +++++++++++++++++- .../core/data_structures/containers.md | 4 +- 2 files changed, 107 insertions(+), 3 deletions(-) diff --git a/docs/features/core/data_structures/bits.md b/docs/features/core/data_structures/bits.md index 37f2775c..76b5be49 100644 --- a/docs/features/core/data_structures/bits.md +++ b/docs/features/core/data_structures/bits.md @@ -1,3 +1,107 @@ # Bits -`BitVector` +Sometimes it can be beneficial to work with bits directly instead of boolean values because they are very compact and many bits can be processed at the same time. This document shows some available utilities to work with dynamically sized bitsets. + +Before choosing to work with individual bits instead of bools, keep in mind that there are also downsides which may not be obvious at first. +- Writing to separate bits in the same int is not thread-safe. Therefore, an existing vector of + bool can't easily be replaced with a bit vector, if it is written to from multiple threads. + Read-only access from multiple threads is fine though. +- Writing individual elements is more expensive when the array is in cache already. That is + because changing a bit is always a read-modify-write operation on the int the bit resides in. +- Reading individual elements is more expensive when the array is in cache already. That is + because additional bit-wise operations have to be applied after the corresponding int is + read. + +## BitVector + +The most common type to use when working with bits is `blender::BitVector` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_bit_vector.hh)). It's a dynamically growing contiguous array of bits. 
As such, it has similarities to `blender::Vector`, but also `std::vector<bool>`. In contrast to both, `BitVector` has an API that is more optimized for dealing with bits. + +Just like `blender::Vector`, `BitVector` also supports an inline buffer. This is especially beneficial here, because many bits can be stored directly in the `BitVector` without significantly increasing its size. + +```cpp +/* Construct bit vector with 500 bits which are false/0 by default. */ +BitVector<> values(500, false); + +/* Setting a bit. */ +values[10].set(); + +/* Setting a bit to a specific value. */ +values[10].set(true); + +/* Resetting a bit. */ +values[10].reset(); + +/* Check if a bit is set. */ +values[10].test(); + +/* It's also possible to use implicit conversions instead. */ +if (values[10]) { /* ... */ } +``` + +## BitRef + +It's not possible to have a pointer or reference to a specific bit with standard C++. Instead, there are the `BitRef` and `MutableBitRef` types ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_bit_ref.hh)) which can reference individual bits. + +Those are also the types returned when accessing a specific index in `BitVector`. + +## BitSpan + +Just like it's not possible to reference a single bit with standard C++, it's also not possible to reference a span of bits. To do that, one can use `BitSpan` and `MutableBitSpan` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_bit_span.hh)). + +Additionally, there are also `BoundedBitSpan` and `MutableBoundedBitSpan`. Those are like normal bit spans but enforce specific constraints on the alignment of the span. These additional constraints allow bit spans to be processed more efficiently than in the most general case. For more details on the exact constraints, check the `is_bounded_span` function. + +It's generally recommended to work with bit spans that follow these constraints if possible for best performance. 
+ +## BitSpan Operations + +There are three core operations that can be performed on bit spans: +1. Mix multiple bit spans together in some way and store the result in another bit span. +2. Check if any bit is set. +3. Iterate over all set bits. + +`BLI_bit_span_ops.hh` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_bit_span_ops.hh)) offers utilities to do these things. + +```cpp +BitVector<> vec1(500); +BitVector<> vec2(500); +BitVector<> result(500); + +/* Or vec1 into result. */ +result |= vec1; + +/* Invert result bits. */ +bits::invert(result); + +/* Check if two bit spans have common bits set. */ +bits::has_common_set_bits(vec1, vec2); + +/* Iterate over set bits. */ +bits::foreach_1_index(result, [&](const int64_t i) { /* ... */ }); + +/* Perform custom function on bits. */ +bits::mix_into_first_expr([](bits::BitInt result, + bits::BitInt vec1, + bits::BitInt vec2) { return result ^ (vec1 | vec2); }, + result, + vec1, + vec2); +``` + +## BitGroupVector + +`BitGroupVector` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_bit_group_vector.hh)) allows storing a fixed number of bits for each element. For example, this could be used to store a bit for each attribute for each vertex in a mesh. + +In some sense, this data structure is also a 2D bit vector that can dynamically grow on one axis. + +`BitGroupVector` is designed so that the individual bit groups all fulfill the requirements of bounded bit spans. As such, they can be processed efficiently. + +```cpp +/* Store a bit for each attribute for each vertex. */ +BitGroupVector<> bits(verts_num, attributes_num, false); + +/* Set all bits for one vertex to 1. */ +bits[vert_index].set_all(); + +/* Set bit for a specific vertex and attribute. 
*/ +bits[vert_index][attribute_index].set(); +``` diff --git a/docs/features/core/data_structures/containers.md b/docs/features/core/data_structures/containers.md index 75d1b59e..107efd1b 100644 --- a/docs/features/core/data_structures/containers.md +++ b/docs/features/core/data_structures/containers.md @@ -4,7 +4,7 @@ ## Vector -The `blender::Vector` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_vector.hh)) is the most important data structure. It stores values of the given type in a dynamically growing continuous buffer. +The `blender::Vector` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_vector.hh)) is the most important data structure. It stores values of the given type in a dynamically growing contiguous buffer. ```cpp /* Create an empty vector. */ @@ -129,7 +129,7 @@ Using `Map` with a custom type as key requires an equality operator and a [hash] ## Vector Set -A `blender::VectorSet` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_vector_set.hh)) is a combination of a `Vector` and a `Set`. It can't contain duplicate values like a `Set` but the values stored in it are ordered based on insertion order (until elements are removed). Just like in a `Vector`, the values are also stored in a continuous array which makes it easy to pass them to other functions that expect an array. +A `blender::VectorSet` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_vector_set.hh)) is a combination of a `Vector` and a `Set`. It can't contain duplicate values like a `Set` but the values stored in it are ordered based on insertion order (until elements are removed). Just like in a `Vector`, the values are also stored in a contiguous array which makes it easy to pass them to other functions that expect an array. ```cpp /* Construct empty vector-set. 
*/ -- 2.30.2 From cbbc822500888a8dd39b6eda3eea3ce86486b1f0 Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sun, 18 Feb 2024 14:43:30 +0100 Subject: [PATCH 06/14] cpp type --- .../features/core/data_structures/cpp_type.md | 28 +++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/docs/features/core/data_structures/cpp_type.md b/docs/features/core/data_structures/cpp_type.md index 275e5fb7..ec1ea615 100644 --- a/docs/features/core/data_structures/cpp_type.md +++ b/docs/features/core/data_structures/cpp_type.md @@ -1 +1,29 @@ # CPP Type + +The `CPPType` class ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_cpp_type.hh)) allows working with arbitrary C++ types in a generic way. An instance of `CPPType` wraps exactly one type like `int` or `std::string`. + +`CPPType` is mostly concerned with properly constructing, destructing, copying and moving values, instead of doing more type-specific operations (like addition). As such, it can be used to implement containers whose data type is only known at run-time like `GArray` and `GSpan` (where G means "generic"). + +Typically, performance-sensitive sections are still implemented in specialized and often templated code. However, the higher-level "data logistics" (getting the right data to the right place) can be implemented generically without the need for templates. + +```cpp +/* A generic swap implementation that works with all types which have a CPPType. */ +void generic_swap(void *a, void *b, const CPPType &type) { + BUFFER_FOR_CPP_TYPE_VALUE(type, buffer); + type.move_construct(a, buffer); + type.move_assign(b, a); + type.move_assign(buffer, b); + type.destruct(buffer); +} + +/* Initialize values. */ +std::string a = "A"; +std::string b = "B"; + +/* Get the CPPType for std::string. */ +/* This only works when this CPPType has been defined elsewhere using BLI_CPP_TYPE_MAKE. */ +const CPPType &type = CPPType::get<std::string>(); + +/* Swap values in a and b. 
*/ +generic_swap(&a, &b, type); +``` -- 2.30.2 From a223a884cf50913dae9cce5ccff1e6c0cf9b32f1 Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sun, 18 Feb 2024 14:52:33 +0100 Subject: [PATCH 07/14] disjoint set --- .../core/data_structures/disjoint_set.md | 22 ++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/docs/features/core/data_structures/disjoint_set.md b/docs/features/core/data_structures/disjoint_set.md index 9237480f..d590121b 100644 --- a/docs/features/core/data_structures/disjoint_set.md +++ b/docs/features/core/data_structures/disjoint_set.md @@ -1 +1,21 @@ -# Disjoint Set +# Disjoint-Set + +A [disjoint-set](https://en.wikipedia.org/wiki/Disjoint-set_data_structure) data structure can be used to find connected and disconnected pieces in a graph efficiently. A typical use-case in Blender is to detect mesh islands. + +Blender currently has two implementations of this data structure. `AtomicDisjointSet` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_atomic_disjoint_set.hh)) which is thread-safe and `DisjointSet` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_disjoint_set.hh)) which is not. There is a small overhead to using the atomic version when multi-threading is not used. + +```cpp +/* Create disjoint set. Initially, every vertex is disjoint from the others. */ +DisjointSet disjoint_set(verts_num); + +/* Merge sets when there is an edge between them. */ +for (const int2 edge : edges) { + disjoint_set.join(edge[0], edge[1]); +} + +/* Check if two vertices are in the same set. */ +bool is_joined = disjoint_set.in_same_set(vert_1, vert_2); + +/* Get representative element for all vertices in an island. 
*/ +int element = disjoint_set.find_root(vert_index); +``` -- 2.30.2 From 6cb79599522fbb5d24904d32fe4012a937bc2def Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sun, 18 Feb 2024 15:02:01 +0100 Subject: [PATCH 08/14] span --- docs/features/core/data_structures/spans.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/docs/features/core/data_structures/spans.md b/docs/features/core/data_structures/spans.md index efb4f458..f607641b 100644 --- a/docs/features/core/data_structures/spans.md +++ b/docs/features/core/data_structures/spans.md @@ -1 +1,7 @@ # Span + +A `blender::Span` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_span.hh)) references an array that is owned by someone else. It is just a pointer and a size. Since it is so small, it should generally be passed by value. + +Using `Span` is the main way to pass multiple elements into a function and should be prefered over e.g. `const Vector &` because it gives the caller more flexibility. + +The memory directly referenced by the span is considered to be `const`. This is different from `std::span` where constness is not the default. When non-constness is required, a `MutableSpan` can be used. This also makes the intention more clear. -- 2.30.2 From 4547fac8e3fff92dff99a7c9cc1164fcb9842f8a Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sun, 18 Feb 2024 17:30:40 +0100 Subject: [PATCH 09/14] strings --- docs/features/core/data_structures/strings.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/features/core/data_structures/strings.md b/docs/features/core/data_structures/strings.md index 627838de..4418eed5 100644 --- a/docs/features/core/data_structures/strings.md +++ b/docs/features/core/data_structures/strings.md @@ -1,3 +1,3 @@ # Strings -`std::string`, `StringRef`, `StringRefNull` +Blender usually stores strings as `std::string`. 
If strings are passed around without transfering ownership `blender::StringRef` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_string_ref.hh)) should be used. `StringRef` is a non-owning slice of a string. It's just a pointer and a size and should generally be passed around by value. If a string with null-termination is required, `StringRefNull` should be used instead. -- 2.30.2 From 24576416a831fff4ab5c4778a51c17019c150895 Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sun, 18 Feb 2024 17:50:53 +0100 Subject: [PATCH 10/14] index range --- .../core/data_structures/index_range.md | 20 +++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/docs/features/core/data_structures/index_range.md b/docs/features/core/data_structures/index_range.md index 9aa3d59d..2368ad81 100644 --- a/docs/features/core/data_structures/index_range.md +++ b/docs/features/core/data_structures/index_range.md @@ -1 +1,21 @@ # Index Range + +An `IndexRange` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_index_range.hh)) represents a set of non-negative consecutive indices. It's stored as just the start index and the number of indices. Since it's small, it should generally be passed by value. + +The most common usage of `IndexRange` is to loop over indices. This is better than a c-style index loop, because it reduces the likelyhood of mixing up variables and allows the current index to be const. It's also just more convenient. + +```cpp +/* Iterate over the indices from 0 to 9. */ +for (const int64_t i : IndexRange(10)) { /* ... */ } +``` + +Most container data structures have an `.index_range()` method which makes it easy to iterate over all the indices in the container. + +```cpp +/* Iterate over indices in the container. */ +Vector vec = /* ... */; +for (const int64_t i : vec.index_range()) { /* ... */ } + +/* Iterate over the indices but skip the first 5. 
*/ +for (const int64_t i : vec.index_range().drop_front(5)) { /* ... */ } +``` -- 2.30.2 From 8ff1c1d06acfe207f42923e07ac412ac51c2c487 Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sun, 18 Feb 2024 19:04:06 +0100 Subject: [PATCH 11/14] varray --- .../core/data_structures/virtual_array.md | 37 +++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/docs/features/core/data_structures/virtual_array.md b/docs/features/core/data_structures/virtual_array.md index 6b9ab6d1..8c3a9074 100644 --- a/docs/features/core/data_structures/virtual_array.md +++ b/docs/features/core/data_structures/virtual_array.md @@ -1 +1,38 @@ # Virtual Arrays + +A virtual array ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_virtual_array.hh)) is a data structure that behaves similarly to an array, but its elements are accessed through virtual methods. This improves the decoupling of a function from its callers, because it does not have to know exactly how the data is laid out in memory, or if it is stored in memory at all. It could just as well be computed on the fly. + +Taking a virtual array as parameter instead of a more specific non-virtual type has some tradeoffs. Access to individual elements of the individual elements is slower due to function call overhead. On the other hand, potential callers don't have to convert the data into the specific format required for the function. This can be a costly conversion if only few of the elements are accessed in the end. + +Functions taking a virtual array as input can still optimize for different data layouts. For example, they can check if the array references contiguous memory internally or if it is the same value for all indices. Whether it is worth optimizing for different data layouts in a function has to be decided on a case by case basis. One should always do some benchmarking to see if the increased compile time and binary size is worth it. 
+ +```cpp +/* Create an empty virtual array. */ +VArray values; + +/* Create a virtual array that has the same value at every index. */ +VArray values = VArray::ForSingle(value, num_values); + +/* Create a virtual array that references a span. */ +VArray values = VArray::ForSpan(int_span); + +/* Create a virtual array where the value at each index is twice the index. */ +VArray values = VArray::ForFunc(num_values, [](const int64_t i) { return i * 2; }); + +/* Get the value at a specific index. */ +int value = values[5]; + +/* Optimize for the case when the virtual array contains a single value. */ +if (const std::optional value = values.get_if_single()) { + /* Handle the case when all values are the same. */ +} +else { + /* Handle the general case when values may be all different. */ +} + +/* Automatically generate a function multiple times for different data layouts. */ +devirtualize_varray(values, [&](auto values) { + /* The code in this lambda is optimized for three possible variants. */ + /* `values` can be a Span, a SingleAsSpan or a VArrayRef. */ +}); +``` -- 2.30.2 From 1c2a43fdccf46480648de103c62d2bb9910fc08c Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Sun, 18 Feb 2024 19:25:35 +0100 Subject: [PATCH 12/14] index mask --- .../features/core/data_structures/cpp_type.md | 2 +- .../core/data_structures/index_mask.md | 30 +++++++++++++++++++ .../core/data_structures/virtual_array.md | 3 +- 3 files changed, 33 insertions(+), 2 deletions(-) diff --git a/docs/features/core/data_structures/cpp_type.md b/docs/features/core/data_structures/cpp_type.md index ec1ea615..e116fd48 100644 --- a/docs/features/core/data_structures/cpp_type.md +++ b/docs/features/core/data_structures/cpp_type.md @@ -21,7 +21,7 @@ std::string a = "A"; std::string b = "B"; /* Get the CPPType for std::string. */ -/* This only works when this CPPType has been defined elsewhere using BLI_CPP_TYPE_MAKE. */ +/* This only works when BLI_CPP_TYPE_MAKE was used for the type. 
*/ const CPPType &type = CPPType::get(); /* Swap values in a and b. */ diff --git a/docs/features/core/data_structures/index_mask.md b/docs/features/core/data_structures/index_mask.md index b66aaedd..44d78030 100644 --- a/docs/features/core/data_structures/index_mask.md +++ b/docs/features/core/data_structures/index_mask.md @@ -1 +1,31 @@ # Index Mask + +An `IndexMask` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_index_mask.hh)) is a sequence of unique and sorted indices. It's commonly used when a subset of elements in an array has to be processed. This is sometimes called [existential processing](https://www.dataorienteddesign.com/dodbook/node4.html) and is often better than having e.g. a bool for every element that has to be checked in every inner loop to determine if it has to be processed. + +Semantically, an `IndexMask` is very similar to a simple `Vector` with unique and sorted indices. However, due to the implementation details of `IndexMask`, it can be significantly more efficient than the `Vector`. + +An `IndexMask` does not own the memory it references. Typically, the referenced data is either statically allocated or is owned by an `IndexMaskMemory`. + +```cpp +/* Owner of some dynamically allocated memory in one or more index masks. */ +IndexMaskMemory memory; + +/* Construct index mask for indices. */ +const IndexMask mask = IndexMask::from_indices({10, 12, 13, 14, ...}, memory); + +/* Construct index mask from a boolean array. */ +const IndexMask mask = IndexMask::from_bools(IndexRange(100), bool_span, memory); + +/* Construct an index mask from a predicate. */ +/* In this case, the index mask will contain every third index. */ +/* The passed in grain size controls the level of parallelism. */ +const IndexMask mask = IndexMask::from_predicate( + IndexRange(1000), GrainSize(512), memory, + [](const int64_t i) { return i % 3 == 0; }); + +/* Iterate over all indices in an index mask. 
*/ +mask.foreach_index([&](const int64_t i) { /* ... */ }); + +/* Work is parallelized when a grain size is passed in. */ +mask.foreach_index(GrainSize(512), [&](const int64_t i) { /* ... */ }); +``` diff --git a/docs/features/core/data_structures/virtual_array.md b/docs/features/core/data_structures/virtual_array.md index 8c3a9074..49fde6ff 100644 --- a/docs/features/core/data_structures/virtual_array.md +++ b/docs/features/core/data_structures/virtual_array.md @@ -17,7 +17,8 @@ VArray values = VArray::ForSingle(value, num_values); VArray values = VArray::ForSpan(int_span); /* Create a virtual array where the value at each index is twice the index. */ -VArray values = VArray::ForFunc(num_values, [](const int64_t i) { return i * 2; }); +VArray values = VArray::ForFunc( + num_values, [](const int64_t i) { return i * 2; }); /* Get the value at a specific index. */ int value = values[5]; -- 2.30.2 From e4b30a62de533e8f458856504f00a5e54b95759f Mon Sep 17 00:00:00 2001 From: Jacques Lucke Date: Mon, 19 Feb 2024 12:45:55 +0100 Subject: [PATCH 13/14] change navigation --- docs/features/core/{data_structures => blenlib}/any.md | 0 docs/features/core/{data_structures => blenlib}/bits.md | 0 docs/features/core/{data_structures => blenlib}/containers.md | 0 docs/features/core/{data_structures => blenlib}/cpp_type.md | 0 docs/features/core/{data_structures => blenlib}/disjoint_set.md | 0 docs/features/core/{data_structures => blenlib}/functions.md | 0 docs/features/core/{data_structures => blenlib}/index.md | 2 +- docs/features/core/{data_structures => blenlib}/index_mask.md | 0 docs/features/core/{data_structures => blenlib}/index_range.md | 0 docs/features/core/{data_structures => blenlib}/spans.md | 0 docs/features/core/{data_structures => blenlib}/strings.md | 0 .../features/core/{data_structures => blenlib}/virtual_array.md | 0 docs/features/core/navigation.md | 1 + 13 files changed, 2 insertions(+), 1 deletion(-) rename docs/features/core/{data_structures => 
blenlib}/any.md (100%) rename docs/features/core/{data_structures => blenlib}/bits.md (100%) rename docs/features/core/{data_structures => blenlib}/containers.md (100%) rename docs/features/core/{data_structures => blenlib}/cpp_type.md (100%) rename docs/features/core/{data_structures => blenlib}/disjoint_set.md (100%) rename docs/features/core/{data_structures => blenlib}/functions.md (100%) rename docs/features/core/{data_structures => blenlib}/index.md (91%) rename docs/features/core/{data_structures => blenlib}/index_mask.md (100%) rename docs/features/core/{data_structures => blenlib}/index_range.md (100%) rename docs/features/core/{data_structures => blenlib}/spans.md (100%) rename docs/features/core/{data_structures => blenlib}/strings.md (100%) rename docs/features/core/{data_structures => blenlib}/virtual_array.md (100%) diff --git a/docs/features/core/data_structures/any.md b/docs/features/core/blenlib/any.md similarity index 100% rename from docs/features/core/data_structures/any.md rename to docs/features/core/blenlib/any.md diff --git a/docs/features/core/data_structures/bits.md b/docs/features/core/blenlib/bits.md similarity index 100% rename from docs/features/core/data_structures/bits.md rename to docs/features/core/blenlib/bits.md diff --git a/docs/features/core/data_structures/containers.md b/docs/features/core/blenlib/containers.md similarity index 100% rename from docs/features/core/data_structures/containers.md rename to docs/features/core/blenlib/containers.md diff --git a/docs/features/core/data_structures/cpp_type.md b/docs/features/core/blenlib/cpp_type.md similarity index 100% rename from docs/features/core/data_structures/cpp_type.md rename to docs/features/core/blenlib/cpp_type.md diff --git a/docs/features/core/data_structures/disjoint_set.md b/docs/features/core/blenlib/disjoint_set.md similarity index 100% rename from docs/features/core/data_structures/disjoint_set.md rename to docs/features/core/blenlib/disjoint_set.md diff --git 
a/docs/features/core/data_structures/functions.md b/docs/features/core/blenlib/functions.md similarity index 100% rename from docs/features/core/data_structures/functions.md rename to docs/features/core/blenlib/functions.md diff --git a/docs/features/core/data_structures/index.md b/docs/features/core/blenlib/index.md similarity index 91% rename from docs/features/core/data_structures/index.md rename to docs/features/core/blenlib/index.md index 7995636b..1b36765a 100644 --- a/docs/features/core/data_structures/index.md +++ b/docs/features/core/blenlib/index.md @@ -1,3 +1,3 @@ -# Data Structures +# BLI: Blender Library Core low-level data structures used throughout Blender. The documentation here is supposed to give a birds eye view over the available data structures and when to use them. More details for all available methods can be found in the source code. diff --git a/docs/features/core/data_structures/index_mask.md b/docs/features/core/blenlib/index_mask.md similarity index 100% rename from docs/features/core/data_structures/index_mask.md rename to docs/features/core/blenlib/index_mask.md diff --git a/docs/features/core/data_structures/index_range.md b/docs/features/core/blenlib/index_range.md similarity index 100% rename from docs/features/core/data_structures/index_range.md rename to docs/features/core/blenlib/index_range.md diff --git a/docs/features/core/data_structures/spans.md b/docs/features/core/blenlib/spans.md similarity index 100% rename from docs/features/core/data_structures/spans.md rename to docs/features/core/blenlib/spans.md diff --git a/docs/features/core/data_structures/strings.md b/docs/features/core/blenlib/strings.md similarity index 100% rename from docs/features/core/data_structures/strings.md rename to docs/features/core/blenlib/strings.md diff --git a/docs/features/core/data_structures/virtual_array.md b/docs/features/core/blenlib/virtual_array.md similarity index 100% rename from docs/features/core/data_structures/virtual_array.md 
rename to docs/features/core/blenlib/virtual_array.md diff --git a/docs/features/core/navigation.md b/docs/features/core/navigation.md index 5dd3908f..e6920672 100644 --- a/docs/features/core/navigation.md +++ b/docs/features/core/navigation.md @@ -1 +1,2 @@ +- [BLI: Blender Library](blenlib/) - * -- 2.30.2 From 0fbf73725488926bd4cf05da32c00c030d2da624 Mon Sep 17 00:00:00 2001 From: Hans Goudey Date: Mon, 19 Feb 2024 12:08:39 -0500 Subject: [PATCH 14/14] Fixes, wording changes, a few additions --- docs/features/core/blenlib/any.md | 7 +++--- docs/features/core/blenlib/bits.md | 24 ++++++++++----------- docs/features/core/blenlib/containers.md | 23 +++++++++++--------- docs/features/core/blenlib/functions.md | 4 ++-- docs/features/core/blenlib/index_mask.md | 6 +++--- docs/features/core/blenlib/index_range.md | 2 +- docs/features/core/blenlib/spans.md | 2 +- docs/features/core/blenlib/strings.md | 2 +- docs/features/core/blenlib/virtual_array.md | 2 +- 9 files changed, 38 insertions(+), 34 deletions(-) diff --git a/docs/features/core/blenlib/any.md b/docs/features/core/blenlib/any.md index 7f024c0d..3a7464ca 100644 --- a/docs/features/core/blenlib/any.md +++ b/docs/features/core/blenlib/any.md @@ -1,10 +1,11 @@ # Any `blender::Any` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_any.hh)) is a type-safe container for single values of any copy-constructible type. It is similar to `std::any` but provides the following two additional features: + - It has adjustable inline buffer capacity and alignment. `std::any` typically has a small inline buffer but its size is not guaranteed. - It can store additional user-defined type information without increasing the `sizeof` the `Any` object. -If any of those features is required, it's benefitial to use `blender::Any`. Otherwise using `std::any` is fine as well. +If any of those features are required, it's beneficial to use `blender::Any`. 
Otherwise using `std::any` is fine as well. ```cpp /* Construct empty value. */ @@ -32,9 +33,9 @@ void *value = value.get(); value.reset(); ``` -## Store Additional Type Information +## Additional Type Information -One of the features of `blender::Any` is that it can store additional information for each type that is stored. This can be done with fairly low overhead, because `Any` does type erasure and has to store some type specific information anyway. +One of the features of `blender::Any` is that it can store additional information for each type that is stored. This can be done with fairly low overhead, because `Any` does type erasure and has to store some type-specific information anyway. In the example below, the `blender::Any` knows the size of the stored type. diff --git a/docs/features/core/blenlib/bits.md b/docs/features/core/blenlib/bits.md index 76b5be49..2c8ed046 100644 --- a/docs/features/core/blenlib/bits.md +++ b/docs/features/core/blenlib/bits.md @@ -1,16 +1,16 @@ # Bits -Sometimes it can be benefitial to work with bits directly instead of boolean values because they are very compact and many bits can be processed at the same time. This document shows some available utilities to work with dynamically sized bitsets. +Sometimes it can be beneficial to work with bits directly instead of boolean values because they are very compact and many bits can be processed at the same time. This document describes some of the utilities available for working with dynamically-sized bitsets. Before choosing to work with individual bits instead of bools, keep in mind that there are also downsides which may not be obvious at first. + - Writing to separate bits in the same int is not thread-safe. Therefore, an existing vector of - bool can't easily be replaced with a bit vector, if it is written to from multiple threads. + bool can't necessarily be replaced with a bit vector when it is written to from multiple threads. 
Read-only access from multiple threads is fine though. -- Writing individual elements is more expensive when the array is in cache already. That is - because changing a bit is always a read-modify-write operation on the int the bit resides in. -- Reading individual elements is more expensive when the array is in cache already. That is - because additional bit-wise operations have to be applied after the corresponding int is - read. +- Writing an individual element is more expensive when the array is in cache already, because + changing a bit is always a read-modify-write operation on the integer containing the bit. +- Reading an individual element is more expensive when the array is in cache already, because + additional bit-wise operations have to be applied after the corresponding int is read. ## BitVector @@ -46,9 +46,9 @@ Those are also the types returned when accessing a specific index in `BitVector` ## BitSpan -Just like it's not possible to reference a single bit with standard C++, it's also not possible to reference a span of bits. To do that, one can use `BitSpan` and `MutableBitSpan` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_bit_span.hh)). +Just like it's not possible to reference a single bit with standard C++, it's also not possible to reference a span of bits. Instead, one can use `BitSpan` and `MutableBitSpan` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_bit_span.hh)). -Additionally, there is also `BoundedBitSpan` and `MutableBoundedBitSpan`. Those are like normal bit spans but enforce specific constraints on the alignment of the span. These additional constraints allow bit spans to be processed more efficiently than in the most general constraint. For more details on the exact constraints, check the `is_bounded_span` function. +Additionally, there are also `BoundedBitSpan` and `MutableBoundedBitSpan`. 
Those are like normal bit spans but enforce specific constraints on the alignment of the span. These additional constraints allow bit spans to be processed more efficiently than in the most general case. For more details on the exact constraints, check the `is_bounded_span` function. It's generally recommended to work with bit spans that follow these constraints if possible for best performance. @@ -59,7 +59,7 @@ There are three core operations that can be performed on bit spans: 2. Check if any bit is set. 3. Iterate over all set bits. -`BLI_bit_span_ops.hh` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_bit_span_ops.hh)) offers utilities to these things. +`BLI_bit_span_ops.hh` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_bit_span_ops.hh)) offers utilities for these operations. ```cpp BitVector<> vec1(500); @@ -91,9 +91,9 @@ bits::mix_into_first_expr([](bits::BitInt result, `BitGroupVector` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_bit_group_vector.hh)) allows storing a fixed number of bits for each element. For example, this could be used to store a bit for each attribute for each vertex in a mesh. -In some sense, this data structure is also 2D bit vector, that can dynamically grow on one axis. +In some sense, this data structure is also a 2D bit vector that can dynamically grow on one axis. -`BitGroupVector` is designed so that the individual bit groups all fullfill the requirements by bounded bit spans. As such, they can be processed efficiently. +`BitGroupVector` is designed so that each bit group fulfills the requirements of bounded bit spans (`BoundedBitSpan`). As such, they can be processed efficiently. ```cpp /* Store a bit for each attribute for each vertex. 
*/ diff --git a/docs/features/core/blenlib/containers.md b/docs/features/core/blenlib/containers.md index 107efd1b..bb17bc03 100644 --- a/docs/features/core/blenlib/containers.md +++ b/docs/features/core/blenlib/containers.md @@ -2,9 +2,11 @@ [Container](https://en.wikipedia.org/wiki/Container_(abstract_data_type)) data structures allow storing many elements of the same type. Different structures in this category allow for different access patterns. +Many of Blender's available containers have equivalents in the standard library. In most cases it's preferred to use the Blender container instead. + ## Vector -The `blender::Vector` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_vector.hh)) is the most important data structure. It stores values of the given type in a dynamically growing contiguous buffer. +The `blender::Vector` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_vector.hh)) is the most important container. It stores values of the given type in a dynamically growing contiguous buffer. ```cpp /* Create an empty vector. */ @@ -19,7 +21,7 @@ int value = values[0]; ## Array -`blender::Array` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_array.hh)) is very similar to `Vector`. The main difference is that it is not dynamically growing. Instead its size is usually only set once and stays the same for the rest of its life-time. It has a slightly lower memory footprint than `Vector`. Using an `Array` instead of `Vector` also indicates that the size is not expected to change. +`blender::Array` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_array.hh)) is very similar to `Vector`. The main difference is that it does not grow dynamically. Instead its size is usually only set once and stays the same for the rest of its life-time. 
It has a slightly lower memory footprint than `Vector`. Using an `Array` instead of `Vector` also indicates that the size is not expected to change. Note that this is different from `std::array` for which the size has to be known at compile time. If the size is actually known at compile time, `std::array` should be used instead. @@ -42,7 +44,7 @@ int value = stack.peek(); int value = stack.pop(); ``` -A `Vector` can also be used as a `Stack` using the `Vector::append`, `Vector::last` and `Vector::pop_last` methods. This is benefitial if one also needs the ability to iterate over all elements that are currently in the stack. If that's not required, it's better to use `Stack` directly because of it's more purpose-designed methods and it allows pushing in O(1) (not just armortized as it does not require reallocating already pushed elements). +A `Vector` can also be used as a `Stack` using the `Vector::append`, `Vector::last` and `Vector::pop_last` methods. This is beneficial if one also needs the ability to iterate over all elements that are currently in the stack. Otherwise it's better to use `Stack` directly because of its more purpose-designed methods and because it allows pushing in O(1) time (not just amortized, since it does not require reallocating already pushed elements). ## Set @@ -72,7 +74,7 @@ bool is_contained = values.contains(3); Using `Set` with a custom type requires an equality operator and a [hash](#hashing) function. -While a `Vector` could also be used to mimic the behavior of a `Set`, it's generally much less efficient at that task. A `Set` uses a hash table internally which allows it to check for duplicates in constant time instead of having to compare the value with every previously added element. +While a `Vector` can also be used to mimic the behavior of a `Set`, it's generally much less efficient at that task. 
A `Set` uses a hash table internally which allows it to check for duplicates in constant time instead of having to compare the value with every previously added element. ## Map @@ -129,7 +131,7 @@ Using `Map` with a custom type as key requires an equality operator and a [hash] ## Vector Set -A `blender::VectorSet` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_vector_set.hh)) is a combination of a `Vector` and a `Set`. It can't contain duplicate values like a `Set` but the values stored in it are ordered based on insertion order (until elements are removed). Just like in a `Vector`, the values are also stored in a contiguous array which makes it easy to pass them to other functions that expect an array. +A `blender::VectorSet` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_vector_set.hh)) is a combination of a `Vector` and a `Set`. It can't contain duplicate values like a `Set` but the values stored in it are stored in insertion order (until elements are removed). Just like in a `Vector`, the values are also stored in a contiguous array which makes it easy to pass them to other functions that expect an array. ```cpp /* Construct empty vector-set. */ @@ -162,8 +164,9 @@ These are some concepts that apply to many of the container types. Most container types mentioned above (except `VectorSet` currently) have an inline buffer. For as long as the elements added to the container fit into the inline buffer, no additional allocation is made. This is important because allocations can be a performance bottleneck. Inline buffers are enabled by default in supported containers. It's generally recommended to use the default value, but there are cases when the inline buffer size should be set manually. -* When building a compact type which has a container that is usually empty, the inline buffer size could be set to 0. 
It should also be considered to just wrap the container in a `std::unique_ptr` in this case. -* When working in hot code that requires e.g. a `Vector`, the inline buffer size can be increased to make better use of stack memory and to avoid allocations in the majority of cases. + +- When building a compact type which has a container that is usually empty, the inline buffer size could be set to 0. It should also be considered to just wrap the container in a `std::unique_ptr` in this case. +- When working in hot code that requires e.g. a `Vector`, the inline buffer size can be increased to make better use of stack memory and to avoid allocations in the majority of cases. The inline buffer is is typically the first template parameter after the type. @@ -182,7 +185,7 @@ Using a larger online buffer obviously also increases the size of the type: `siz ### Hashing -Using custom types in a data structure that uses a hash table (like `Set`, `Map` and `VectorSet`) requires the implementation of the equality operator (`operator==`) and a `hash` function. +Using custom types in a data structure that uses a hash table (like `Set`, `Map`, and `VectorSet`) requires the implementation of the equality operator (`operator==`) and a `hash` function. ```cpp struct MyType { @@ -201,7 +204,7 @@ struct MyType { A potentially more convenient way to implement the equality operator could be to use `BLI_STRUCT_EQUALITY_OPERATORS_2`. -The `hash` function has to return the same value for two instances of the type that are considered equal. In theory, even returning a constant value fullfills that requirement and would be correct. However, having a hash function that always returns the same value makes using hash tables useless and degrades performance. +The `hash` function has to return the same value for two instances of the type that are considered equal. In theory, even returning a constant value fulfills that requirement and would be correct. 
However, a hash function that always returns the same value makes using hash tables useless and degrades performance. Simply calling `get_default_hash` on the data members that should impact the hash (which are the same ones which should impact equality) is usually good enough. When designing a custom hash function, it's recommended to put as much variation as possible into the lower bits. Those are used by the containers at first. However, if the low bits have too many collisions, the higher bits are taken into account automatically as well. @@ -213,6 +216,6 @@ Many core methods on the container data structures have multiple overloads. For Additionally, many methods have a variant with the `_as` suffix. Such methods allow passing in the parameter with a different type than what the container actually contains. For example, `Vector::append_as` can also be called with a `const char *` parameter, and not just `std::string`. The `std::string` is then constructed inplace. This avoids the need to construct it first and then to move it in the right place. This specific example is very similar to `std::vector::emplace_back`. -However, the `*_as` convention is a bit more general. For example, it allows calling `Set::add_as` or `Set::contains_as` to be called with a type that is not exactly the one which is stored. This avoids the construction of stored type in many cases. +However, the `*_as` convention is a bit more general. For example, it allows `Set::add_as` or `Set::contains_as` to be called with a type that is not exactly the one which is stored. This avoids the construction of the stored type in many cases. Other libraries sometimes support this as well, without the additional `_as` suffix, but that also leads to more complex error messages in the common cases. 
diff --git a/docs/features/core/blenlib/functions.md b/docs/features/core/blenlib/functions.md index 9f4bbc12..fc6b1718 100644 --- a/docs/features/core/blenlib/functions.md +++ b/docs/features/core/blenlib/functions.md @@ -1,6 +1,6 @@ # Functions -There are many ways to store a reference to a function. This document gives an overview over those and gives recommendations for when to use which approach. +There are many ways to store a reference to a function. This document gives an overview of them and gives recommendations for when to use each approach. 1. Pass function pointer and user data (as void *) separately: - The only method that is compatible with C interfaces. @@ -25,7 +25,7 @@ There are many ways to store a reference to a function. This document gives an o - Works well with all callables. - It's a non-owning reference, so it *cannot* be stored safely in general. -The following diagram helps to decide which approach to use when building an API where the user has to pass in a function. +The following diagram helps to decide which approach to use when building an API where the user has to pass a function. ```mermaid flowchart TD diff --git a/docs/features/core/blenlib/index_mask.md b/docs/features/core/blenlib/index_mask.md index 44d78030..025d8711 100644 --- a/docs/features/core/blenlib/index_mask.md +++ b/docs/features/core/blenlib/index_mask.md @@ -1,10 +1,10 @@ # Index Mask -An `IndexMask` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_index_mask.hh)) is a sequence of unique and sorted indices. It's commonly used when a subset of elements in an array has to be processed. This is sometimes called [existential processing](https://www.dataorienteddesign.com/dodbook/node4.html) and is often better than having e.g. a bool for every element that has to be checked in every inner loop to determine if it has to be processed. 
+An `IndexMask` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_index_mask.hh)) is a sequence of unique and sorted indices. It's commonly used when a subset of elements in an array have to be processed. This is sometimes called [existential processing](https://www.dataorienteddesign.com/dodbook/node4.html) and is often better than having e.g. a bool for every element that has to be checked in every inner loop to determine if it has to be processed. -Semantically, an `IndexMask` is very similar to a simple `Vector` with unique and sorted indices. However, due to the implementation details of `IndexMask`, it can be significantly more efficient than the `Vector`. +Semantically, an `IndexMask` is very similar to a simple `Vector` with unique and sorted indices. However, due to the implementation details of `IndexMask`, it is significantly more efficient than the `Vector`. -An `IndexMask` does not own the memory it references. Typically, the referenced data is either statically allocated or is owned by an `IndexMaskMemory`. +An `IndexMask` does not own the memory it references. Typically the referenced data is either statically allocated or is owned by an `IndexMaskMemory`. ```cpp /* Owner of some dynamically allocated memory in one or more index masks. */ diff --git a/docs/features/core/blenlib/index_range.md b/docs/features/core/blenlib/index_range.md index 2368ad81..68feb957 100644 --- a/docs/features/core/blenlib/index_range.md +++ b/docs/features/core/blenlib/index_range.md @@ -2,7 +2,7 @@ An `IndexRange` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_index_range.hh)) represents a set of non-negative consecutive indices. It's stored as just the start index and the number of indices. Since it's small, it should generally be passed by value. -The most common usage of `IndexRange` is to loop over indices. 
This is better than a c-style index loop, because it reduces the likelyhood of mixing up variables and allows the current index to be const. It's also just more convenient. +The most common usage of `IndexRange` is to loop over indices. This is better than a C-style index loop because it reduces the likelihood of mixing up variables and allows the current index to be const. It's also just more convenient. ```cpp /* Iterate over the indices from 0 to 9. */ diff --git a/docs/features/core/blenlib/spans.md b/docs/features/core/blenlib/spans.md index f607641b..3883d6b8 100644 --- a/docs/features/core/blenlib/spans.md +++ b/docs/features/core/blenlib/spans.md @@ -2,6 +2,6 @@ A `blender::Span` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_span.hh)) references an array that is owned by someone else. It is just a pointer and a size. Since it is so small, it should generally be passed by value. -Using `Span` is the main way to pass multiple elements into a function and should be prefered over e.g. `const Vector &` because it gives the caller more flexibility. +`Span` is the main way to pass multiple elements into a function and should be preferred over e.g. `const Vector &` because it gives the caller more flexibility. The memory directly referenced by the span is considered to be `const`. This is different from `std::span` where constness is not the default. When non-constness is required, a `MutableSpan` can be used. This also makes the intention more clear. diff --git a/docs/features/core/blenlib/strings.md b/docs/features/core/blenlib/strings.md index 4418eed5..2fcbb345 100644 --- a/docs/features/core/blenlib/strings.md +++ b/docs/features/core/blenlib/strings.md @@ -1,3 +1,3 @@ # Strings -Blender usually stores strings as `std::string`.
If strings are passed around without transfering ownership `blender::StringRef` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_string_ref.hh)) should be used. `StringRef` is a non-owning slice of a string. It's just a pointer and a size and should generally be passed around by value. If a string with null-termination is required, `StringRefNull` should be used instead. +Blender usually stores strings as `std::string`. If strings are passed around without transferring ownership, `blender::StringRef` ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_string_ref.hh)) should be used. `StringRef` is a non-owning slice of a string. It's just a pointer and a size and should generally be passed around by value. If a string with null-termination is required, `StringRefNull` should be used instead. diff --git a/docs/features/core/blenlib/virtual_array.md b/docs/features/core/blenlib/virtual_array.md index 49fde6ff..55d255e6 100644 --- a/docs/features/core/blenlib/virtual_array.md +++ b/docs/features/core/blenlib/virtual_array.md @@ -2,7 +2,7 @@ A virtual array ([source](https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenlib/BLI_virtual_array.hh)) is a data structure that behaves similarly to an array, but its elements are accessed through virtual methods. This improves the decoupling of a function from its callers, because it does not have to know exactly how the data is laid out in memory, or if it is stored in memory at all. It could just as well be computed on the fly. -Taking a virtual array as parameter instead of a more specific non-virtual type has some tradeoffs. Access to individual elements of the individual elements is slower due to function call overhead. On the other hand, potential callers don't have to convert the data into the specific format required for the function.
This can be a costly conversion if only few of the elements are accessed in the end. +Taking a virtual array as a parameter instead of a more specific non-virtual type has some tradeoffs. Access to individual elements is slower due to function call overhead. On the other hand, potential callers don't have to convert the data into the specific format required for the function. That can be a costly conversion if only a few of the elements are accessed in the end. Functions taking a virtual array as input can still optimize for different data layouts. For example, they can check if the array references contiguous memory internally or if it is the same value for all indices. Whether it is worth optimizing for different data layouts in a function has to be decided on a case by case basis. One should always do some benchmarking to see if the increased compile time and binary size is worth it. -- 2.30.2