Geometry Nodes: refactor multi-threading in field evaluation

Previously, there was a fixed grain size for all multi-functions. That was
not sufficient because some functions could benefit a lot from smaller
grain sizes.

This refactor adds a new `MultiFunction::call_auto` method which has the
same effect as just calling `MultiFunction::call` but additionally figures
out how to execute the specific multi-function efficiently. It determines
a good grain size and decides whether the mask indices should be shifted
or not.
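
As a rough illustration of how a grain size could be derived from the hints added in this commit, here is a minimal, self-contained sketch. It is not the code from this patch: the `ExecutionHints` struct only mirrors the three fields shown in the diff, and `sketch_grain_size`, `sketch_use_threading`, the divisor `4` and the cap of `10000` are hypothetical choices made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Stand-in that mirrors the fields of `MultiFunction::ExecutionHints` in the diff.
struct ExecutionHints {
  int64_t min_grain_size = 10000;
  bool allocates_array = false;
  bool uniform_execution_time = true;
};

// Hypothetical helper: pick a grain size for `mask_size` elements.
// The divisor 4 and the cap 10000 are assumptions made for this sketch.
inline int64_t sketch_grain_size(const ExecutionHints &hints,
                                 const int64_t mask_size,
                                 const int thread_count)
{
  assert(thread_count > 0);
  int64_t grain_size = hints.min_grain_size;
  if (hints.uniform_execution_time) {
    // When every element costs about the same, a few chunks per thread are
    // enough for load balancing, so avoid needlessly small chunks.
    grain_size = std::max(grain_size, mask_size / thread_count / 4);
  }
  if (hints.allocates_array) {
    // Keep chunks small so that per-chunk temporary arrays stay small too.
    const int64_t max_grain_size = 10000;
    grain_size = std::min(grain_size, max_grain_size);
  }
  return grain_size;
}

// Hypothetical helper: threading only pays off if there is more than one chunk.
inline bool sketch_use_threading(const int64_t mask_size, const int64_t grain_size)
{
  return mask_size > grain_size;
}
```

With a lowered `min_grain_size` (for example on an expensive function), a workload of 10,000 elements would be split into several chunks under this sketch, whereas the default hints would keep it on a single thread.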

Most multi-function evaluations benefit from this, but medium-sized
workloads (1,000 - 50,000 elements) benefit the most, especially when
expensive multi-functions (e.g. noise) are involved. This is because
threading is rarely used for smaller workloads, and it already worked
well for larger ones.

With this patch, multi-functions can specify execution hints that allow
the caller to execute them more efficiently. These execution hints still
have to be added to more functions.
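
The diff declares a public `execution_hints()` and a protected virtual `get_execution_hints()`, which suggests that a function provides its hints by overriding the latter. The following sketch shows that pattern on stand-in classes. `SketchMultiFunction` and `SketchNoiseFunction` are hypothetical names, the simple forwarding in `execution_hints()` is an assumption, and the value `100` is illustrative only.

```cpp
#include <cstdint>

// Stand-in that mirrors the fields of `MultiFunction::ExecutionHints` in the diff.
struct ExecutionHints {
  int64_t min_grain_size = 10000;
  bool allocates_array = false;
  bool uniform_execution_time = true;
};

// Hypothetical, stripped-down base class with the same two entry points as the diff.
class SketchMultiFunction {
 public:
  virtual ~SketchMultiFunction() = default;

  // Public accessor; assumed here to simply forward to the virtual hook.
  ExecutionHints execution_hints() const
  {
    return this->get_execution_hints();
  }

 protected:
  // Default: use the struct defaults unless a subclass knows better.
  virtual ExecutionHints get_execution_hints() const
  {
    return ExecutionHints{};
  }
};

// Hypothetical expensive function that asks for smaller chunks.
class SketchNoiseFunction : public SketchMultiFunction {
 protected:
  ExecutionHints get_execution_hints() const override
  {
    ExecutionHints hints;
    // Each element is costly, so threading pays off for small workloads already.
    hints.min_grain_size = 100;
    return hints;
  }
};
```

Keeping the virtual hook protected lets the base class remain the single place callers query, while each function only describes its own cost characteristics.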

Some performance measurements of a field evaluation involving noise and
math nodes, ordered by the number of elements being evaluated:
```
1,000,000: 133 ms   -> 120 ms
  100,000:  30 ms   ->  18 ms
   10,000:  20 ms   ->   2.7 ms
    1,000:   4 ms   ->   0.5 ms
      100:   0.5 ms ->   0.4 ms
```
2021-11-26 11:05:47 +01:00
parent 004172de38
commit 658fd8df0b
13 changed files with 264 additions and 163 deletions

@@ -60,6 +60,7 @@ class MultiFunction {
{
}
void call_auto(IndexMask mask, MFParams params, MFContext context) const;
virtual void call(IndexMask mask, MFParams params, MFContext context) const = 0;
virtual uint64_t hash() const
@@ -110,6 +111,31 @@ class MultiFunction {
return *signature_ref_;
}
/**
* Information about how the multi-function behaves that helps a caller to execute it efficiently.
*/
struct ExecutionHints {
/**
* Suggested minimum workload under which multi-threading does not really help.
* This should be lowered when the multi-function is doing something computationally expensive.
*/
int64_t min_grain_size = 10000;
/**
* Indicates that the multi-function will allocate an array large enough to hold all indices
* passed in as mask. This tells the caller that it would be preferable to pass in smaller
* indices. Also maybe the full mask should be split up into smaller segments to decrease peak
* memory usage.
*/
bool allocates_array = false;
/**
* Tells the caller that every execution takes about the same time. This helps to make a more
* educated guess about a good grain size.
*/
bool uniform_execution_time = true;
};
ExecutionHints execution_hints() const;
protected:
/* Make the function use the given signature. This should be called once in the constructor of
* child classes. No copy of the signature is made, so the caller has to make sure that the
@@ -121,6 +147,8 @@ class MultiFunction {
BLI_assert(signature != nullptr);
signature_ref_ = signature;
}
virtual ExecutionHints get_execution_hints() const;
};
inline MFParamsBuilder::MFParamsBuilder(const MultiFunction &fn, int64_t mask_size)