Previously, there was a fixed grain size for all multi-functions. That was
not sufficient because some functions could benefit a lot from smaller
grain sizes.

This refactor adds a new `MultiFunction::call_auto` method, which has the
same effect as calling `MultiFunction::call` directly but additionally
figures out how to execute the specific multi-function efficiently. It
determines a good grain size and decides whether the mask indices should
be shifted or not.
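
As a rough illustration of what "grain size" and "shifted mask indices"
can mean, here is a minimal standalone sketch. It does not use Blender's
API: `Chunk` and `split_into_chunks` are hypothetical names, and the mask
is assumed to be a sorted list of indices. Each chunk holds at most
`grain_size` indices; when shifting is enabled, the chunk's first index is
subtracted from all of its indices so that a task could work on a sliced
array starting at zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

/* Hypothetical: one slice of a sorted index mask that a single task would process. */
struct Chunk {
  std::vector<int64_t> indices; /* Indices in this chunk, shifted if requested. */
  int64_t offset = 0;           /* Amount that was subtracted from every original index. */
};

/* Split `mask` into chunks of at most `grain_size` indices.
 * If `shift_indices` is true, each chunk's indices are made relative to its first index. */
std::vector<Chunk> split_into_chunks(const std::vector<int64_t> &mask,
                                     const int64_t grain_size,
                                     const bool shift_indices)
{
  assert(grain_size > 0);
  std::vector<Chunk> chunks;
  const int64_t size = static_cast<int64_t>(mask.size());
  for (int64_t start = 0; start < size; start += grain_size) {
    const int64_t end = std::min(start + grain_size, size);
    Chunk chunk;
    chunk.offset = shift_indices ? mask[start] : 0;
    for (int64_t i = start; i < end; i++) {
      chunk.indices.push_back(mask[i] - chunk.offset);
    }
    chunks.push_back(std::move(chunk));
  }
  return chunks;
}
```

A smaller grain size produces more chunks, which gives a thread pool more
independent tasks to distribute; whether shifting pays off depends on the
function being evaluated, which is why the sketch takes it as a parameter.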

Most multi-function evaluations benefit from this, but medium-sized
workloads (1,000 - 50,000 elements) benefit the most, especially when
expensive multi-functions (e.g. noise) are involved. This is because
threading is rarely used for smaller workloads, and it already worked
fine for larger workloads before this change.

With this patch, multi-functions can specify execution hints that allow
the caller to execute them efficiently. These execution hints still
have to be added to more functions.
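
The idea of execution hints can be sketched as follows. This is an
assumption-laden illustration, not the interface added by this patch:
`ExecutionHintsSketch`, its `min_grain_size` field, the two example hint
functions, and the values 10000 and 100 are all made up for the example.
The point is only that a function reports how small a task may sensibly
be, and the caller derives the number of tasks from that.

```cpp
#include <algorithm>
#include <cstdint>

/* Hypothetical hint structure; the field name is an assumption for illustration. */
struct ExecutionHintsSketch {
  /* Smallest number of elements worth handing to a single task. */
  int64_t min_grain_size;
};

/* A cheap function (e.g. simple math) prefers large tasks. The value is made up. */
inline ExecutionHintsSketch hints_for_cheap_math()
{
  return ExecutionHintsSketch{10000};
}

/* An expensive function (e.g. noise) can profit from small tasks. The value is made up. */
inline ExecutionHintsSketch hints_for_noise()
{
  return ExecutionHintsSketch{100};
}

/* Number of tasks a caller would create for `element_count` elements (rounded up). */
inline int64_t task_count(const ExecutionHintsSketch &hints, const int64_t element_count)
{
  if (element_count <= 0) {
    return 0;
  }
  const int64_t grain = std::max<int64_t>(hints.min_grain_size, 1);
  return (element_count + grain - 1) / grain;
}
```

Under these assumed values, 5,000 elements form a single task for the
cheap function, so no threading happens, while the expensive function is
split into 50 tasks. That is consistent with the observation that
medium-sized workloads gain the most.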

Some performance measurements of a field evaluation involving noise and
math nodes, ordered by the number of elements being evaluated:
```
1,000,000: 133 ms -> 120 ms
100,000: 30 ms -> 18 ms
10,000: 20 ms -> 2.7 ms
1,000: 4 ms -> 0.5 ms
100: 0.5 ms -> 0.4 ms
```
45 lines
1.7 KiB
C++
/*
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 */

#include "FN_multi_function_params.hh"

namespace blender::fn {

GMutableSpan MFParams::ensure_dummy_single_output(int data_index)
{
  /* Lock because we are actually modifying #builder_ and it may be used by multiple threads. */
  std::lock_guard lock{builder_->mutex_};

  for (const std::pair<int, GMutableSpan> &items : builder_->dummy_output_spans_) {
    if (items.first == data_index) {
      return items.second;
    }
  }

  const CPPType &type = builder_->mutable_spans_[data_index].type();
  void *buffer = builder_->scope_.linear_allocator().allocate(
      builder_->min_array_size_ * type.size(), type.alignment());
  if (!type.is_trivially_destructible()) {
    builder_->scope_.add_destruct_call(
        [&type, buffer, mask = builder_->mask_]() { type.destruct_indices(buffer, mask); });
  }
  const GMutableSpan span{type, buffer, builder_->min_array_size_};
  builder_->dummy_output_spans_.append({data_index, span});
  return span;
}

}  // namespace blender::fn