Geometry Nodes: Rewrite Scale Elements node #115142
No reviewers
Labels
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset Browser
Interest
Asset Browser Project Overview
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
3 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: blender/blender#115142
Loading…
Reference in New Issue
No description provided.
Delete Branch "mod_moder/blender:tmp_speedup_scale_elements"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Rewrite of Scale Elements. Main changes related with removing unnecessary
abstractions (like structures of fields). Next, by using grouping approach,
all data is represented as spans. This provide ability to unify code for
different domains. Using of general utils like IndexMask, Group processing
and array utils provides much more parallelism and better memory usage.
In result, this refactoring result in 4-10 average speed improvement in
attached benchmark file with different probability and scale of elements.
Measurements of speed improvement for Face domain (ms):
Before:
After:
Average speed up:
Measurements of speed improvement for Edge domain (ms):
Before:
After:
Average speed up:
WIP: Geometry Nodes: Scale Elements node speedupto Geometry Nodes: Scale Elements node speedupGeometry Nodes: Scale Elements node speedupto Geometry Nodes: Rewrite Scale Elements node@blender-bot build
Looks good! More straightforward than I expected actually.
@ -104,0 +132,4 @@
});
}
static Array<int> reverse_indices_in_groups(const Span<int> group_indices,
I think it would make sense to move these to a separate implementation and header file as part of this commit, de-duplicating the implementation in
mesh_mapping.cc
at the same time. That would make me feel more confident about the final state of the code anyway.Such separate file is implemented in #111081, for now, i just copy-paste part of code to just avoid locking by general bli solution.
@ -177,0 +302,4 @@
const float total = elem_island.size();
BLI_assert(total > 0.0f);
const float scale = accumulate<float>(scale_varray, elem_island) / total;
const float3 center = accumulate<float3>(center_varray, elem_island) / total;
This method of finding the mean is prone to overflow. Maybe use a
mean
function that adds each element multiplied by the1 / size
factor.Right now i just reimplement original algorithm, but this function can be changed to be more safe, and be moved to something like
array_utils
/attribute_math
. and also to be reused in attribute statistic node too.@ -290,0 +371,4 @@
/* If result indices is for gathered array, map than back into global indices. */
if (face_mask.size() != mesh.faces_num) {
parallel_transform<int>(r_item_indices, 4098, [&](const int pos) { return face_mask[pos]; });
[]
onIndexMask
here seems slow possibly? I'd think there would be another way to structure thisIt's good to use O(1) way if that is possible. But in such case, there is just 2 way: binary search ( O(N * log2(size / ~16000)) ) and array of indices ( O(N + size) ). Second one is prefer, but that is not memory free...
Generally, index mas segmentation should be as few as possible (all random indices should be grouped in large segments), so real complexity only depend on selection~. I think it is okay to keep such small complexity if that is necessery in such place and thinking about something like
VArray::FromIndexMask
andVArray::FromIndexMaskRevers
to support corner cases (all indices is one span, all indices is just one range, ...) as just kind ofdevertualize
closures.@ -344,3 +393,1 @@
edge_selection.foreach_index([&](const int edge_index) {
const int2 &edge = edges[edge_index];
disjoint_set.join(edge[0], edge[1]);
edge_mask.foreach_index_optimized<int>(GrainSize(4098), [&](const int edge_i) {
4096
is the power of two you're looking for :DSo many such typos....
@ -449,0 +487,4 @@
const GroupedSpan<int> item_islands(item_offsets.as_span(), item_indices);
const GroupedSpan<int> vert_islands(vert_offsets.as_span(), vert_indices);
const VArray<float> &scale_varray = evaluator.get_evaluated<float>(0);
Remove
&
, it's fine but a bit misleading, since this is binding the reference to a temporaryOverall structure makes sense! Just a few smaller comments
@ -101,0 +211,4 @@
for (const int i : indices.slice(range)) {
value += values[i];
}
return join_accumulators({value / float(indices.size()), 1}, other);
Shouldn't this be
range.size()
?@ -207,2 +333,2 @@
for (const int island_index : range) {
const ElementIsland &island = islands[island_index];
AtomicDisjointSet disjoint_set(vert_mask.size());
const GroupedSpan<int> face_verts(mesh.face_offsets(), mesh.corner_verts());
mesh.face_offsets()
->mesh.faces()
@ -394,1 +438,3 @@
scale_vertex_islands_on_axis(mesh, island, params, get_edge_verts);
/* If result indices is for gathered array, map than back into global indices. */
if (edge_mask.size() != mesh.edges_num) {
parallel_transform<int>(r_item_indices, 4096, [&](const int pos) { return edge_mask[pos]; });
Like I mentioned on the other PR, I'm skeptical of this random access to
IndexMask
. Feels like structuring the code not to require that would be betterTemporal array just to fill-and-read once will change ~5% time of this function, but require allocation of array with size of some domains.
Benchmark was in: #114194 (comment)
Yeah, I still think it's safer to choose the option with the better computational complexity. It's more likely that its performance won't change for the worse in the future.
Guess the remaining thing to do is switch to using
to_indices
for random index mask access?@ -112,0 +236,4 @@
Mesh &mesh)
{
MutableSpan<float3> positions = mesh.vert_positions_for_write();
threading::parallel_for(
Is this supposed to be
parallel_for_weighted
?Oh...
I mainly how much of this code is more general purpose. We have many places where we split indices into many groups. This patch seems to implement a new approach which is fine but maybe this approach should also be used elsewhere? Have you identified trade-offs between the different approaches we use already?
I have general solution for this, see #111081. I just copy-paste part of code to do not be locked by more general util.
Part of #111081 is to use this in each place where this can be used.
Well, PR will be such slow as current main if i'll use code from
782d096905/source/blender/blenkernel/intern/mesh_mapping.cc (L366)
(this is the reason why i copy more general implementation which is support groups with different sizes).I can reproduce the speedup. Thanks for providing the benchmark file:
I'm going to commit this. Next I think it would make sense to move all the index grouping utilities to a new
BLI
file. How aboutBLI_index_grouping.hh
? We can talk about that more in chat too.