@amelief I'll continue working on this, hope you don't mind!
I think this can be made more parallelizable by doing the intersection tests up-front: There is essentially one intersection test per point. The expand_cutter_segment
loop may get started…
Profiler indicates a lot of time spent in receive_or_steal_task
(65%). I guess doing threading in the outer loop is a more difficult because of how CutterSegments
gets extended and needs to be…
Quite a complex feature, and it works better than the GPv2 version! Just some minor nitpicks and some possible performance improvements. Not critical, but I'd like to confirm with someone who understands thread behavior better than me.
It would be easier to understand if this last part for cyclic curves is moved out of the expand_cutter_segment
function. Currently it looks at first glance like the function is recursive. Doing both calls for the non-cyclic and the cyclic part outside of the function is easier to understand i think.
IndexRange
can be used instead of a custom range class. That also comes with a bunch of useful functions, e.g. contains
to check the range in point_is_in_segment
.
Second opinion would be welcome, but i think calling parallel_for
here could have significant overhead. This function is called for every point, and then starts a threaded loop over all the points, and waits before moving to the next point. If my intuition is correct it would be faster to do the threading in the outer loop (stroke_cutter_find_and_remove_segments
) and use a simple for
loop here.
I think it would be easier and more efficient to just have a per-point bool array to mark points in segments. We're not interested in which segment a point is part of, so just setting the flag when a point gets added to any segment should be sufficient.
Tested it and it feels quite nice.
I noticed that GPv2 cutter only deletes one segment per stroke, while this implementation removes anything within the lasso (generally, some corner cases…