Use hash for Collection.gobject lookup, speedup object linking #104553
Add a hash for faster look-ups on collection->gobject.

This avoids a full list lookup for every object added via Python's
CollectionObjects.link as well as linking via the BKE_collection_object_add_*
functions.
While the speedup is non-linear, linking & unlinking 100k objects from
Python is about 50x faster. Unlinking all objects in order
(a best case for linked lists) remains approximately the same speed.
Initial tests with 100,000 objects (see simple_collection_speed_test.py) show significantly faster linking & unlinking. (Before/after timing tables omitted.)
Notes:
- This patch was written as a possible alternative to adding a CollectionObjects.link_multiple method (see D16260), with the intent of resolving the issue in a more general way that benefits object linking everywhere, not just from the Python API.
- Keeping this hash valid at all times might not be trivial, although after some fixes to the initial patch this is working as it should.
- If maintaining a valid hash proves too error prone, we could keep this functionality but explicitly create and destroy the gobject_hash, so it doesn't need to be maintained across complex operations such as some object IDs being swapped out.
- While this speeds up object duplication, other bottlenecks make the overall speedup less significant.
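The idea of the patch can be illustrated with a minimal Python sketch (hypothetical names; this is not Blender's actual C code): keep the ordered object list, and pair it with a lazily created hash so membership checks during linking no longer scan the whole list.

```python
class CollectionSketch:
    """Minimal sketch of the patch's idea: an ordered `gobject` list
    plus a lazily built lookup table (like runtime.gobject_hash),
    so linking does an O(1) membership test instead of a list scan."""

    def __init__(self):
        self.gobject = []          # ordered list, like collection->gobject
        self.gobject_hash = None   # lazily created lookup table

    def _gobject_hash_ensure(self):
        # Build the lookup table on first use only.
        if self.gobject_hash is None:
            self.gobject_hash = {ob: ob for ob in self.gobject}

    def link(self, ob):
        self._gobject_hash_ensure()
        if ob in self.gobject_hash:   # O(1) instead of an O(n) list scan
            return False              # already linked
        self.gobject.append(ob)
        self.gobject_hash[ob] = ob
        return True

    def unlink(self, ob):
        self._gobject_hash_ensure()
        if self.gobject_hash.pop(ob, None) is None:
            return False              # was never linked
        # Blender's gobject list is doubly linked, so removal there is
        # O(1) given the node; Python's list.remove() is only illustrative.
        self.gobject.remove(ob)
        return True
```

Linking N objects this way is O(N) overall rather than O(N²), which is the asymptotic change behind the reported speedup.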
Thanks for addressing the feedback; it is now much nicer!
Still some confusion about the consistency check. See the inlined comment.
@ -1072,2 +1118,3 @@
CollectionObject *cob = BLI_findptr(&collection->gobject, ob, offsetof(CollectionObject, ob));
collection_gobject_hash_ensure(collection);
BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID(collection);
Why do we need BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID after the collection_gobject_hash_ensure? And why is it needed here but not in the other use of collection_gobject_hash_ensure some lines below? I would actually move this check into collection_gobject_hash_ensure itself? Would be much less verbose, and make more sense to me?

Makes sense, moved into collection_gobject_hash_ensure.

Agree with @Sergey, also found some more points to address, as noted below.
Regarding the usage of this new hash in all types of Collection IDs (evaluated, non-main, etc.), I don't mind; if people think it may be useful for non-Main IDs as well, I am fine with it. It sure does make the code simpler.
@ -1074,0 +1119,4 @@
collection_gobject_hash_ensure(collection);
BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID(collection);
CollectionObject *cob = BLI_ghash_lookup(collection->runtime.gobject_hash, ob);
BLI_assert(cob == BLI_findptr(&collection->gobject, ob, offsetof(CollectionObject, ob)));
These asserts remove almost all benefit of using the mapping for debug builds... and are totally redundant if BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID is actually checking things? Same for the others below. So not sure we should keep them here.

Agree, at the time of writing I was eager to flag up any issues early, but this adds too much overhead, even for debug builds; removing the BLI_assert.

@ -1104,0 +1153,4 @@
BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID(collection);
collection_gobject_hash_ensure(collection);
CollectionObject *cob = BLI_ghash_lookup(collection->runtime.gobject_hash, ob);
Couldn't this use cob = BLI_ghash_popkey(collection->runtime.gobject_hash, ob, NULL); instead? Then there is no need for an extra lookup below in the call to collection_gobject_hash_remove_object(collection, ob);

Yep, good idea.
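In Python terms, the BLI_ghash_popkey suggestion corresponds to dict.pop(key, default), which finds and removes an entry in a single probe. A hedged analogy (the helper name here is hypothetical, not the actual patch code):

```python
def collection_remove_object(gobject_hash, ob):
    """Hypothetical analogue of BLI_ghash_popkey(hash, ob, NULL):
    one dict operation both looks up and removes the entry, so no
    second lookup is needed when the entry is later freed."""
    cob = gobject_hash.pop(ob, None)  # lookup + removal in one step
    if cob is None:
        return False  # `ob` was not linked to this collection
    # ... the caller would now unlink and free `cob` ...
    return True
```

This saves the extra hash probe that a separate lookup-then-remove sequence would perform.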
@ -1581,0 +1670,4 @@
}
/**
* Update the collections object hash the old reference to `ob_old` uses `cob->ob` as the key.
This is not a proper English sentence... ;)
@ -1581,0 +1685,4 @@
collection_gobject_hash_remove_object(collection, ob_old);
}
/* This may be set to NULL. */
if (ob_new) {
This may leak in case ob_new is NULL but cob is not, depending on calling code... You could make it clear in the doc that this function takes 'ownership' of the given cob, if any, and may free it if it cannot use it. But TBH, since this is addressing such a specific use-case, and with a very specific behavior, I would not even make a separate function, and would just add the code directly into collection_foreach_id. Or at the very least move this function above its caller, and rename it to something like collection_foreach_id_gobject_hash_update. I have a hard time seeing this very specific behavior used anywhere else than in the remapping use-case.

It's awkward either way: Sergey requested to move this into a utility function, and while it's nice to split out in some ways, it's rather specific functionality which is only used in one place. Renamed the function and removed the ob_new argument so cob->ob is always used.

Following up on discussion on D17016, where I noted:
Sergey replied:
After digging into the issue, here is a bug-report #104798 with a proposed fix !104801.
As far as I can see that issue can be addressed separately from this patch, since unnecessarily copying the collection is an issue irrespective of changes made in this patch (although creating the hash does add some overhead).
9a1a473ffc to 6432f129a9

6432f129a9 to ec811ae7b6
All issues should be addressed now.
@blender-bot build
Found some more possible tweaks and improvements...
Also wondering if it would be worth checking on BKE_collection_has_object_recursive and similar, now that we have a hash for objects in a collection. Not in the scope of this patch though.

@ -939,3 +979,3 @@
return false;
}
BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID(collection);
Would say this should also use collection_gobject_hash_ensure? While the overhead of initialization may be significant, this is often called in a collection manipulation context, where other calls (like object adding) will ensure the hash anyway... And then code is simpler, the call to BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID can be removed, etc.

@ -1062,6 +1105,8 @@ static Collection *collection_parent_editable_find_recursive(const ViewLayer *vi
static bool collection_object_add(
Main *bmain, Collection *collection, Object *ob, int flag, const bool add_us)
{
BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID(collection);
This call can be removed (see comment in collection_gobject_hash_ensure).

@ -1072,2 +1117,3 @@
CollectionObject *cob = BLI_findptr(&collection->gobject, ob, offsetof(CollectionObject, ob));
collection_gobject_hash_ensure(collection);
CollectionObject *cob = BLI_ghash_lookup(collection->runtime.gobject_hash, ob);
You can save a lookup here by using BLI_ghash_ensure_p instead... Since you know you'll always end up with a valid cob for the given ob key in that function.

@ -1101,3 +1149,3 @@
const bool free_us)
{
CollectionObject *cob = BLI_findptr(&collection->gobject, ob, offsetof(CollectionObject, ob));
BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID(collection);
This call can be removed (see comment in collection_gobject_hash_ensure).

@ -1222,2 +1273,2 @@
CollectionObject *cob = BLI_findptr(
&collection->gobject, ob_old, offsetof(CollectionObject, ob));
BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID(collection);
const bool use_hash_exists = collection->runtime.gobject_hash != NULL;
Not sure why we still have this actually? Since we agreed to allow this hash for all collections, we can as well call collection_gobject_hash_ensure here, remove the call to BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID, and also use BLI_ghash_ensure_p to only perform one lookup.

With you on the first two issues, but don't think BLI_ghash_ensure_p helps here, as the old pointer used as a key needs to be removed and the new one needs to be inserted again.

@ -1581,0 +1648,4 @@
static void collection_gobject_hash_ensure(Collection *collection)
{
if (collection->runtime.gobject_hash) {
return;
Should also call BLI_ASSERT_COLLECION_GOBJECT_HASH_IS_VALID here.

@ -153,1 +177,4 @@
BKE_LIB_FOREACHID_PROCESS_IDSUPER(data, cob->ob, IDWALK_CB_USER);
if (collection->runtime.gobject_hash && (cob_ob_old != cob->ob)) {
I would move the checks into collection_gobject_hash_update_object. At least the cob_ob_old != cob->ob one.

@ -519,2 +550,4 @@
}
/* This is being deleted, no need to handle each item.
* NOTE: While it might seem an advantage to use the hash instead of the list-lookup
Something seems to be missing here. The collection_object_remove ensures that the hash exists (the first thing it does is call collection_gobject_hash_ensure(collection)).

That's bad. I've split out the remove function so low-level code that already knows the CollectionObject doesn't need to do a lookup or ensure the hash exists.

@ -1581,0 +1681,4 @@
if (ob_old) {
collection_gobject_hash_remove_object(collection, ob_old);
}
/* This may be set to NULL. */
This comment does not really bring information. The fact that it may be NULL is clear from the NULL-pointer check. The more interesting information for a reader would be why CollectionObject can point to NULL. Intuitively those are malformed and can/should simply be removed. If it is allowed to be NULL, an explanation is needed (together with a comment in the typedef struct CollectionObject), and a lot of checks all over the code base would have to be added. The confusing part is that cob->ob is checked for NULL in SOME cases, inconsistently. P.S. Generally, avoid over-commenting, as it actually hurts readability.

Yep, expanded the comment to note this isn't simply a paranoid null check.
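The remapping behaviour discussed in this thread can be sketched in Python (hypothetical names, loosely mirroring the renamed collection_foreach_id_gobject_hash_update helper, not the actual C code): when ID remapping changes the object pointer stored in an entry, the hash key must follow it, and a cleared (None) pointer means the stale key is only removed.

```python
def gobject_hash_update(gobject_hash, cob, ob_old):
    """Keep `gobject_hash` keyed by the entry's *current* object
    pointer after a remap. `ob_old` was the previous key; cob["ob"]
    holds the new pointer, which may legitimately be None when the
    reference was cleared (not just a paranoid NULL check)."""
    if ob_old is not None:
        gobject_hash.pop(ob_old, None)   # drop the stale key
    if cob["ob"] is not None:
        gobject_hash[cob["ob"]] = cob    # re-insert under the new key
```

Treating None as "remove only" keeps the hash consistent with the list even when remapping nulls out a reference.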
@blender-bot build linux
36300def5e to bc06586a34
@ -1102,0 +1145,4 @@
* A version of #collection_object_remove that does not handle `collection->runtime.gobject_hash`,
* Either the caller must have removed the object from the hash or the hash may be NULL.
*
* \note in rare cases `cob->ob != ob`, so both arguments need to be passed in.
This is something very surprising to me. I am not sure why this is the case, but it definitely asks to be written down in the comment here. From a quick look around, collection_object_remove_no_gobject_hash is either called with ob == cob->ob, or with the result of a ghash lookup. The latter is done after collection_gobject_hash_ensure, which seems to assert that cob->ob is the same as ob. In general, always document such cases for the future reader. It saves a lot of time (for both the current review, and future you and future developers looking into the code).

This must have been a use-after-free mistake; double checked with the ob argument removed and the tests now pass.

@blender-bot build
Updated description: this patch no longer causes unlinking in-order to be 18% slower; it's now about the same speed.
Committed ea97bb1641.

Pull request closed