Fix versioning old 2.6 node groups, causing dangling link pointers #111704
No reviewers
Labels
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset Browser
Interest
Asset Browser Project Overview
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
2 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: blender/blender#111704
Loading…
Reference in New Issue
No description provided.
Delete Branch "LukasTonne/blender:fix-versioning-26-node-groups"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
In 2.6 the old method of using bNodeSocket lists in bNodeTree directly
as group sockets was replaced with new group input/output nodes. This
required versioning to create those input/output nodes and then redirect
links to the new node sockets. Because creating nodes relies heavily on
node typeinfo this versioning was done in the
_after_linking
sectionof the 2.6 versioning code, running after all other versioning
(including for much newer versions!) has already happended.
While typinfo is available at that point, doing such late versioning
causes severe problems when the data structure changes, as is the case
with the recent node panels patch (#111348). The new node group
interface also has versioning code for 4.0, but this runs before the
_after_linking
code for 2.6! Versioning for node panels expectssockets in bNodeTree to not have any links pointing at them, but this is
not true for old 2.6 files which have not yet been fully versioned at
that point, because of the late versioning stage. Subsequently 2.6
_after_linking
code crashes when trying to modify node links withdangling pointers.
The solution here is to move the old versioning code out of the
after_linking
stage to restore the expected versioning chain. Thisrequires creating nodes and node sockets without any typeinfo, but
luckily we only need to create simple known group input/output nodes
which don't have much complicated behavior.
Generally looks good. I think it would be good if we would construct these nodes directly as dna structs in versioning code more often instead of calling code from blenkernel. That always works fine for now, but is brittle in the long term..
Added some inline comments.
@ -2173,0 +2266,4 @@
/* Convert the previously used ntree->inputs/ntree->outputs lists to interface nodes.
* Pre 2.56.2 node trees automatically have all unlinked sockets exposed already,
* see do_versions_after_linking_250.
This comment hints at a potential problem, because now
do_versions_after_linking_250
also runs after this versioning code now. I don't even know what node systems we had pre 2.5 tbh.I've started adding a simple compatibility test, loading files from 2.5, 2.6, 3.6, etc (trying to cover major transitions in node code). Loading a simple 2.5 file already fails an assert in unrelated versioning code, because a node gets added to the runtime
nodes_by_id
set twice.cf12d11de6/source/blender/blenloader/intern/versioning_250.cc (L1899)
I hope there aren't too many other issues and i can actually test the
after_linking
parts. This should have been done years ago, but better late than never ...The offending
nodeUniqueName
was added byab4926bcff
but it re-generates the identifier for every socket in a node and adds the node tonodes_by_id
as many times.I find so many broken things with 2.5 nodes versioning, might be preferable to complete this PR for 2.6 compatibility and continue fixing 2.5 nodes separately.
Getting back to the original question: "Does this
after_linking
code still work after regular versioning to 4.0?"Yes! And in this case running in
after_linking
is necessary, because the socket identifier update looks at the node tree pointers of group nodes (i.e. other ID data blocks). It reads frombNodeSocket::own_index
(deprecated by 2.5.2) andbNodeSocket::identifier
(stable, existing socket identifiers never change). So this should be pretty safe to keep inafter_linking
.The question was mostly whether it's a problem that now the order of the versioning is reversed, since the after-linking code runs after the code that is moved in this patch (but it used to run before).
In #111800 i'm adding a compatibility regression test for node groups. The 3.6 case already required some minor fixes, i want to add a 2.5 test if feasible.
Took me a long time to unravel the timeline of node changes, here's what i reconstructed so far:
TL;DR: We need to update after_linking code to find group socket matches and preserve external links.
Before 2.56.2 all the unconnected internal sockets were exposed. To find the internal socket for a group node input or output one would search for the internal socket whose
own_index
matches the externalto_index
.Starting with 2.56.2 and before 2.66.3 this behavior changed and now node groups would only show an explicit set of inputs and outputs (stored in
bNodeTree
inputs
/outputs
). This required versioning totonode==nullptr
orfromnode==nullptr
which indicate a connection to the node tree interface.code link
All of this happens inside the same
bNodeTree
as part of regular versioning. Theafter_linking
versioning is needed for updating theown_index
of group nodes (code link):own_index
is still used to identify sockets at this point (replaced with identifiers later).to_index
gets deprecated, the socket in the node tree and its counterpart on a group node instance should simply have the sameown_index
. In order to version this, however, each group node needs to inspect its linked node group, which is not available during regular versioning.Note: Socket identifier strings are already generated by 2.56.2 versioning, but not sure if they would have been used for anything. This might be some apocryphal versioning code added later?
In 2.66.2+ there was another set of changes:
own_index
mapping is replaced with unique socket identifier strings code linknullptrs
allowed) code linkThe 2.66.3 versioning code was also running in the
after_linking
stage, but as demonstrated here this is only for the convenience ofbNodeSocketType
callbacks and not really necessary.What could happen if we execute the 2.56.2
after_linking
code after all the 4.0 versioning? Inside the node group all the unconnected sockets have been exposed first asntree->inputs
/ntree-outputs
with nullptr links, those have then been replaced with group input/output nodes and regular links, and then thentree->inputs
/ntree-outputs
have been replaced bytree_interface.ui_items
. Even after all this the deprecatedown_index
of the sockets in the tree that should be exposed to mimick 2.5 behavior still exist, which would make it possible to find internal sockets and update the external socket identifiers.However, the versioning code also expects there to be special links with node nullptrs now - so the 2.56.2
after_linking
versioning code already includes changes from 2.66.3 in order to function correctly. The special links have been replaced by versioning code, so this part is now ineffective and never runs.The consequence is that socket identifiers for node groups are not fully updated and some links get broken. This is because the identifiers of existing sockets are first generated by 2.66.2 versioning code (code link) but they are only based on names and unique suffixes. Sockets without a matching group tree socket are replaced later and any links discarded.
We need to update
do_versions_after_linking_250
to take this into account and search for matching internal socket like so:own_index
matches the group socket'sto_index
:There is a lingering question of what might happen if some intermediate versioning depends on correct group socket identifiers? But i think this is as far as we can reasonably support such ancient files.
Thanks for looking into this. I think it's ok when we are not able to load these ancient files perfectly. Would be nice, but it is what it is. If people really rely on those files, they may have to load and save the files in a more recent version. It's good to know why things are not working anyway. Also, it's great that you are adding some tests for this!
Do you happen to know when the node systems that we use nowadays have been introduced?
Depends on what you mean by "systems that we use nowadays". Oldest commit to the
nodes
folder is from 2007. From then on the nodes system has been a Ship of Theseus: 2.5 saw the change to explicit node group sockets, then group input/output nodes and python nodes in 2.6, and other parts of the system have been modified and swapped out over the years.I can't figure this out (in reasonable time and with acceptable sanity loss). I "fixed" the 2.5 after_linking code, but it still does not work because socket identifiers in the bNodeTree are changing so much that links cannot be restored from the 2.5 file.
do_versions_nodetree_customnodes
in versioning_260 to aFooBar.001
formatFooBar_001
formatFooBar.001
format in the after_linking codeFooBar_001
format (not sure where from)@ -2173,0 +2369,4 @@
FOREACH_NODETREE_END;
}
if (!MAIN_VERSION_FILE_ATLEAST(bmain, 280, 60)) {
This versioning looks like it's at the wrong place (see version number)
Correct. I considered moving it to the 280 file, and the first thing i saw at the bottom of the that file was a bit of versioning code for ... 3.0 :P
https://projects.blender.org/blender/blender/src/branch/main/source/blender/blenloader/intern/versioning_280.cc#L6300-L6324
I find it extremely difficult to reason about why something works or doesn't in nodes versioning, so i just left it next to the other bit for now ...
I see. It's not super horrible if it works. The code between here and where this versioning code would usually live is unlikely to change in major ways.
@blender-bot build
@blender-bot build