Fix #99887: Multires Unsubdivide freezes Blender with hidden geometry #121086

Merged
Raul Fernandez Hernandez merged 4 commits from farsthary/blender:multires_unsubdivide_freezes into main 2024-05-07 18:05:04 +02:00

Previously in

edge_y = edge_step(current_vertex_y, edge_y, &current_vertex_y);
while (!BM_elem_flag_test(current_vertex_y, BM_ELEM_TAG))

With hidden faces this code enters in an infinite loop by returning the same null
edge in edge_step() and the same current_vertex_y causing Blender to crash.

The current fix solves the infinite loop and provides a solution when faces are hidden.
This causes the hidden faces to be merged into one for unsubdivided per original dev's implementation.

At minimum the crash is identified and a solution proposed.
Attached is a video of the fix

Previously in ``` edge_y = edge_step(current_vertex_y, edge_y, &current_vertex_y); while (!BM_elem_flag_test(current_vertex_y, BM_ELEM_TAG)) ``` With hidden faces this code enters in an infinite loop by returning the same null edge in edge_step() and the same current_vertex_y causing Blender to crash. The current fix solves the infinite loop and provides a solution when faces are hidden. This causes the hidden faces to be merged into one for unsubdivided per original dev's implementation. At minimum the crash is identified and a solution proposed. Attached is a video of the fix
Raul Fernandez Hernandez added 2 commits 2024-04-25 18:42:15 +02:00
Iliya Katushenock added this to the Modeling project 2024-04-25 18:51:31 +02:00
Iliya Katushenock added the
Module
Sculpt, Paint & Texture
label 2024-04-25 18:51:36 +02:00
Raul Fernandez Hernandez requested review from Sergey Sharybin 2024-04-25 19:04:19 +02:00
Raul Fernandez Hernandez requested review from Hans Goudey 2024-04-25 19:04:49 +02:00
Raul Fernandez Hernandez self-assigned this 2024-04-25 19:04:58 +02:00
Raul Fernandez Hernandez requested review from Iliya Katushenock 2024-04-25 19:05:18 +02:00
Iliya Katushenock refused to review 2024-04-25 19:12:14 +02:00
Raul Fernandez Hernandez was unassigned by Iliya Katushenock 2024-04-25 19:12:18 +02:00
Member

The description was a bit confusing because the block of code isn't the previous state, it's the new state in the PR.

But the fix makes sense. To make things clear though, maybe it's best to remove the null check from edge_step. If I'm reading things right, the argument should always be non-null now.

The description was a bit confusing because the block of code isn't the previous state, it's the new state in the PR. But the fix makes sense. To make things clear though, maybe it's best to remove the null check from `edge_step`. If I'm reading things right, the argument should always be non-null now.
Author
Member

The description was a bit confusing because the block of code isn't the previous state, it's the new state in the PR.

But the fix makes sense. To make things clear though, maybe it's best to remove the null check from edge_step. If I'm reading things right, the argument should always be non-null now.

Thanks! The null check is necessary in the while loop and after since the function edge_step() will can return null and make the edge_y or edge_x null.
I've tried what you suggest and it crash later because of that. As is now works correctly.

> The description was a bit confusing because the block of code isn't the previous state, it's the new state in the PR. > > But the fix makes sense. To make things clear though, maybe it's best to remove the null check from `edge_step`. If I'm reading things right, the argument should always be non-null now. Thanks! The null check is necessary in the while loop and after since the function edge_step() will can return null and make the edge_y or edge_x null. I've tried what you suggest and it crash later because of that. As is now works correctly.
Member

I know it can return null, but there is a null check on the next iteration of the while loop that covers this. Unless I'm really missing something, It just seems wrong/weird to check for null on the condition of the while loop, then check again inside the loop itself. Where did it crash later on?

I know it can return null, but there is a null check on the next iteration of the while loop that covers this. Unless I'm really missing something, It just seems wrong/weird to check for null on the condition of the while loop, then check again inside the loop itself. Where did it crash later on?
Author
Member

I know it can return null, but there is a null check on the next iteration of the while loop that covers this. Unless I'm really missing something, It just seems wrong/weird to check for null on the condition of the while loop, then check again inside the loop itself. Where did it crash later on?

What I meant is that on the second null check,
only if edge_* is not null we should modify the returning values of r_corner_*
The while loop can run and modify internally the edge_* and make it null at the end of the steps
(that's why it was hanging on forever)
The first null check is to break the loop, and the second is to ensure no r_corner_x or r_corner_y is modified in such case.

> I know it can return null, but there is a null check on the next iteration of the while loop that covers this. Unless I'm really missing something, It just seems wrong/weird to check for null on the condition of the while loop, then check again inside the loop itself. Where did it crash later on? What I meant is that on the **second null check**, only if edge_* is **not** null we should modify the returning values of r_corner_* The while loop **can** run and modify internally the edge_* and make it null at the end of the steps (that's why it was hanging on forever) The first null check is to break the loop, and the second is to ensure no r_corner_x or r_corner_y is modified in such case.
Author
Member

@HooglyBoogly without the second null check it crashes on Debug only (on Release the BLI_abort() seems to be successfully ignored but still is a crash) it results in a vertex index to be -1 or something like that.
Adding the null check run successfully in both debug and release.
No worth saving a null check.

@HooglyBoogly without the second null check it crashes on Debug only (on Release the BLI_abort() seems to be successfully ignored but still is a crash) it results in a vertex index to be -1 or something like that. Adding the null check run successfully in both debug and release. No worth saving a null check.

I am a bit confused about the null-pointer discussion. Just to be sure we're talking about the same thing, this is the null-pointer check we are talking about, right?

diff --git a/source/blender/blenkernel/intern/multires_unsubdivide.cc b/source/blender/blenkernel/intern/multires_unsubdivide.cc
index fd27c74a85e..20babb06565 100644
--- a/source/blender/blenkernel/intern/multires_unsubdivide.cc
+++ b/source/blender/blenkernel/intern/multires_unsubdivide.cc
@@ -491,10 +491,6 @@ static BMEdge *edge_step(BMVert *v, BMEdge *edge, BMVert **r_next_vertex)
 {
   BMIter iter;
   BMEdge *test_edge;
-  if (edge == nullptr) {
-    (*r_next_vertex) = v;
-    return edge;
-  }
   (*r_next_vertex) = BM_edge_other_vert(edge, v);
   BM_ITER_ELEM (test_edge, &iter, (*r_next_vertex), BM_EDGES_OF_VERT) {
     if (!BM_edge_share_quad_check(test_edge, edge)) {

Is a bit hard to predict all consequences, as it is used from outside of the loops changed here. But i can not reproduce the crash reported by Raul when he removed the check.

Surely for the final state of code having this block removed is probably clear, but for getting there with a bug fixed, I suggest doing such cleanup as a separate commit after actual functional changes are done. It will make review and bisecting (if something goes wrong) easier.

For this PR it is much more important to tackle the multires_unsubdivide_get_grid_corners_on_base_mesh itself. With the proposed change it is possible that it leaves r_corner_{x, y} uninitialized, and the code which calls it expects both of the corners to be initialized:

      BMVert *corner_x, *corner_y;
      multires_unsubdivide_get_grid_corners_on_base_mesh(l->f, l->e, &corner_x, &corner_y);

This actually might be the reason of the crash Raul is getting? It kind of aligns with the code later const int corner_x_index = orig_to_base_vmap[BM_elem_index_get(corner_x)]; which will crash if corner is not initialized.

I am not fully sure sure what is expected to happen in the case multires_unsubdivide_get_grid_corners_on_base_mesh did not find one of the vertices. Let me know if you want extra eyes for figuring this out!

I am a bit confused about the null-pointer discussion. Just to be sure we're talking about the same thing, this is the null-pointer check we are talking about, right? ```Diff diff --git a/source/blender/blenkernel/intern/multires_unsubdivide.cc b/source/blender/blenkernel/intern/multires_unsubdivide.cc index fd27c74a85e..20babb06565 100644 --- a/source/blender/blenkernel/intern/multires_unsubdivide.cc +++ b/source/blender/blenkernel/intern/multires_unsubdivide.cc @@ -491,10 +491,6 @@ static BMEdge *edge_step(BMVert *v, BMEdge *edge, BMVert **r_next_vertex) { BMIter iter; BMEdge *test_edge; - if (edge == nullptr) { - (*r_next_vertex) = v; - return edge; - } (*r_next_vertex) = BM_edge_other_vert(edge, v); BM_ITER_ELEM (test_edge, &iter, (*r_next_vertex), BM_EDGES_OF_VERT) { if (!BM_edge_share_quad_check(test_edge, edge)) { ``` Is a bit hard to predict all consequences, as it is used from outside of the loops changed here. But i can not reproduce the crash reported by Raul when he removed the check. Surely for the final state of code having this block removed is probably clear, but for getting there with a bug fixed, I suggest doing such cleanup as a separate commit after actual functional changes are done. It will make review and bisecting (if something goes wrong) easier. For this PR it is much more important to tackle the `multires_unsubdivide_get_grid_corners_on_base_mesh` itself. With the proposed change it is possible that it leaves `r_corner_{x, y}` uninitialized, and the code which calls it expects both of the corners to be initialized: ```Cpp BMVert *corner_x, *corner_y; multires_unsubdivide_get_grid_corners_on_base_mesh(l->f, l->e, &corner_x, &corner_y); ``` This actually might be the reason of the crash Raul is getting? It kind of aligns with the code later `const int corner_x_index = orig_to_base_vmap[BM_elem_index_get(corner_x)];` which will crash if corner is not initialized. I am not fully sure sure what is expected to happen in the case `multires_unsubdivide_get_grid_corners_on_base_mesh` did not find one of the vertices. Let me know if you want extra eyes for figuring this out!
Author
Member

@Sergey I can 100% repro the Debug build crash with this file.
Steps:
1-compile blender Debug
2-open the file and press unsubdivide

Xcode will trigger a break at

@Sergey I can 100% repro the Debug build crash with this file. Steps: 1-compile blender Debug 2-open the file and press unsubdivide Xcode will trigger a break at

@farsthary Did you check if it's due to the unintiialized corner_x/corner_y after the call multires_unsubdivide_get_grid_corners_on_base_mesh() ?

@farsthary Did you check if it's due to the unintiialized `corner_x`/`corner_y` after the call `multires_unsubdivide_get_grid_corners_on_base_mesh()` ?
Author
Member

@farsthary Did you check if it's due to the unintiialized corner_x/corner_y after the call multires_unsubdivide_get_grid_corners_on_base_mesh() ?

With your proposed changes, removing the null check inside the function static BMEdge *edge_step(BMVert *v, BMEdge *edge, BMVert **r_next_vertex), we get a worse crash down the road because BM_edge_other_vert(edge, v) do not expect NULL edges.

I still vote for the original PR solution which is simple and gets the job done, 4 LOC.

> @farsthary Did you check if it's due to the unintiialized `corner_x`/`corner_y` after the call `multires_unsubdivide_get_grid_corners_on_base_mesh()` ? With your proposed changes, removing the null check inside the function `static BMEdge *edge_step(BMVert *v, BMEdge *edge, BMVert **r_next_vertex),` we get a worse crash down the road because `BM_edge_other_vert(edge, v)` do not expect NULL edges. I still vote for the original PR solution which is simple and gets the job done, 4 LOC.
Author
Member

Or better yet, either

  • Unsubdivide completely ignores hidden face state and process every face as visible,
    (because creating random N-gons will mess up the modeling pipeline later)

  • or Unsubdivide throws a warning message about hidden faces and do not proceed.

Or better yet, either - Unsubdivide completely ignores hidden face state and process every face as visible, (because creating random N-gons will mess up the modeling pipeline later) - or Unsubdivide throws a warning message about hidden faces and do not proceed.
Author
Member

But I think those should be subject of a separated PR, this one is better to keep it short and small about fixing the infinite loop.

But I think those should be subject of a separated PR, this one is better to keep it short and small about fixing the infinite loop.
Raul Fernandez Hernandez requested review from Jacques Lucke 2024-04-27 02:38:11 +02:00

@farsthary My suggestion was indeed to leave the null-pointer check removal to the separate PR. However, I think there is an use-of-uninitialized variable with this PR.

There is this code in the multires_unsubdivide_extract_grids:

      /* For each loop, get the two vertices that should map to the l+1 and l-1 vertices in the
       * base mesh of the face of grid that is going to be extracted. */
      BMVert *corner_x, *corner_y;
      multires_unsubdivide_get_grid_corners_on_base_mesh(l->f, l->e, &corner_x, &corner_y);

      /* Map the two obtained vertices to the base mesh. */
      const int corner_x_index = orig_to_base_vmap[BM_elem_index_get(corner_x)];
      const int corner_y_index = orig_to_base_vmap[BM_elem_index_get(corner_y)];

It clearly expects corner_x and corner_y to be initialized by multires_unsubdivide_get_grid_corners_on_base_mesh(). But with the proposed patch it is not guaranteed.

I am not really sure why ASAB builds do not catch any issues. This could be due to the fact that all allocations are done via memory pools, and maybe we do not poison it for Apple Clang or something. Or, maybe, it is because of the code being used from a loop, and those variables are always initialized to their previous value the issue does not manifest itself that much.

In any case, we should not rely on such behavior of uninitialized variables and handle it in a more explicit manner.

@farsthary My suggestion was indeed to leave the null-pointer check removal to the separate PR. However, I think there is an use-of-uninitialized variable with this PR. There is this code in the `multires_unsubdivide_extract_grids`: ```Cpp /* For each loop, get the two vertices that should map to the l+1 and l-1 vertices in the * base mesh of the face of grid that is going to be extracted. */ BMVert *corner_x, *corner_y; multires_unsubdivide_get_grid_corners_on_base_mesh(l->f, l->e, &corner_x, &corner_y); /* Map the two obtained vertices to the base mesh. */ const int corner_x_index = orig_to_base_vmap[BM_elem_index_get(corner_x)]; const int corner_y_index = orig_to_base_vmap[BM_elem_index_get(corner_y)]; ``` It clearly expects `corner_x` and `corner_y` to be initialized by `multires_unsubdivide_get_grid_corners_on_base_mesh()`. But with the proposed patch it is not guaranteed. I am not really sure why ASAB builds do not catch any issues. This could be due to the fact that all allocations are done via memory pools, and maybe we do not poison it for Apple Clang or something. Or, maybe, it is because of the code being used from a loop, and those variables are always initialized to their previous value the issue does not manifest itself that much. In any case, we should not rely on such behavior of uninitialized variables and handle it in a more explicit manner.
Raul Fernandez Hernandez changed title from Fix #99887: Multires Unsubdivide freezes Blender with hidden geoemtry to WIP: Fix #99887: Multires Unsubdivide freezes Blender with hidden geoemtry 2024-04-30 05:38:37 +02:00
Raul Fernandez Hernandez changed title from WIP: Fix #99887: Multires Unsubdivide freezes Blender with hidden geoemtry to WIP: Fix #99887: Multires Unsubdivide freezes Blender with hidden geometry 2024-04-30 05:38:49 +02:00
Raul Fernandez Hernandez added 1 commit 2024-04-30 22:08:00 +02:00
Hans Goudey requested changes 2024-04-30 22:10:11 +02:00
Dismissed
@ -1043,6 +1053,8 @@ static void multires_unsubdivide_extract_grids(MultiresUnsubdivideContext *conte
* base mesh of the face of grid that is going to be extracted. */
BMVert *corner_x, *corner_y;
multires_unsubdivide_get_grid_corners_on_base_mesh(l->f, l->e, &corner_x, &corner_y);
if (!corner_x || !corner_y)
Member

The style guide mentions always using braces for nested scopes: https://developer.blender.org/docs/handbook/guidelines/c_cpp/#braces

The style guide mentions always using braces for nested scopes: https://developer.blender.org/docs/handbook/guidelines/c_cpp/#braces
Author
Member

wow that's a fast review 😅

wow that's a fast review 😅
farsthary marked this conversation as resolved
Member

Typically "WIP" in the PR title means it isn't ready for review. I'm guessing that's not he case here?

Typically "WIP" in the PR title means it isn't ready for review. I'm guessing that's not he case here?
Author
Member

I'll remove the WIP in the next push

I'll remove the WIP in the next push
Raul Fernandez Hernandez added 1 commit 2024-04-30 22:14:39 +02:00
Raul Fernandez Hernandez changed title from WIP: Fix #99887: Multires Unsubdivide freezes Blender with hidden geometry to Fix #99887: Multires Unsubdivide freezes Blender with hidden geometry 2024-04-30 22:15:08 +02:00
Author
Member

Ready for a final review, hopefully we can get this infinite loop hang crossed over since that's a severe bug out in the open for users.

Ready for a final review, hopefully we can get this infinite loop hang crossed over since that's a severe bug out in the open for users.

The code seems to make sense to me now, thanks for the update :)
Did you ever hear back from Julien or Daniel what the expected behavior is?

The code seems to make sense to me now, thanks for the update :) Did you ever hear back from Julien or Daniel what the expected behavior is?
Author
Member

Thanks @Sergey
I haven't heard from them yet. Worst case if the expected behavior is to ignore hidden faces or halt the unsubdivide operator with a warning we can make it a new PR.
After 1 or 2 approvals I can proceed to merge this.

Thanks @Sergey I haven't heard from them yet. Worst case if the expected behavior is to ignore hidden faces or halt the unsubdivide operator with a warning we can make it a new PR. After 1 or 2 approvals I can proceed to merge this.
Sergey Sharybin approved these changes 2024-05-07 17:33:40 +02:00
Sergey Sharybin left a comment
Owner

I think I forgot to click the approve button, and share some further insights.

Basically, as I've mentioned the code looks fine now, in a sense that now the algorithm handles corner cases much better. Would be cool to agree with artists from the module about what the ideal solution would be, but until then I'm happy with a solution which avoids crashes/dead-locks.Even if the result is not what artists expect, it is much better than potentially loosing data.

As a follow-up to this PR, we should be able to remove the null-pointer check from the edge_step, as we've discussed previously. Is more like a reminder to not forget about it.

I think I forgot to click the approve button, and share some further insights. Basically, as I've mentioned the code looks fine now, in a sense that now the algorithm handles corner cases much better. Would be cool to agree with artists from the module about what the ideal solution would be, but until then I'm happy with a solution which avoids crashes/dead-locks.Even if the result is not what artists expect, it is much better than potentially loosing data. As a follow-up to this PR, we should be able to remove the null-pointer check from the `edge_step`, as we've discussed previously. Is more like a reminder to not forget about it.
Hans Goudey approved these changes 2024-05-07 17:44:24 +02:00
Raul Fernandez Hernandez merged commit 00e2f55239 into main 2024-05-07 18:05:04 +02:00
Raul Fernandez Hernandez deleted branch multires_unsubdivide_freezes 2024-05-07 18:05:07 +02:00
Sign in to join this conversation.
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset System
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Asset Browser Project
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: blender/blender#121086
No description provided.