Creating and removing many objects very quickly causes a crash #84397

Closed
opened 2 years ago by oweissbarth · 51 comments

System Information
Operating system: Linux-5.10.4-arch2-1-x86_64-with-arch 64 Bits
Graphics card: GeForce GTX 1080 Ti/PCIe/SSE2 NVIDIA Corporation 4.5.0 NVIDIA 455.45.01

Blender Version
Broken: b71eb3a105
Worked: 82645ff739

Short description of error
Creating and removing many objects very quickly causes a crash. I created a .blend file to illustrate the issue. The script in the file implements a simple operator that adds a specified number of objects (set through an operator property) to the active collection.
Changing the number of objects sometimes causes a crash. I found the crash to be more likely the more objects are added. For the default count of 800 objects it crashes reliably.

Exact steps for others to reproduce the error

  • Run the script in the attached .blend
  • Run the causecrash Operator
  • In the operator properties drag the count property to the left to reduce the number of objects
  • Crash

Things I tested

  • --debug --debug-depsgraph-no-threads --threads 1 -> Still crashes
  • Legacy Undo -> Does not crash

I bisected the history and identified b71eb3a105 as the first bad commit.
The crash happens because subdiv_ccg is 0x1

#0  0x000000000355b98d in BKE_subdiv_ccg_destroy (subdiv_ccg=0x1) at /home/oliver/code/blender/blender/source/blender/blenkernel/intern/subdiv_ccg.c:626
#1  0x00000000034759be in BKE_mesh_runtime_clear_geometry (mesh=0x7fffcb4f8e38) at /home/oliver/code/blender/blender/source/blender/blenkernel/intern/mesh_runtime.c:235
#2  0x0000000003475212 in BKE_mesh_runtime_clear_cache (mesh=0x7fffcb4f8e38) at /home/oliver/code/blender/blender/source/blender/blenkernel/intern/mesh_runtime.c:87
#3  0x0000000003461aa3 in mesh_free_data (id=0x7fffcb4f8e38) at /home/oliver/code/blender/blender/source/blender/blenkernel/intern/mesh.c:158
#4  0x0000000003441af9 in BKE_libblock_free_datablock (id=0x7fffcb4f8e38, UNUSED_flag=0) at /home/oliver/code/blender/blender/source/blender/blenkernel/intern/lib_id_delete.c:81
#5  0x000000000cd317e4 in blender::deg::deg_free_copy_on_write_datablock (id_cow=0x7fffcb4f8e38)
    at /home/oliver/code/blender/blender/source/blender/depsgraph/intern/eval/deg_eval_copy_on_write.cc:1073
#6  0x000000000cd4a6f6 in blender::deg::DepsgraphNodeBuilder::~DepsgraphNodeBuilder (this=0x7fffccb16d00, __in_chrg=<optimized out>)
    at /home/oliver/code/blender/blender/source/blender/depsgraph/intern/builder/deg_builder_nodes.cc:146
#7  0x000000000cd4a79c in blender::deg::DepsgraphNodeBuilder::~DepsgraphNodeBuilder (this=0x7fffccb16d00, __in_chrg=<optimized out>)
    at /home/oliver/code/blender/blender/source/blender/depsgraph/intern/builder/deg_builder_nodes.cc:151
#8  0x000000000cd2ae4e in std::default_delete<blender::deg::DepsgraphNodeBuilder>::operator() (this=0x7fffffffd610, __ptr=0x7fffccb16d00) at /usr/include/c++/10.2.0/bits/unique_ptr.h:85
#9  0x000000000cd2abba in std::unique_ptr<blender::deg::DepsgraphNodeBuilder, std::default_delete<blender::deg::DepsgraphNodeBuilder> >::~unique_ptr (this=0x7fffffffd610, 
    __in_chrg=<optimized out>) at /usr/include/c++/10.2.0/bits/unique_ptr.h:361
#10 0x000000000cd2a83f in blender::deg::AbstractBuilderPipeline::build_step_nodes (this=0x7fffffffd680)
    at /home/oliver/code/blender/blender/source/blender/depsgraph/intern/builder/pipeline.cc:74
#11 0x000000000cd2a650 in blender::deg::AbstractBuilderPipeline::build (this=0x7fffffffd680) at /home/oliver/code/blender/blender/source/blender/depsgraph/intern/builder/pipeline.cc:55
#12 0x000000000ccfe1ee in DEG_graph_build_from_view_layer (graph=0x7ffff352b838) at /home/oliver/code/blender/blender/source/blender/depsgraph/intern/depsgraph_build.cc:228
#13 0x000000000ccfe5b9 in DEG_graph_relations_update (graph=0x7ffff352b838) at /home/oliver/code/blender/blender/source/blender/depsgraph/intern/depsgraph_build.cc:281
#14 0x00000000035351fe in scene_graph_update_tagged (depsgraph=0x7ffff352b838, bmain=0x7fffccc73838, only_if_tagged=false)
    at /home/oliver/code/blender/blender/source/blender/blenkernel/intern/scene.c:2603
#15 0x00000000035352de in BKE_scene_graph_update_tagged (depsgraph=0x7ffff352b838, bmain=0x7fffccc73838) at /home/oliver/code/blender/blender/source/blender/blenkernel/intern/scene.c:2649
#16 0x00000000039b0e76 in wm_event_do_depsgraph (C=0x7ffff7061ef8, is_after_open_file=false) at /home/oliver/code/blender/blender/source/blender/windowmanager/intern/wm_event_system.c:364
#17 0x00000000039b0f78 in wm_event_do_refresh_wm_and_depsgraph (C=0x7ffff7061ef8) at /home/oliver/code/blender/blender/source/blender/windowmanager/intern/wm_event_system.c:389
#18 0x00000000039b1838 in wm_event_do_notifiers (C=0x7ffff7061ef8) at /home/oliver/code/blender/blender/source/blender/windowmanager/intern/wm_event_system.c:571
#19 0x00000000039ac826 in WM_main (C=0x7ffff7061ef8) at /home/oliver/code/blender/blender/source/blender/windowmanager/intern/wm.c:638
#20 0x00000000033c17e1 in main (argc=7, argv=0x7fffffffdb78) at /home/oliver/code/blender/blender/source/creator/creator.c:522


{[F9551145](https://archive.blender.org/developer/F9551145/many_object_update_crash.blend)}{[F9551152](https://archive.blender.org/developer/F9551152/blender_debug_output.txt)}[many_object_update_crash.crash.txt](https://archive.blender.org/developer/F9551151/many_object_update_crash.crash.txt)
Poster

Added subscriber: @oweissbarth

JulianEisel was assigned by oweissbarth 2 years ago
rjg commented 2 years ago
Collaborator

Added subscriber: @rjg

rjg commented 2 years ago
Collaborator

Changed status from 'Needs Triage' to: 'Confirmed'

rjg commented 2 years ago
Collaborator

I'm not certain if the commit actually introduced the bug or only made an underlying problem apparent in the undo system or dependency graph (e.g. similar to #80203 which could only be reproduced on macOS). I can reproduce the crash on Windows.

rjg commented 2 years ago
Collaborator

@oweissbarth Are you able to reproduce this in a debug build with ASAN on Linux, because I haven't been able to?

Poster

I can confirm that it does not crash with ASAN on Linux.

Collaborator

Added subscriber: @LazyDodo

Collaborator

looks like heap corruption to MSVC but the issue goes away as soon as you add any kind of heap validation, so it's a race-y kind of corruption? neat!

Unhandled exception at 0x00007FFBE222F0F9 (ntdll.dll) in blender.exe: 0xC0000374: A heap has been corrupted (parameters: 0x00007FFBE22987F0).

my stack is radically different from the one in the opening post though.

 	ntdll.dll!RtlReportCriticalFailure()	Unknown
 	ntdll.dll!RtlpHeapHandleError()	Unknown
 	ntdll.dll!RtlpHpHeapHandleError()	Unknown
 	ntdll.dll!RtlpLogHeapFailure()	Unknown
 	ntdll.dll!RtlpLowFragHeapAllocFromContext()	Unknown
 	ntdll.dll!RtlpAllocateHeapInternal()	Unknown
 	ucrtbase.dll!_malloc_base()	Unknown
>	blender.exe!MEM_lockfree_mallocN(unsigned __int64 len, const unsigned char * str) Line 276	C

 	blender.exe!get_bhead(FileData * fd) Line 837	C
 	[Inline Frame] blender.exe!blo_bhead_next(FileData *) Line 914	C
 	[Inline Frame] blender.exe!read_file_dna(FileData *) Line 1028	C
 	blender.exe!blo_decode_and_check(FileData * fd, ReportList * reports) Line 1286	C
 	blender.exe!BLO_read_from_memfile(Main * oldmain, const unsigned char * filename, MemFile * memfile, const BlendFileReadParams * params, ReportList * reports) Line 421	C
 	blender.exe!BKE_blendfile_read_from_memfile(bContext * C, MemFile * memfile, const BlendFileReadParams * params, ReportList * reports) Line 512	C
 	blender.exe!BKE_memfile_undo_decode(MemFileUndoData * mfu, const int undo_direction, const bool use_old_bmain_data, bContext * C) Line 91	C
 	blender.exe!memfile_undosys_step_decode(bContext * C, Main * bmain, UndoStep * us_p, int undo_direction, bool UNUSED_is_final) Line 197	C
 	blender.exe!undosys_step_decode(bContext * C, Main * bmain, UndoStack * ustack, UndoStep * us, int dir, bool is_final) Line 214	C
 	blender.exe!BKE_undosys_step_undo_with_data_ex(UndoStack * ustack, bContext * C, UndoStep * us, bool use_skip) Line 707	C
 	blender.exe!ed_undo_step_impl(bContext * C, int step, const unsigned char * undoname, int undo_index, ReportList * reports) Line 245	C
 	[Inline Frame] blender.exe!ed_undo_step_by_name(bContext *) Line 320	C
 	[Inline Frame] blender.exe!ED_undo_pop_op(bContext *) Line 369	C
 	blender.exe!ED_undo_operator_repeat(bContext * C, wmOperator * op) Line 646	C
 	blender.exe!ui_apply_but_funcs_after(bContext * C) Line 968	C
 	blender.exe!ui_handler_region_menu(bContext * C, const wmEvent * event, void * UNUSED_userdata) Line 10851	C
 	[Inline Frame] blender.exe!wm_handler_ui_call(bContext *) Line 643	C
 	blender.exe!wm_handlers_do_intern(bContext * C, wmEvent * event, ListBase * handlers) Line 2778	C
 	blender.exe!wm_handlers_do(bContext * C, wmEvent * event, ListBase * handlers) Line 2889	C
 	blender.exe!wm_event_do_handlers(bContext * C) Line 3312	C
 	blender.exe!WM_main(bContext * C) Line 638	C
 	blender.exe!main(int argc, const unsigned char * * UNUSED_argv_c) Line 532	C
 	[External Code]	
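
For readers unfamiliar with the exception code above: 0xC0000374 is the NTSTATUS value Windows names STATUS_HEAP_CORRUPTION. A minimal sketch of how such a code splits into its bit fields follows; the helper name `decode_ntstatus` is made up for illustration and is not part of Blender or any Windows API.

```python
def decode_ntstatus(status: int) -> dict:
    """Split a 32-bit NTSTATUS value into its bit fields.

    Hypothetical helper for illustration only.
    """
    return {
        # Bits 31-30: 0 = success, 1 = informational, 2 = warning, 3 = error.
        "severity": (status >> 30) & 0x3,
        # Bit 29: 0 for Microsoft-defined codes, 1 for customer-defined codes.
        "customer": (status >> 29) & 0x1,
        # Bits 27-16: facility that raised the status.
        "facility": (status >> 16) & 0xFFF,
        # Bits 15-0: the status code itself.
        "code": status & 0xFFFF,
    }


# Exception code taken from the MSVC message quoted above.
fields = decode_ntstatus(0xC0000374)
print({name: hex(value) for name, value in fields.items()})
```

The severity field comes out as "error", which fits the process being terminated instead of the failure being reported and skipped.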
ankitm commented 2 years ago
Collaborator

Added subscriber: @ankitm

ankitm commented 2 years ago
Collaborator

> Are you able to reproduce this in a debug build with ASAN on Linux, because I haven't been able to?

Even the macOS one #80203 goes away if asan is enabled.

Collaborator

Added subscriber: @JacquesLucke

Collaborator

I couldn't figure out the root cause yet. However, I have some more information.

I get the same backtrace as @oweissbarth (subdiv_ccg was 0x1 for me as well).
Also I was able to reproduce the issue reliably in b71eb3a105, but not in the commit before that.
I was able to track down what change in that commit is responsible for breaking the given example file: 16 new bytes have been added to ID.
If I check out the commit before b71eb3a105, the test file works fine initially.
When I now apply this diff, the test file starts to fail.

diff --git a/source/blender/makesdna/DNA_ID.h b/source/blender/makesdna/DNA_ID.h
index f2d860a2851..c320128a48a 100644
--- a/source/blender/makesdna/DNA_ID.h
+++ b/source/blender/makesdna/DNA_ID.h
@@ -310,6 +310,7 @@ typedef struct ID {
   struct ID *orig_id;
 
   void *py_instance;
+  char some_data[16];
 } ID;
 
 /**

Instead of adding these bytes to ID I could also add them anywhere in the Mesh struct. Just adding 8 bytes was not enough. I tested both cases many times, and it was a very reliable way to reproduce the crash, though only in release builds.
Unfortunately, while interesting, this information is not enough to fix the bug yet.
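
To illustrate what the extra 16 bytes change, here is a rough sketch using Python's ctypes. The structs are hypothetical, heavily reduced stand-ins (only the two ID fields visible in the diff, plus one invented `runtime_ptr` member); they only show that padding ID shifts every later member of a struct that embeds it, and grows that struct by the same amount. This does not explain the crash, it only shows the layout effect being toggled.

```python
import ctypes


class ID(ctypes.Structure):
    # Hypothetical reduced stand-in: only the two fields visible in the diff.
    _fields_ = [
        ("orig_id", ctypes.c_void_p),
        ("py_instance", ctypes.c_void_p),
    ]


class IDPadded(ctypes.Structure):
    # Same as ID, plus the 16 test bytes from the diff.
    _fields_ = [
        ("orig_id", ctypes.c_void_p),
        ("py_instance", ctypes.c_void_p),
        ("some_data", ctypes.c_char * 16),
    ]


class Mesh(ctypes.Structure):
    # Data-block structs start with an embedded ID; runtime_ptr is an
    # invented placeholder for "some member that follows it".
    _fields_ = [("id", ID), ("runtime_ptr", ctypes.c_void_p)]


class MeshPadded(ctypes.Structure):
    _fields_ = [("id", IDPadded), ("runtime_ptr", ctypes.c_void_p)]


# How far the member after the ID moved, and how much the struct grew.
shift = MeshPadded.runtime_ptr.offset - Mesh.runtime_ptr.offset
size_growth = ctypes.sizeof(MeshPadded) - ctypes.sizeof(Mesh)
print(f"member offset shift: {shift} bytes, struct growth: {size_growth} bytes")
```

A different allocation size can land in a different allocator bucket or next to different neighbours, which may be why a purely cosmetic size change exposes or hides the corruption; that part is speculation, not something the sketch demonstrates.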

I also tried creating lights and cameras instead of meshes in the test file, and was able to get similar but slightly different crashes. I didn't have enough time to bisect this issue in older commits yet, might do it tomorrow.

Furthermore, I wondered if there is maybe some offsetof call that is not recompiled when certain parts of dna change. That was not the case though, this should have been fixed by a clean compilation, but it wasn't.

Collaborator

also small update on my end, the crash i'm seeing on windows appears to be a different one? I can repro it in all hashes mentioned in this ticket including the one listed as working.

JulianEisel was unassigned by JacquesLucke 2 years ago
Collaborator

Added subscriber: @JulianEisel

Collaborator

I just noticed that it might be related to rBL62457, which also isn't very useful...
I checked out 9db4e44961 + one of the patches below. This compiles with rBL62457 (broken) and rBL62402 (works). Maybe some cmake options have to be disabled if it does not work immediately.
Interestingly, the crash is still caused by subdiv_ccg being invalid, so the crash is very predictable.
Another weird thing is that when P1868 is applied, subdiv_ccg will always be 0x1.
When I apply P1869 instead, subdiv_ccg will always be 0x200200002002.
I don't know how that is possible. But I can reproduce this every time.
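
As a side note on why both values crash in a destroy function: neither is null, so a plain null guard lets them through, yet neither looks like a real heap address. A small sketch of that plausibility check follows. The helper names are made up, and the 8-byte alignment and 47-bit user address space are assumptions about a typical x86-64 Linux process, not facts taken from this report.

```python
def passes_null_guard(value: int) -> bool:
    """Mimics a C-style `if (ptr != NULL)` check."""
    return value != 0


def looks_like_heap_pointer(value: int, alignment: int = 8) -> bool:
    """Heuristic only: non-null, aligned, and inside the assumed 47-bit user space."""
    return value != 0 and value % alignment == 0 and value < (1 << 47)


# Values copied from the backtrace and from the two patch experiments.
observed = {
    "mesh (from backtrace)": 0x7FFFCB4F8E38,
    "subdiv_ccg with P1868": 0x1,
    "subdiv_ccg with P1869": 0x200200002002,
}

results = {name: looks_like_heap_pointer(value) for name, value in observed.items()}
for name, value in observed.items():
    print(f"{name}: {value:#x} null-guard={passes_null_guard(value)} plausible={results[name]}")
```

Both subdiv_ccg values pass the null guard but fail the alignment test, which fits the theory that the field was overwritten with unrelated data instead of being left dangling.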

It would be interesting to see if @oweissbarth can reproduce my findings on his machine.

My system:
Operating system: Linux-5.9.14-arch1-1-x86_64-with-arch 64 Bits
Graphics card: AMD Radeon RX 5700 (NAVI10, DRM 3.39.0, 5.9.14-arch1-1, LLVM 11.0.0) AMD 4.6 (Core Profile) Mesa 20.3.1

I'm removing @JulianEisel as assignee, because while his commit introduced the error, the issue seems to be somewhere else.

Unfortunately, I don't know how to investigate any further currently. And while I found some interesting stuff, I'm not sure if this will actually be useful to solve the core issue.
The issue found by @LazyDodo looks quite different indeed. I have no idea if they are related, but it could well be, somehow. @LazyDodo, were you able to find the oldest commit that contains the error?

Collaborator

Yes, b852db57ba, which is not terribly useful in tracking down the origin of the corruption. The issues seem "different" yet somehow connected, this is fun :)

Poster

I tested it and i can confirm your findings.

| Blender revision | lib revision | patch | result | output |
| -- | -- | -- | -- | -- |
| 9db4e44961 | rBL62457 | P1868 | crashing | 0x1 |
| 9db4e44961 | rBL62457 | P1869 | crashing | 0x200200002002 |
| 9db4e44961 | rBL62402 | P1868 | not crashing | |
| 9db4e44961 | rBL62402 | P1869 | not crashing | |

I also noticed that the crash does not always happen in BKE_subdiv_ccg_destroy. I also got crashes in IDP_foreach_property (less often).

Collaborator

> I tested it and i can confirm your findings.

Great thanks, that's good to know! Still don't know how to continue from here..

Collaborator

P1870: (An Untitled Masterwork)

import bpy

class CauseCrashOperator(bpy.types.Operator):
    bl_idname = "object.causecrash"
    bl_label = "causecrash"
    bl_options = {'REGISTER', 'UNDO'}

    count : bpy.props.IntProperty(name="count", default=800)

    def execute(self, context):
        for i in range(self.count):
            mesh = bpy.data.meshes.new("myobj")
            obj = bpy.data.objects.new("myobj", mesh)

            context.collection.objects.link(obj)
        return {'FINISHED'}

def register():
    bpy.utils.register_class(CauseCrashOperator)

register()

print("1")
bpy.ops.ed.undo_push()
print("2")
bpy.ops.object.causecrash(count=800)
print("3")
bpy.ops.ed.undo()
print("4")
bpy.ops.object.causecrash(count=800)
print("5")

This makes it crash reliably for me rather than jumping all over the place. If you take out the context.collection.objects.link(obj) line the crash goes away, so it kinda feels like there's definitely an issue there. Is it the same issue we have been chasing? No idea! Could be something unrelated... or not...

Collaborator

Added subscriber: @Sergey

Collaborator

some good news some bad news

bad : the script above seems to be a different issue than what i had been chasing before
good : I managed to capture the heap corruption
bad : Beyond a rough indication of where the corruption is occurring, I'm not any closer to understanding it

so here we go :)

I finally managed to capture a crash and did a quick diagnostic of the heap corruption using windbg's timetravel feature (imagine a debugger where you can step backwards and forwards)

The process terminates after RtlpLowFragHeapAllocFromContext detects a corrupted heap and calls RtlpLogHeapFailure to report it

 # RetAddr               : Args to Child                                                           : Call Site
00 00007ff9`aaaef140     : 00007ff9`aab400f4 000000b8`70bfe740 00000000`00000000 00000000`00000000 : ntdll!NtTerminateProcess+0x12
01 00007ff9`aaa7bb16     : 00007ff9`aab6e26c 00007ff9`aa9f0000 000000b8`70bfd880 00007ff9`aaa1225b : ntdll!RtlReportFatalFailure$filt$0+0x3f
02 00007ff9`aaa9130f     : 00000000`00000000 000000b8`70bfdd80 000000b8`70bfe7c0 00000000`00000000 : ntdll!_C_specific_handler+0x96
03 00007ff9`aaa3b5e4     : 00000000`00000000 000000b8`70bfdd80 000000b8`70bfe7c0 00000000`00000001 : ntdll!RtlpExecuteHandlerForException+0xf
04 00007ff9`aaa3b335     : 00000000`00000000 000000b8`70bfe600 00000000`00000000 000000b8`70bfdf90 : ntdll!RtlDispatchException+0x244
05 00007ff9`aaaef0f9     : 00000000`00000000 00000000`c0000374 00000000`00000001 000000b8`70beb000 : ntdll!RtlRaiseException+0x185
06 00007ff9`aaaef0c3     : 00007ff9`aab365b4 000000b8`70bffa30 000000b8`0000005e 00000000`0000005e : ntdll!RtlReportFatalFailure+0x9
07 00007ff9`aaaf7e42     : 1c46f348`00000000 00007ff9`aab587f0 00000000`0000000f 0000029f`e20c0000 : ntdll!RtlReportCriticalFailure+0x97
08 00007ff9`aaaf812a     : 00000000`0000000f 00000000`000005d0 0000029f`e20c0000 00007ff9`aa9fc8ec : ntdll!RtlpHeapHandleError+0x12
09 00007ff9`aaafdd61     : 0000029f`8eb6d740 00000000`0000008d 0000029f`e2292700 0000029f`87a45010 : ntdll!RtlpHpHeapHandleError+0x7a
0a 00007ff9`aaa9dc5f     : 00000000`00580057 00000000`0000008d 0000029f`fafec820 00000000`02000002 : ntdll!RtlpLogHeapFailure+0x45
0b 00007ff9`aaa06e2c     : 0000029f`e2290000 0000029f`00000000 00000000`000005c8 00000000`00000000 : ntdll!RtlpLowFragHeapAllocFromContext+0x9660f
0c 00007ff9`a813f706     : 00000000`00000000 00000000`000005c8 00000000`00000032 00000000`00000000 : ntdll!RtlpAllocateHeapInternal+0x12c
0d 00007ff6`bc8e4e34     : 00000000`000005c0 00000000`00000000 0000029f`8f1d1268 0000029f`faf93158 : ucrtbase!_malloc_base+0x36
0e 00007ff6`bc0cb6b2     : 0000029f`f2a2b388 00000000`00000588 000000b8`70bfebd0 00000000`00000000 : blender!MEM_lockfree_mallocN+0x24 [K:\BlenderGit\blender\intern\guardedalloc\intern\mallocn_lockfree_impl.c @ 276] 
0f 00007ff6`bc0c8439     : 0000029f`ffbbdf88 00000000`00000000 0000029f`8eb6c038 000000b8`70bfed00 : blender!get_bhead+0x342 [K:\BlenderGit\blender\source\blender\blenloader\intern\readfile.c @ 837] 
10 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!blo_bhead_next+0x11 [K:\BlenderGit\blender\source\blender\blenloader\intern\readfile.c @ 914] 
11 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!read_file_dna+0x8c [K:\BlenderGit\blender\source\blender\blenloader\intern\readfile.c @ 1028] 
12 00007ff6`bc0d5fd9     : 000000b8`70bfed00 0000029f`8f1d1268 000000b8`70bfed00 0000029f`8f1d1278 : blender!blo_decode_and_check+0xb9 [K:\BlenderGit\blender\source\blender\blenloader\intern\readfile.c @ 1286] 
13 00007ff6`bbedea8c     : 0000029f`8f1d1268 00000000`00000000 000000b8`70bfed00 0000029f`ffbbdf88 : blender!BLO_read_from_memfile+0x39 [K:\BlenderGit\blender\source\blender\blenloader\intern\readblenentry.c @ 421] 
14 00007ff6`bc993687     : 00000000`ffffffff 00000000`02001000 0000029f`e20c7048 00000000`00000001 : blender!BKE_blendfile_read_from_memfile+0x4c [K:\BlenderGit\blender\source\blender\blenkernel\intern\blendfile.c @ 512] 
15 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : blender!BKE_memfile_undo_decode+0x97 [K:\BlenderGit\blender\source\blender\blenkernel\intern\blender_undo.c @ 91] 

So let's set a breakpoint on RtlpLogHeapFailure and run backwards until we hit it:

bp ntdll!RtlpLogHeapFailure
g-

A few instruction steps back gets us to this bit of code in RtlpLowFragHeapAllocFromContext:

00007ff9`aaa07888 330db24e1500         xor     ecx, dword ptr [ntdll!RtlpLFHKey (00007ff9`aab5c740)]
00007ff9`aaa0788e 8bc1                 mov     eax, ecx
00007ff9`aaa07890 0fb7d9               movzx   ebx, cx
00007ff9`aaa07893 c1e810               shr     eax, 10h
00007ff9`aaa07896 410fafc0             imul    eax, r8d
00007ff9`aaa0789a 4903c7               add     rax, r15
00007ff9`aaa0789d 4803d8               add     rbx, rax
00007ff9`aaa078a0 f6430f3f             test    byte ptr [rbx+0Fh], 3Fh
00007ff9`aaa078a4 0f8588630900         jne     ntdll!RtlpLowFragHeapAllocFromContext+0x965e2 (00007ff9`aaa9dc32) [br=1] // Jump to setup for RtlpLogHeapFailure
00007ff9`aaa078aa 837c243000           cmp     dword ptr [rsp+30h], 0

The byte at rbx+0Fh is tested against the mask 3Fh. Here the result is non-zero, so the value is not deemed sane, the jne is taken and we terminate. All right, let's see what's there:

0:000> db rbx+0xf
0000029f`8eb6d74f  ff ff ff ff ff ff ff ff-ff ff ff ff ff 00 00 00  ................
0000029f`8eb6d75f  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
0000029f`8eb6d76f  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
0000029f`8eb6d77f  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
0000029f`8eb6d78f  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
0000029f`8eb6d79f  00 00 00 00 00 00 00 00-00 00 00 80 3f 00 00 80  ............?...
0000029f`8eb6d7af  3f 00 00 80 3f 03 00 00-60 92 0a 06 3f 00 00 00  ?...?...`...?...
0000029f`8eb6d7bf  00 00 00 00 00 cd cc cc-3d 00 00 00 00 00 00 00  ........=.......
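
As a minimal sketch of what the `test` / `jne` pair above decides, the check can be mirrored in a few lines of Python. The function name `takes_failure_branch` is made up for illustration; only the mask `3Fh` and the byte values `0xff` and `0x80` come from the listing and the dump.

```python
# Mask taken from `test byte ptr [rbx+0Fh], 3Fh` in the disassembly above.
LOW_SIX_BITS = 0x3F


def takes_failure_branch(byte_value: int) -> bool:
    """Mirror `test` + `jne`: the jump is taken when (byte & mask) != 0.

    Hypothetical helper for illustration, not part of ntdll or Blender.
    """
    return (byte_value & LOW_SIX_BITS) != 0


observed = 0xFF  # first byte of the dump at 0000029f`8eb6d74f
print(hex(observed & LOW_SIX_BITS), takes_failure_branch(observed))

# A byte with only the top bit set leaves the low six bits clear,
# so the same check would not take the failure branch.
print(takes_failure_branch(0x80))
```

With `0xff` in that byte every one of the low six bits is set, so the jump to the RtlpLogHeapFailure setup is taken.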

Well, that's awesome, but how did that 0xff get there? Given that we can see through time, that's not too hard a question to answer:

0:000> dx -g @$cursession.TTD.Memory(0x0000029f8eb6d74f,0x0000029f8eb6d750, "w")
=================================================================================================================================================================================================================================================================
=          = (+) EventType = (+) ThreadId = (+) UniqueThreadId = (+) TimeStart  = (+) SystemTimeStart             = (+) TimeEnd    = (+) SystemTimeEnd               = (+) AccessType = (+) IP            = (+) Address      = (+) Size = (+) Value             =
=================================================================================================================================================================================================================================================================
= [0x0]    - 0x1           - 0x5750       - 0x2                - 316748:78C     - January 7, 2021 01:13:34.706    - 316748:78C     - January 7, 2021 01:13:34.706    - Write          - 0x7ff9aa9fc5ef    - 0x29f8eb6d74c    - 0x4      - 0x0                   =
= [0x1]    - 0x1           - 0x5750       - 0x2                - 316748:78D     - January 7, 2021 01:13:34.706    - 316748:78D     - January 7, 2021 01:13:34.706    - Write          - 0x7ff9aa9fc5f6    - 0x29f8eb6d74c    - 0x4      - 0x5c00                =
= [0x2]    - 0x1           - 0x5750       - 0x2                - 316748:78F     - January 7, 2021 01:13:34.706    - 316748:78F     - January 7, 2021 01:13:34.706    - Write          - 0x7ff9aa9fc5fd    - 0x29f8eb6d74f    - 0x1      - 0x80                  =
= [0x3]    - 0x1           - 0x5750       - 0x2                - 318A06:11E     - .. 00:00:00.0                   - 318A06:11E     - .. 00:00:00.0                   - Write          - 0x7ff9aaa07a47    - 0x29f8eb6d74f    - 0x1      - 0xbf                  =
= [0x4]    - 0x1           - 0x5750       - 0x2                - 526B8F:2       - January 7, 2021 01:14:05.451    - 526B8F:2       - January 7, 2021 01:14:05.451    - Write          - 0x7ff9aaa95ec3    - 0x29f8eb6d74f    - 0x1      - 0x80                  =
= [0x5]    - 0x1           - 0x5750       - 0x2                - 57E13F:ED0     - January 7, 2021 01:14:08.576    - 57E13F:ED0     - January 7, 2021 01:14:08.576    - Write          - 0x7ff98e7c16e9    - 0x29f8eb6d74f    - 0x1      - 0x0                   =
= [0x6]    - 0x1           - 0x5750       - 0x2                - 57E13F:15FD    - January 7, 2021 01:14:08.576    - 57E13F:15FD    - January 7, 2021 01:14:08.576    - Write          - 0x7ff98e7c12de    - 0x29f8eb6d74f    - 0x1      - 0xff                  =
= [0x7]    - 0x1           - 0x5750       - 0x2                - 57E147:5CC     - January 7, 2021 01:14:08.576    - 57E147:5CC     - January 7, 2021 01:14:08.576    - Write          - 0x7ff98e7c16e9    - 0x29f8eb6d74f    - 0x1      - 0x0                   =
= [0x8]    - 0x1           - 0x5750       - 0x2                - 57E147:60E     - January 7, 2021 01:14:08.576    - 57E147:60E     - January 7, 2021 01:14:08.576    - Write          - 0x7ff6bc88de90    - 0x29f8eb6d74c    - 0x4      - 0xffffffff            =
= [0x9]    - 0x1           - 0x5750       - 0x2                - 57E147:714     - January 7, 2021 01:14:08.576    - 57E147:714     - January 7, 2021 01:14:08.576    - Write          - 0x7ff6bbf07588    - 0x29f8eb6d748    - 0x8      - 0xffffffffffffffff    =
=================================================================================================================================================================================================================================================================
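
To make the table easier to read, here is a rough sketch that replays the rows (indices 0x0 through 0x9) and works out which value each write leaves in the single byte at `0x29f8eb6d74f`. The addresses, sizes and values are copied from the table; the helper `byte_at_target` is a made-up name, and the only assumption is that multi-byte values are stored little-endian, as on x64.

```python
TARGET = 0x29F8EB6D74F  # the byte tested by `test byte ptr [rbx+0Fh], 3Fh`

# (index, address, size, value) copied from the TTD.Memory table above.
WRITES = [
    (0x0, 0x29F8EB6D74C, 4, 0x0),
    (0x1, 0x29F8EB6D74C, 4, 0x5C00),
    (0x2, 0x29F8EB6D74F, 1, 0x80),
    (0x3, 0x29F8EB6D74F, 1, 0xBF),
    (0x4, 0x29F8EB6D74F, 1, 0x80),
    (0x5, 0x29F8EB6D74F, 1, 0x0),
    (0x6, 0x29F8EB6D74F, 1, 0xFF),
    (0x7, 0x29F8EB6D74F, 1, 0x0),
    (0x8, 0x29F8EB6D74C, 4, 0xFFFFFFFF),
    (0x9, 0x29F8EB6D748, 8, 0xFFFFFFFFFFFFFFFF),
]


def byte_at_target(address: int, size: int, value: int) -> int:
    """Return the byte this write stores at TARGET (little-endian layout)."""
    offset = TARGET - address
    assert 0 <= offset < size, "write does not cover TARGET"
    return value.to_bytes(size, "little")[offset]


history = [byte_at_target(a, s, v) for _, a, s, v in WRITES]
for (index, *_), stored in zip(WRITES, history):
    print(f"write {index:#x}: byte at target = {stored:#04x}")
print(len(WRITES), "writes; final value", hex(history[-1]))
```

Read this way, the last two rows are the ones that leave `0xff` in the byte, and both have an instruction pointer inside the blender module rather than ntdll.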

10 writes in total (rows 0x0 through 0x9), let's look at the stacks.

Writes 0/1/2/3 (table rows 0x0 to 0x3): DEG allocates, and RtlpAllocateHeapInternal seemingly sets some housekeeping variables.

0:000> kb
 # RetAddr               : Args to Child                                                           : Call Site
00 00007ff9`aaa07f5e     : 00000000`00000000 0000029f`fafec820 0000029f`fafec820 00000000`000005d0 : ntdll!RtlpSubSegmentInitialize+0xed
01 00007ff9`aaa06e2c     : 0000029f`e2290000 0000029f`00000000 00000000`00000590 0000029f`00000008 : ntdll!RtlpLowFragHeapAllocFromContext+0x90e
02 00007ff9`a813d65e     : 00000000`00000000 00000000`00000590 00000000`00000000 00000000`00000000 : ntdll!RtlpAllocateHeapInternal+0x12c
03 00007ff6`bc8e4b7b     : 00000000`00000001 0000029f`fed976b0 00000000`00000000 00007ff6`bc856a66 : ucrtbase!_calloc_base+0x4e
04 00007ff6`bc85e9d7     : 0000029f`8ea720f8 0000029f`8ea720f8 0000029f`f2a293c8 0000029f`fed976b0 : blender!MEM_lockfree_callocN+0x2b [K:\BlenderGit\blender\intern\guardedalloc\intern\mallocn_lockfree_impl.c @ 235] 
05 00007ff6`bc84eadd     : 0000029f`f2a25328 0000029f`f2a25328 00000000`00000000 00000000`00000000 : blender!blender::deg::IDNode::init_copy_on_write+0x57 [K:\BlenderGit\blender\source\blender\depsgraph\intern\node\deg_node_id.cc @ 115] 
06 00007ff6`bc86cf81     : 00000000`00000000 0000029f`fed97688 00000000`00000000 00000000`00000001 : blender!blender::deg::Depsgraph::add_id_node+0x5d [K:\BlenderGit\blender\source\blender\depsgraph\intern\depsgraph.cc @ 128] 
07 00007ff6`bc86f86a     : 0000029f`fac0a270 0000029f`fed97688 0000029f`fac0a270 000000b8`70bff531 : blender!blender::deg::DepsgraphNodeBuilder::add_id_node+0x111 [K:\BlenderGit\blender\source\blender\depsgraph\intern\builder\deg_builder_nodes.cc @ 169] 
08 00007ff6`bc87a61b     : 00000000`00000020 00007ff6`bc21714c 0000029f`fac0a270 00000000`00000000 : blender!blender::deg::DepsgraphNodeBuilder::build_object+0xda [K:\BlenderGit\blender\source\blender\depsgraph\intern\builder\deg_builder_nodes.cc @ 584] 
09 00007ff6`bc869d76     : 000000b8`70bff6d0 000000b8`70bff6d0 0000029f`ff5d2588 0000029f`f2a25328 : blender!blender::deg::DepsgraphNodeBuilder::build_view_layer+0xab [K:\BlenderGit\blender\source\blender\depsgraph\intern\builder\deg_builder_nodes_view_layer.cc @ 118] 
0a (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!blender::deg::AbstractBuilderPipeline::build_step_nodes+0x28 [K:\BlenderGit\blender\source\blender\depsgraph\intern\builder\pipeline.cc @ 76] 
0b 00007ff6`bc84fc09     : 0000029f`fac0a270 00000000`00000000 00000000`00000000 00007ff6`bc9d31b8 : blender!blender::deg::AbstractBuilderPipeline::build+0x56 [K:\BlenderGit\blender\source\blender\depsgraph\intern\builder\pipeline.cc @ 55] 
0c (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!DEG_graph_build_from_view_layer+0x17 [K:\BlenderGit\blender\source\blender\depsgraph\intern\depsgraph_build.cc @ 228] 
0d 00007ff6`bbed5718     : 0000029f`f2a293c8 00007ff6`bbe9ab31 00000000`00000000 000000b8`70bff7a0 : blender!DEG_graph_relations_update+0x39 [K:\BlenderGit\blender\source\blender\depsgraph\intern\depsgraph_build.cc @ 281] 
0e 00007ff6`bbe1ecb4     : 00000000`00000000 0000029f`e20c7048 0000029f`f2a25328 00000000`00000000 : blender!scene_graph_update_tagged+0xa8 [K:\BlenderGit\blender\source\blender\blenkernel\intern\scene.c @ 2607] 
0f (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!wm_event_do_depsgraph+0xf6 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 364] 
10 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!wm_event_do_refresh_wm_and_depsgraph+0x158 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 389] 
11 00007ff6`bbe095c8     : 0000029f`e20c7048 000000b8`70bff9a8 00000000`00000000 0000029f`e20d90b0 : blender!wm_event_do_notifiers+0x644 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 571] 
12 00007ff6`bbe051ef     : 0000029f`e20c7048 00000000`00000000 00000000`00000000 00000000`00000001 : blender!WM_main+0x28 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm.c @ 641] 
13 00007ff6`bc973554     : 00000000`00000001 00000000`00000000 0000029f`e20d90b0 00000000`00000000 : blender!main+0x39f [K:\BlenderGit\blender\source\creator\creator.c @ 527] 
14 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!invoke_main+0x22 [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 78] 
15 00007ff9`a9d17034     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : blender!__scrt_common_main_seh+0x10c [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288] 
16 00007ff9`aaa3d0d1     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14
17 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21

Write 4 (table row 0x4): DepsgraphNodeBuilder's dtor frees some RAM, and the heap updates some of its housekeeping flags. Seems fair.

0:000> kb
 # RetAddr               : Args to Child                                                           : Call Site
00 00007ff9`aaa05d21     : 00000000`00000000 0000029f`e20c0000 00000000`00000001 00000000`00000000 : ntdll!RtlpFreeHeapInternal+0x8cd63
01 00007ff9`a813e97b     : 00000000`000002b3 0000029f`87d8e030 0000029f`87d8e030 0000029f`f292e770 : ntdll!RtlFreeHeap+0x51
02 00007ff6`bc86c5f1     : 00000000`000002b3 00007ff6`00000000 0000029f`f292e738 0000029f`8f1d1268 : ucrtbase!_free_base+0x1b
03 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!blender::deg::DepsgraphNodeBuilder::{dtor}+0x64 [K:\BlenderGit\blender\source\blender\depsgraph\intern\builder\deg_builder_nodes.cc @ 147] 
04 00007ff6`bc869d96     : 000000b8`70bff6d0 0000029f`8f1d1268 0000029f`f2a25328 0000029f`f2a25328 : blender!blender::deg::DepsgraphNodeBuilder::`scalar deleting destructor'+0x81
05 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!std::default_delete<blender::deg::DepsgraphNodeBuilder>::operator()+0xa [C:\Program Files (x86)\Microsoft Visual Studio\2019\Preview\VC\Tools\MSVC\14.28.29617\include\memory @ 3285] 
06 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!std::unique_ptr<blender::deg::DepsgraphNodeBuilder,std::default_delete<blender::deg::DepsgraphNodeBuilder> >::{dtor}+0x14 [C:\Program Files (x86)\Microsoft Visual Studio\2019\Preview\VC\Tools\MSVC\14.28.29617\include\memory @ 3395] 
07 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!blender::deg::AbstractBuilderPipeline::build_step_nodes+0x48 [K:\BlenderGit\blender\source\blender\depsgraph\intern\builder\pipeline.cc @ 78] 
08 00007ff6`bc84fc09     : 0000029f`fac09730 00000000`00000000 00000000`00000000 00007ff6`bc9d31b8 : blender!blender::deg::AbstractBuilderPipeline::build+0x76 [K:\BlenderGit\blender\source\blender\depsgraph\intern\builder\pipeline.cc @ 56] 
09 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!DEG_graph_build_from_view_layer+0x17 [K:\BlenderGit\blender\source\blender\depsgraph\intern\depsgraph_build.cc @ 228] 
0a 00007ff6`bbed5718     : 0000029f`8fb0dd98 00007ff6`bbe9ab31 00000000`00000000 000000b8`70bff7a0 : blender!DEG_graph_relations_update+0x39 [K:\BlenderGit\blender\source\blender\depsgraph\intern\depsgraph_build.cc @ 281] 
0b 00007ff6`bbe1ecb4     : 00000000`00000000 0000029f`e20c7048 0000029f`f2a25328 00000000`00000000 : blender!scene_graph_update_tagged+0xa8 [K:\BlenderGit\blender\source\blender\blenkernel\intern\scene.c @ 2607] 
0c (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!wm_event_do_depsgraph+0xf6 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 364] 
0d (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!wm_event_do_refresh_wm_and_depsgraph+0x158 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 389] 
0e 00007ff6`bbe095c8     : 0000029f`e20c7048 000000b8`70bff9a8 00000000`00000000 0000029f`e20d90b0 : blender!wm_event_do_notifiers+0x644 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 571] 
0f 00007ff6`bbe051ef     : 0000029f`e20c7048 00000000`00000000 00000000`00000000 00000000`00000001 : blender!WM_main+0x28 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm.c @ 641] 
10 00007ff6`bc973554     : 00000000`00000001 00000000`00000000 0000029f`e20d90b0 00000000`00000000 : blender!main+0x39f [K:\BlenderGit\blender\source\creator\creator.c @ 527] 
11 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!invoke_main+0x22 [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 78] 
12 00007ff9`a9d17034     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : blender!__scrt_common_main_seh+0x10c [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288] 
13 00007ff9`aaa3d0d1     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14
14 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21

Writes 5/6/7/8/9 (table rows 0x5 to 0x9): we copy some data to it? That is odd, since this was clearly an internal housekeeping area of the heap, not an area we ought to be writing to. The address is written multiple times in various stages of BKE_id_copy_ex. This is the stack from write 9, but all of them originate from BKE_id_copy_ex.

0:000> kb
 # RetAddr               : Args to Child                                                           : Call Site
00 00007ff6`bbf069a5     : 0000029f`f2a25328 0000029f`8f2e5318 00000000`00000000 00000000`00000000 : blender!CustomData_update_typemap+0xd8 [K:\BlenderGit\blender\source\blender\blenkernel\intern\customdata.c @ 2073] 
01 00007ff6`bbf4bf99     : 000000b8`70bfd830 0000029f`8eb6d688 00078208`040b1c00 0000029f`8eb6d188 : blender!CustomData_merge+0x235 [K:\BlenderGit\blender\source\blender\blenkernel\intern\customdata.c @ 2171] 
02 00007ff6`bbecb8ac     : 0000029f`8eb6d188 00007ff6`bd408cd0 000000b8`70bfd2f8 000000b8`70bfd330 : blender!mesh_copy_data+0x139 [K:\BlenderGit\blender\source\blender\blenkernel\intern\mesh.c @ 132] 
03 00007ff6`bc85c8aa     : 0000029f`8eb6d188 0000029f`8eb6d188 0000029f`8f2e5318 00000000`00000000 : blender!BKE_id_copy_ex+0xbc [K:\BlenderGit\blender\source\blender\blenkernel\intern\lib_id.c @ 605] 
04 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!blender::deg::?A0xef58984b::id_copy_inplace_no_main+0x47 [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval_copy_on_write.cc @ 304] 
05 00007ff6`bc85cc0a     : 0000029f`f2a25328 0000029f`8f2e5318 00000000`00000000 00000000`00000000 : blender!blender::deg::deg_expand_copy_on_write_datablock+0x26a [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval_copy_on_write.cc @ 899] 
06 00007ff6`bc85c629     : 00000000`0000424f 0000029f`8eda80b0 0000029f`8f2e5318 0000029f`8eda8f78 : blender!blender::deg::deg_update_copy_on_write_datablock+0x9a [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval_copy_on_write.cc @ 954] 
07 00007ff6`bc86aff4     : 0000029f`8eda8f78 00000000`00000000 00000000`00000000 00007ff6`bc8e4e34 : blender!blender::deg::deg_evaluate_copy_on_write+0x49 [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval_copy_on_write.cc @ 1089] 
08 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!std::_Func_class<void,Depsgraph *>::operator()+0x1c [C:\Program Files (x86)\Microsoft Visual Studio\2019\Preview\VC\Tools\MSVC\14.28.29617\include\functional @ 986] 
09 00007ff6`bc86af48     : 0000029f`f2a25328 000000b8`70bff760 0000029f`f2a25328 0000029f`901aa578 : blender!blender::deg::`anonymous namespace'::evaluate_node+0x84 [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval.cc @ 115] 
0a 00007ff6`bc8bd797     : 0000029f`8fbd9d20 00000000`00000000 000000b8`70bff7b8 00007ff6`bc8bd4ea : blender!blender::deg::`anonymous namespace'::deg_task_run_func+0x28 [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval.cc @ 127] 
0b (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!Task::operator()+0xa [K:\BlenderGit\blender\source\blender\blenlib\intern\task_pool.cc @ 120] 
0c (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!tbb_task_pool_run+0x55 [K:\BlenderGit\blender\source\blender\blenlib\intern\task_pool.cc @ 226] 
0d (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!tbb_task_pool_work_and_wait+0x90 [K:\BlenderGit\blender\source\blender\blenlib\intern\task_pool.cc @ 239] 
0e 00007ff6`bc86ac7a     : 0000029f`901aa578 00000000`00000000 000000b8`70bff7b8 0000029f`f2a25328 : blender!BLI_task_pool_work_and_wait+0xd7 [K:\BlenderGit\blender\source\blender\blenlib\intern\task_pool.cc @ 499] 
0f 00007ff6`bbed5841     : 0000029f`8fb0dd98 00007ff6`bbe9ab31 00000000`00000000 000000b8`70bff7a0 : blender!blender::deg::deg_evaluate_on_refresh+0x1ca [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval.cc @ 391] 
10 00007ff6`bbe1ecb4     : 00000000`00000000 0000029f`e20c7048 0000029f`f2a25328 00000000`00000000 : blender!scene_graph_update_tagged+0x1d1 [K:\BlenderGit\blender\source\blender\blenkernel\intern\scene.c @ 2612] 
11 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!wm_event_do_depsgraph+0xf6 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 364] 
12 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!wm_event_do_refresh_wm_and_depsgraph+0x158 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 389] 
13 00007ff6`bbe095c8     : 0000029f`e20c7048 000000b8`70bff9a8 00000000`00000000 0000029f`e20d90b0 : blender!wm_event_do_notifiers+0x644 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 571] 
14 00007ff6`bbe051ef     : 0000029f`e20c7048 00000000`00000000 00000000`00000000 00000000`00000001 : blender!WM_main+0x28 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm.c @ 641] 
15 00007ff6`bc973554     : 00000000`00000001 00000000`00000000 0000029f`e20d90b0 00000000`00000000 : blender!main+0x39f [K:\BlenderGit\blender\source\creator\creator.c @ 527] 
16 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : blender!invoke_main+0x22 [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 78] 
17 00007ff9`a9d17034     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : blender!__scrt_common_main_seh+0x10c [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288] 
18 00007ff9`aaa3d0d1     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14
19 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21

So I think it's safe to say the DEG "sometimes" touches RAM it shouldn't be touching, but I can't say I have any more insight into why it is doing that, or why the regular tools like ASan or the page heap make the problem go away...

@sergey any clever ideas how to tackle this one? 

Studio\2019\Preview\VC\Tools\MSVC\14.28.29617\include\memory @ 3395] 07 (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!blender::deg::AbstractBuilderPipeline::build_step_nodes+0x48 [K:\BlenderGit\blender\source\blender\depsgraph\intern\builder\pipeline.cc @ 78] 08 00007ff6`bc84fc09 : 0000029f`fac09730 00000000`00000000 00000000`00000000 00007ff6`bc9d31b8 : blender!blender::deg::AbstractBuilderPipeline::build+0x76 [K:\BlenderGit\blender\source\blender\depsgraph\intern\builder\pipeline.cc @ 56] 09 (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!DEG_graph_build_from_view_layer+0x17 [K:\BlenderGit\blender\source\blender\depsgraph\intern\depsgraph_build.cc @ 228] 0a 00007ff6`bbed5718 : 0000029f`8fb0dd98 00007ff6`bbe9ab31 00000000`00000000 000000b8`70bff7a0 : blender!DEG_graph_relations_update+0x39 [K:\BlenderGit\blender\source\blender\depsgraph\intern\depsgraph_build.cc @ 281] 0b 00007ff6`bbe1ecb4 : 00000000`00000000 0000029f`e20c7048 0000029f`f2a25328 00000000`00000000 : blender!scene_graph_update_tagged+0xa8 [K:\BlenderGit\blender\source\blender\blenkernel\intern\scene.c @ 2607] 0c (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!wm_event_do_depsgraph+0xf6 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 364] 0d (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!wm_event_do_refresh_wm_and_depsgraph+0x158 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 389] 0e 00007ff6`bbe095c8 : 0000029f`e20c7048 000000b8`70bff9a8 00000000`00000000 0000029f`e20d90b0 : blender!wm_event_do_notifiers+0x644 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 571] 0f 00007ff6`bbe051ef : 0000029f`e20c7048 00000000`00000000 00000000`00000000 00000000`00000001 : blender!WM_main+0x28 
[K:\BlenderGit\blender\source\blender\windowmanager\intern\wm.c @ 641] 10 00007ff6`bc973554 : 00000000`00000001 00000000`00000000 0000029f`e20d90b0 00000000`00000000 : blender!main+0x39f [K:\BlenderGit\blender\source\creator\creator.c @ 527] 11 (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!invoke_main+0x22 [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 78] 12 00007ff9`a9d17034 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : blender!__scrt_common_main_seh+0x10c [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288] 13 00007ff9`aaa3d0d1 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14 14 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21 ``` frame 5/6/7/8/9: we copy some data to it? which is odd, since this was clearly in an internal house keeping area of the heap, not an area we ought to be writing to, it writes to this address multiple times in various stages of BKE_id_copy_ex this is the stack from frame 9 but all originate from `BKE_id_copy_ex` ``` 0:000> kb # RetAddr : Args to Child : Call Site 00 00007ff6`bbf069a5 : 0000029f`f2a25328 0000029f`8f2e5318 00000000`00000000 00000000`00000000 : blender!CustomData_update_typemap+0xd8 [K:\BlenderGit\blender\source\blender\blenkernel\intern\customdata.c @ 2073] 01 00007ff6`bbf4bf99 : 000000b8`70bfd830 0000029f`8eb6d688 00078208`040b1c00 0000029f`8eb6d188 : blender!CustomData_merge+0x235 [K:\BlenderGit\blender\source\blender\blenkernel\intern\customdata.c @ 2171] 02 00007ff6`bbecb8ac : 0000029f`8eb6d188 00007ff6`bd408cd0 000000b8`70bfd2f8 000000b8`70bfd330 : blender!mesh_copy_data+0x139 [K:\BlenderGit\blender\source\blender\blenkernel\intern\mesh.c @ 132] 03 00007ff6`bc85c8aa : 0000029f`8eb6d188 0000029f`8eb6d188 0000029f`8f2e5318 00000000`00000000 : 
blender!BKE_id_copy_ex+0xbc [K:\BlenderGit\blender\source\blender\blenkernel\intern\lib_id.c @ 605] 04 (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!blender::deg::?A0xef58984b::id_copy_inplace_no_main+0x47 [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval_copy_on_write.cc @ 304] 05 00007ff6`bc85cc0a : 0000029f`f2a25328 0000029f`8f2e5318 00000000`00000000 00000000`00000000 : blender!blender::deg::deg_expand_copy_on_write_datablock+0x26a [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval_copy_on_write.cc @ 899] 06 00007ff6`bc85c629 : 00000000`0000424f 0000029f`8eda80b0 0000029f`8f2e5318 0000029f`8eda8f78 : blender!blender::deg::deg_update_copy_on_write_datablock+0x9a [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval_copy_on_write.cc @ 954] 07 00007ff6`bc86aff4 : 0000029f`8eda8f78 00000000`00000000 00000000`00000000 00007ff6`bc8e4e34 : blender!blender::deg::deg_evaluate_copy_on_write+0x49 [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval_copy_on_write.cc @ 1089] 08 (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!std::_Func_class<void,Depsgraph *>::operator()+0x1c [C:\Program Files (x86)\Microsoft Visual Studio\2019\Preview\VC\Tools\MSVC\14.28.29617\include\functional @ 986] 09 00007ff6`bc86af48 : 0000029f`f2a25328 000000b8`70bff760 0000029f`f2a25328 0000029f`901aa578 : blender!blender::deg::`anonymous namespace'::evaluate_node+0x84 [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval.cc @ 115] 0a 00007ff6`bc8bd797 : 0000029f`8fbd9d20 00000000`00000000 000000b8`70bff7b8 00007ff6`bc8bd4ea : blender!blender::deg::`anonymous namespace'::deg_task_run_func+0x28 [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval.cc @ 127] 0b (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!Task::operator()+0xa 
[K:\BlenderGit\blender\source\blender\blenlib\intern\task_pool.cc @ 120] 0c (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!tbb_task_pool_run+0x55 [K:\BlenderGit\blender\source\blender\blenlib\intern\task_pool.cc @ 226] 0d (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!tbb_task_pool_work_and_wait+0x90 [K:\BlenderGit\blender\source\blender\blenlib\intern\task_pool.cc @ 239] 0e 00007ff6`bc86ac7a : 0000029f`901aa578 00000000`00000000 000000b8`70bff7b8 0000029f`f2a25328 : blender!BLI_task_pool_work_and_wait+0xd7 [K:\BlenderGit\blender\source\blender\blenlib\intern\task_pool.cc @ 499] 0f 00007ff6`bbed5841 : 0000029f`8fb0dd98 00007ff6`bbe9ab31 00000000`00000000 000000b8`70bff7a0 : blender!blender::deg::deg_evaluate_on_refresh+0x1ca [K:\BlenderGit\blender\source\blender\depsgraph\intern\eval\deg_eval.cc @ 391] 10 00007ff6`bbe1ecb4 : 00000000`00000000 0000029f`e20c7048 0000029f`f2a25328 00000000`00000000 : blender!scene_graph_update_tagged+0x1d1 [K:\BlenderGit\blender\source\blender\blenkernel\intern\scene.c @ 2612] 11 (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!wm_event_do_depsgraph+0xf6 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 364] 12 (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!wm_event_do_refresh_wm_and_depsgraph+0x158 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 389] 13 00007ff6`bbe095c8 : 0000029f`e20c7048 000000b8`70bff9a8 00000000`00000000 0000029f`e20d90b0 : blender!wm_event_do_notifiers+0x644 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm_event_system.c @ 571] 14 00007ff6`bbe051ef : 0000029f`e20c7048 00000000`00000000 00000000`00000000 00000000`00000001 : blender!WM_main+0x28 [K:\BlenderGit\blender\source\blender\windowmanager\intern\wm.c @ 641] 15 
00007ff6`bc973554 : 00000000`00000001 00000000`00000000 0000029f`e20d90b0 00000000`00000000 : blender!main+0x39f [K:\BlenderGit\blender\source\creator\creator.c @ 527] 16 (Inline Function) : --------`-------- --------`-------- --------`-------- --------`-------- : blender!invoke_main+0x22 [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 78] 17 00007ff9`a9d17034 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : blender!__scrt_common_main_seh+0x10c [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288] 18 00007ff9`aaa3d0d1 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14 19 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21``` So I think it's safe to say the DEG "sometimes" touches ram it shouldn't be touching, but i can't say i have any more insights into why it is doing that or why the regular tools like asan or the page heap make the problem go away... @sergey any clever ideas how to tackle this one?
Collaborator

Alright, now that the "where" is known (BKE_id_copy_ex), drawing the issue out of the shadows is rather straightforward:

  1. Apply the patch below.
  2. Reproduce with the .blend from the opening post.
  3. Observe the output: 000001E46BFF37C8 allocated 1416 clearing 1792
  4. That's... not... good.

The only real mystery is why ASan is not picking up on this. I'm somewhat out of my depth with this ID management code, so I'll leave fixing the bug for someone else.

diff --git a/source/blender/blenkernel/intern/lib_id.c b/source/blender/blenkernel/intern/lib_id.c
index be7ce34f7e6..c2970d09dd8 100644
--- a/source/blender/blenkernel/intern/lib_id.c
+++ b/source/blender/blenkernel/intern/lib_id.c
@@ -570,6 +570,10 @@ ID *BKE_id_copy_ex(Main *bmain, const ID *id, ID **r_newid, const int flag)
     if (newid != NULL) {
       /* Allow some garbage non-initialized memory to go in, and clean it up here. */
       const size_t size = BKE_libblock_get_alloc_info(GS(id->name), NULL);
+      const size_t block_size = ((size_t *)newid)[-1]; // magic, don't ask :)
+      if (size > block_size ) {
+        printf("%p allocated %d clearing %d\n", newid, (int)block_size, (int)size);
+      }
       memset(newid, 0, size);
     }
   }
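The check in the patch relies on the allocator storing the block length directly before the pointer it returns. A minimal sketch of that idea follows, using a toy allocator; `toy_calloc`, `toy_block_size`, `toy_free` and `clear_would_overrun` are hypothetical names for illustration and are not Blender's guardedalloc API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Toy allocator (hypothetical, not Blender's MEM_* API): the requested
// length is stored in a header placed directly before the user pointer.
void *toy_calloc(std::size_t len)
{
  // Header sized to max_align_t so the user pointer stays suitably aligned.
  const std::size_t header = alignof(std::max_align_t);
  unsigned char *raw = static_cast<unsigned char *>(std::calloc(1, header + len));
  if (raw == nullptr) {
    return nullptr;
  }
  // The length occupies the last size_t slot of the header.
  *reinterpret_cast<std::size_t *>(raw + header - sizeof(std::size_t)) = len;
  return raw + header;
}

// Same trick as `((size_t *)newid)[-1]` in the patch, applied to the toy header.
std::size_t toy_block_size(const void *ptr)
{
  return static_cast<const std::size_t *>(ptr)[-1];
}

void toy_free(void *ptr)
{
  if (ptr != nullptr) {
    std::free(static_cast<unsigned char *>(ptr) - alignof(std::max_align_t));
  }
}

// True when clearing `clear_size` bytes would write past the allocated block.
bool clear_would_overrun(const void *ptr, std::size_t clear_size)
{
  return clear_size > toy_block_size(ptr);
}
```

With a 1416-byte block and a clear size of 1792, the numbers from the output above, `clear_would_overrun` reports an overrun.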
Owner

Added subscriber: @ideasman42

Owner

I looked into this bug. The problem is caused by the depsgraph using a Map (DepsgraphNodeBuilder.id_info_hash_) that keeps a mapping from ID to IDInfo data between undo steps.

On redo, IDs are freed and created, but id_info_hash_ is never updated.

When creating many IDs, an Object ID (in my case) gets allocated at the address previously used for a mesh, causing IDNode.id_cow to copy mesh data into an object pointer (hence the buffer overrun in BKE_id_copy_ex).


We could try to fix this by updating the depsgraph's runtime data to keep it valid; however, I'm not sure this is worth doing. It's already being rebuilt when adding/deleting objects, for example.

A simpler solution could be to keep the optimization as-is, but do a full rebuild if undo adds/removes ID data-blocks, so stale ID data never gets used.

This quick-hack patch does this, for reference: P1872 (https://archive.blender.org/developer/P1872.txt)
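The address-reuse scenario described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: addresses are plain integers, the "heap" has a single slot, and `ToyHeap`, `AddressCache` and `cache_is_stale` are hypothetical names, not depsgraph code.

```cpp
#include <cassert>
#include <unordered_map>

// Simulated datablock kinds; real Blender IDs are not modeled here.
enum class IdType { Mesh, Object };

// A one-slot toy heap: freeing and allocating again returns the same
// address, which is the reuse pattern described in the comment above.
struct ToyHeap {
  int slot_address = 0x1000;
  IdType live_type = IdType::Mesh;

  int alloc(IdType type)
  {
    live_type = type;
    return slot_address;
  }
};

// Cache keyed by address, standing in for a map keyed by ID pointer.
using AddressCache = std::unordered_map<int, IdType>;

// True when the cached entry disagrees with what now lives at the address.
bool cache_is_stale(const AddressCache &cache, const ToyHeap &heap, int address)
{
  const auto it = cache.find(address);
  return it != cache.end() && it->second != heap.live_type;
}

// Walks through the scenario: cache a mesh, replace it by an object at
// the same address, then look the address up again.
bool demo_stale_lookup()
{
  ToyHeap heap;
  AddressCache cache;

  const int mesh_address = heap.alloc(IdType::Mesh);
  cache[mesh_address] = IdType::Mesh;

  // Undo/redo frees the mesh; the next allocation lands on the same address.
  const int object_address = heap.alloc(IdType::Object);

  return object_address == mesh_address && cache_is_stale(cache, heap, object_address);
}
```

In this model the lookup succeeds but returns mesh information for what is now an object, which mirrors the mismatch behind the oversized clear.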

Collaborator

Nice find! That fixes the issue for me as well.

Collaborator

Great work everybody!

Runtime ID pointers over undo/redo are a recurring issue... We have ID.session_uuid now, and it would be trivial to solve such issues if we somehow registered a session_uuid -> ID * map that gets updated on destructive main changes (undo, redo, deletion, file reading, ...). Instead of storing a pointer, you'd store the session_uuid and query the pointer when needed. There's BKE_main_idmap already, but these maps don't get updated on main changes.
Such a registry could be quite expensive in big files if it contained all IDs. We could lazily create the entries, so e.g. when the depsgraph adds a new ID to id_info_hash_ it could ensure the ID is registered in the global map (this would still create many entries though). Ideally you could do O(1) lookups by session_uuid right within Main, but that's not a simple change.

Just throwing this idea out there, it would make a number of things easier.
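A rough sketch of what such a registry could look like, assuming a 32-bit `session_uuid` and a hook that runs whenever an ID is freed. `ToyID`, `ToyIDRegistry` and its methods are hypothetical names for illustration; nothing like this is claimed to exist in Blender.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Stand-in for an ID; only the session-wide identifier is modeled.
struct ToyID {
  std::uint32_t session_uuid;
};

// Hypothetical registry mapping session_uuid -> ID pointer.
class ToyIDRegistry {
 public:
  // Lazily register an ID the first time a client (e.g. the depsgraph) needs it.
  void ensure_registered(ToyID *id) { map_[id->session_uuid] = id; }

  // Must be called by every destructive change (undo, redo, deletion, file read).
  void on_id_freed(std::uint32_t session_uuid) { map_.erase(session_uuid); }

  // Returns nullptr instead of a dangling pointer when the ID is gone.
  ToyID *lookup(std::uint32_t session_uuid) const
  {
    const auto it = map_.find(session_uuid);
    return it == map_.end() ? nullptr : it->second;
  }

 private:
  std::unordered_map<std::uint32_t, ToyID *> map_;
};

// Registers an ID, frees it, and checks that the lookup no longer resolves.
bool demo_registry_drops_freed_ids()
{
  ToyIDRegistry registry;
  ToyID *mesh = new ToyID{5};
  registry.ensure_registered(mesh);
  const bool found_while_alive = registry.lookup(5) == mesh;

  registry.on_id_freed(mesh->session_uuid);
  delete mesh;

  return found_while_alive && registry.lookup(5) == nullptr;
}
```

Clients would hold the `session_uuid` and call `lookup` when they need the pointer, so a freed ID resolves to `nullptr` rather than to whatever was allocated in its place.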

Collaborator

In #84397#1089432, @ideasman42 wrote:
When creating many ID's, an Object ID (in my case) is getting allocated at the address previously used for a mesh, causing the IDNode.id_cow to copy mesh data into an object pointer (hence the buffer overrun in BKE_id_copy_ex).

Neither the page heap nor ASan reuse memory (it would be impossible to detect use-after-free issues otherwise), so that would explain why the issue doesn't show with those tools. Thanks for the explanation! That part was bugging me much more than I'd like to admit :)
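The effect of memory reuse on this bug can be shown with a toy allocator. This is a simplified model, not how the page heap or ASan are implemented: in practice such tools typically delay reuse (for example with a bounded quarantine) rather than forbid it forever. `ToyAllocator` and `address_reused_after_free` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Toy allocator handing out integer "addresses".
class ToyAllocator {
 public:
  explicit ToyAllocator(bool quarantine) : quarantine_(quarantine) {}

  int alloc()
  {
    // Without quarantine, the most recently freed address is reused first.
    if (!quarantine_ && !free_list_.empty()) {
      const int address = free_list_.back();
      free_list_.pop_back();
      return address;
    }
    return next_fresh_++;
  }

  void release(int address)
  {
    // With quarantine, freed addresses are never handed out again
    // (a simplification of what debugging allocators do).
    if (!quarantine_) {
      free_list_.push_back(address);
    }
  }

 private:
  bool quarantine_;
  int next_fresh_ = 1;
  std::vector<int> free_list_;
};

// True when a free followed by an allocation returns the same address.
bool address_reused_after_free(bool quarantine)
{
  ToyAllocator allocator(quarantine);
  const int first = allocator.alloc();
  allocator.release(first);
  const int second = allocator.alloc();
  return first == second;
}
```

Only the non-quarantining allocator returns the old address, so only there can a map keyed by a stale pointer collide with a newly allocated ID.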

brecht commented 2 years ago
Owner

Added subscriber: @brecht

brecht commented 2 years ago
Owner

@JulianEisel, I expect the depsgraph can use session UUIDs as key values for id_info_hash_ and any similar maps directly, without the need to maintain an additional map.
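A minimal sketch of keying such a map by session UUID instead of by pointer, assuming the UUID is a 32-bit integer and that a newly created ID never shares the UUID of the ID it replaces in memory. `ToyID`, `ToyInfo`, `InfoByUuid` and `find_info` are hypothetical names, not the depsgraph's types.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Stand-in for an ID: a session-wide identifier plus a simulated address.
struct ToyID {
  std::uint32_t session_uuid;
  int address;
};

// Stand-in for the per-ID data the builder wants to carry over.
struct ToyInfo {
  int evaluated_state;
};

// Map keyed by session UUID rather than by address.
using InfoByUuid = std::unordered_map<std::uint32_t, ToyInfo>;

const ToyInfo *find_info(const InfoByUuid &infos, const ToyID &id)
{
  const auto it = infos.find(id.session_uuid);
  return it == infos.end() ? nullptr : &it->second;
}

// A new object reuses the old mesh's address but has its own UUID,
// so it does not pick up the mesh's entry.
bool demo_uuid_key_ignores_address_reuse()
{
  InfoByUuid infos;
  const ToyID old_mesh{1, 0x1000};
  infos[old_mesh.session_uuid] = ToyInfo{42};

  const ToyID new_object{2, 0x1000};

  return find_info(infos, new_object) == nullptr && find_info(infos, old_mesh) != nullptr;
}
```

The address plays no role in the lookup, so reuse by the allocator cannot produce a false match.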

mont29 commented 2 years ago
Owner

Added subscriber: @mont29

mont29 commented 2 years ago
Owner

I will first try to make the depsgraph use those session UUIDs; this does look like the most obvious solution.

Collaborator

In #84397#1090855, @brecht wrote:
@JulianEisel, I expect the depsgraph can use session UUIDs as key values for id_info_hash_ and any similar maps directly, without the need to maintain an additional map.

What happens on ID deletion, do we rebuild the depsgraph or remove the nodes to be deleted? If it's the latter I guess the ID should be removed from id_info_hash_?
I assumed the IDInfo.id_cow could be an issue over undos, but didn't look into it much and trust your judgement there.
Anyway, if using session_uuid solves the issue: yay!

brecht commented 2 years ago
Owner

The dependency graph is fully rebuilt on changes. Evaluated datablocks are either reused, or discarded if they end up unused after the rebuild.

mont29 commented 2 years ago
Owner

Can people who are able to reproduce the issue confirm whether D10077: Fix #84397, #80203: use session_uuid instead of ID pointers in depsgraph storage. (https://archive.blender.org/developer/D10077) fixes it for them? Thanks.

Collaborator

It does not fix it for me unfortunately. I still get the same error.

Collaborator

Also not fixed here; however, the crash moved to a different location. I attached a stack trace in D10077.

Sergey commented 2 years ago
Owner

Even if D10077 does not solve this crash, I think it's good to wrap it up a bit and commit it anyway. It is the proper thing to do. See my comment there.

To eliminate the possibility of "stale" pointers being used in the depsgraph, you can replace IDInfo *id_info = id_info_hash_.lookup_default(id, nullptr); with IDInfo *id_info = nullptr.

From Campbell's comment it sounds like there is some confusion about id_info_hash_. This hash is only used during the depsgraph relations update, to "transfer" the evaluated state of IDs from the old depsgraph to the new one. It is not possible to "update" id_info_hash_, as it is temporary storage used during the relations update.

I'm not sure this is the root of the problem though, because neither proper use of session_uuid for id_info_hash_ nor completely ignoring the evaluated state transfer fixes the crash for me. But the crash is different for me: BKE_scene_object_base_flag_sync_from_base has a base whose object is nullptr.
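The "transfer" role described above can be sketched as a rebuild step that saves the old evaluated state in a temporary map, builds the new node list, and lets unused entries drop. This is a simplified model under stated assumptions: nodes carry only a UUID and an integer state (0 meaning "not evaluated"), and `ToyNode` and `rebuild` are hypothetical names rather than depsgraph functions.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Stand-in for an ID node: identifier plus evaluated state (0 = none yet).
struct ToyNode {
  std::uint32_t uuid;
  int evaluated;
};

// Rebuilds the node list for `current_ids`, carrying over evaluated state
// from `old_nodes` where the same UUID is still present.
std::vector<ToyNode> rebuild(const std::vector<ToyNode> &old_nodes,
                             const std::vector<std::uint32_t> &current_ids)
{
  // Temporary storage: exists only for the duration of this rebuild.
  std::unordered_map<std::uint32_t, int> saved;
  for (const ToyNode &node : old_nodes) {
    saved[node.uuid] = node.evaluated;
  }

  std::vector<ToyNode> new_nodes;
  for (const std::uint32_t uuid : current_ids) {
    ToyNode node{uuid, 0};
    const auto it = saved.find(uuid);
    if (it != saved.end()) {
      node.evaluated = it->second; /* Reuse the old evaluated state. */
      saved.erase(it);
    }
    new_nodes.push_back(node);
  }

  // Entries still in `saved` belong to IDs that are no longer in the graph;
  // they are discarded when the map goes out of scope.
  return new_nodes;
}
```

Because the map is created and destroyed inside one rebuild, there is nothing to keep up to date between rebuilds; the only requirement is that its keys are still meaningful when it is filled.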

Collaborator

But the crash is different for me: BKE_scene_object_base_flag_sync_from_base has a base which object is nullptr.

That's the same crash I get from P1870, which appeared (to me) to be a different problem than the one in the opening post.

I can confirm IDInfo *id_info = nullptr fixes the repro in the opening post but not P1870. I wasn't convinced P1870 wasn't my own fault for writing bad Python, so I had not pushed the issue very hard.

Sergey commented 2 years ago
Owner

@LazyDodo, ah ok, good to know. Can you test whether P1884 fixes the original issue?

Collaborator

That hits the same crash in D10077, in void DepsgraphNodeBuilder::begin_build()

void DepsgraphNodeBuilder::begin_build()
{
    // ---8<--[cut unrelated code, it's there just not in this paste]--8<---
    id_info_hash_.add_new(id_node->id_orig->session_uuid, id_info); //<---`id_node->id_orig` has a bogus pointer at this point
    id_node->id_cow = nullptr;
  }
>	blender.exe!blender::deg::DepsgraphNodeBuilder::begin_build() Line 341	C++

 	[Inline Frame] blender.exe!blender::deg::AbstractBuilderPipeline::build_step_nodes() Line 75	C++
 	blender.exe!blender::deg::AbstractBuilderPipeline::build() Line 55	C++
 	[Inline Frame] blender.exe!DEG_graph_build_from_view_layer(Depsgraph *) Line 228	C++
 	blender.exe!DEG_graph_relations_update(Depsgraph * graph) Line 281	C++
 	blender.exe!scene_graph_update_tagged(Depsgraph * depsgraph, Main * bmain, bool only_if_tagged) Line 2622	C
 	[Inline Frame] blender.exe!wm_event_do_depsgraph(bContext *) Line 364	C
 	[Inline Frame] blender.exe!wm_event_do_refresh_wm_and_depsgraph(bContext *) Line 389	C
 	blender.exe!wm_event_do_notifiers(bContext * C) Line 571	C
 	blender.exe!WM_main(bContext * C) Line 641	C
 	blender.exe!main(int argc, const unsigned char * * UNUSED_argv_c) Line 527	C
 	[External Code]	

Given this even hits with the page heap, I'm pretty hopeful ASan will catch it on Linux and may shed some light on why the pointer is bogus.

Sergey commented 2 years ago
Owner

@LazyDodo, ok, managed to crash. It's trivial, actually: do not dereference id_orig, store the UUID in the IDNode instead. Mind checking P1886?
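A small sketch of that idea: copy the UUID into the node while the original ID is still valid, and use the copy later instead of reading through a pointer that may dangle. The field name `id_orig_session_uuid` and the helpers `make_node`, `key_for_rebuild` are assumptions made for illustration; the actual change is in P1886.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Stand-in for an original ID.
struct ToyID {
  std::uint32_t session_uuid;
};

// Stand-in for an ID node in the graph.
struct ToyIDNode {
  const ToyID *id_orig;               /* May dangle once the original is freed. */
  std::uint32_t id_orig_session_uuid; /* Copied while id_orig was still valid. */
};

ToyIDNode make_node(const ToyID &id)
{
  return ToyIDNode{&id, id.session_uuid};
}

// Reads the cached copy; never dereferences id_orig.
std::uint32_t key_for_rebuild(const ToyIDNode &node)
{
  return node.id_orig_session_uuid;
}

// The key stays usable even after the original ID has been freed.
bool demo_key_survives_free()
{
  auto id = std::make_unique<ToyID>(ToyID{7});
  const ToyIDNode node = make_node(*id);

  /* Original ID is freed (e.g. by undo); node.id_orig must not be read now. */
  id.reset();

  return key_for_rebuild(node) == 7;
}
```

The begin_build() snippet earlier in the thread reads `id_node->id_orig->session_uuid`, which is exactly the dereference this pattern avoids.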

Poster

@Sergey I tried it with P1886 on master and it works! My crash is gone. Thanks a lot!

Collaborator

Can confirm P1886 fixes the opening post, but not the crash inside BKE_scene_object_base_flag_sync_from_base (P1870). Given how "muddy" this ticket is already, that should perhaps move to its own ticket?

Collaborator

This issue was referenced by 96336007e9

This issue was referenced by 96336007e9bb55de4b065c89cec3e335b0d2b73a
Collaborator

This issue was referenced by f6c7da5759

This issue was referenced by f6c7da575987a85e25571163a25dde659e1d56e0
Collaborator

This issue was referenced by abbc43e4e4

This issue was referenced by abbc43e4e419c44e6d0134aec051d543a0944b3e
Sergey commented 2 years ago
Owner

Changed status from 'Confirmed' to: 'Resolved'

Sergey closed this issue 2 years ago
Sergey self-assigned this 2 years ago
mont29 commented 2 years ago
Owner

@LazyDodo yes please report that BKE_scene_object_base_flag_sync_from_base issue in a new task. :)


Added subscriber: @rayiik-1


Just want to pass this on, as it still appears to be an issue in 2.92, so I wanted to give you some crash/debug files in the hope that they help. In this case it's 100% reproducible every time, regardless of how long I've been in Blender, even on a fresh start. This time I create an object using bpy.ops.mesh.primitive_uv_sphere_add (4 or 5); however, it does not seem to happen with non-primitive object creation and deletion.

Hope this helps, and thanks for all the great hard work. If you need any more tests run, shoot me a message.

3blender.crash.txt

blender_debug_output1.txt

blender.crash.txt

blender_system_info1.txt

blender_debug_output1.txt

blender_system_info.txt

blender3.crash.txt

blender4.crash.txt

blender5.crash.txt

Just want to pass this on, as it still appears to be an issue in 2.92. I wanted to give you some crash/debug files in the hope that they help. In this case it is 100% reproducible every time, regardless of how long I've been in Blender, even on a fresh start. This time I create an object using `bpy.ops.mesh.primitive_uv_sphere_add` (4 or 5). However, it does not seem to happen with non-primitive object creation and deletion.

Hope this helps, and thanks for all the great hard work. If you need any more tests run, shoot me a message.

[3blender.crash.txt](https://archive.blender.org/developer/F9893199/3blender.crash.txt)
[blender_debug_output1.txt](https://archive.blender.org/developer/F9893198/blender_debug_output1.txt)
[blender.crash.txt](https://archive.blender.org/developer/F9893200/blender.crash.txt)
[blender_system_info1.txt](https://archive.blender.org/developer/F9893201/blender_system_info1.txt)
[blender_debug_output1.txt](https://archive.blender.org/developer/F9893202/blender_debug_output1.txt)
[blender_system_info.txt](https://archive.blender.org/developer/F9893203/blender_system_info.txt)
[blender3.crash.txt](https://archive.blender.org/developer/F9893195/blender3.crash.txt)
[blender4.crash.txt](https://archive.blender.org/developer/F9893196/blender4.crash.txt)
[blender5.crash.txt](https://archive.blender.org/developer/F9893197/blender5.crash.txt)
rjg commented 2 years ago
Collaborator

@rayiik-1 Please create a new bug report through *Help > Report a Bug* in Blender and add the precise steps that lead to the crash.
ThomasDinges added this to the 2.92 milestone 2 days ago
12 Participants
Reference: blender/blender#84397