Fix #104915: Race condition writing subsurf optimal display edges #105156

Merged
Member

Writing to a bitmap from multiple threads causes races when writing to
bits within the same integer. Instead, write to a separate boolean array
while subdividing, and move that to the final mesh bit vector after.

Notes:

  • The final copy to the bit vector could be replaced by a generic
    copy_from(Span<bool>) call in the future.
  • Theoretically we could entirely replace the BitVector with an
    Array<bool>, but 1/8 the memory use for edges is likely worth it.
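
For readers outside the codebase, here is a minimal standalone sketch of the approach described above, using only the standard library rather than Blender's `BitVector` and `Array` types. The names `edge_is_optimal`, `pack_bools`, and `optimal_edge_bits` are hypothetical and are not taken from the patch; the interleaved work split is only an assumption chosen to make neighboring bytes belong to different threads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <thread>
#include <vector>

/* Hypothetical stand-in for the per-edge result computed while subdividing. */
static bool edge_is_optimal(const std::size_t edge)
{
  return edge % 3 == 0;
}

/* Single-threaded pack of one bool per edge into 32-bit words. Only one
 * thread touches each word, so the read-modify-write cannot race. */
static std::vector<std::uint32_t> pack_bools(const bool *values, const std::size_t size)
{
  std::vector<std::uint32_t> words((size + 31) / 32, 0u);
  for (std::size_t i = 0; i < size; i++) {
    if (values[i]) {
      words[i / 32] |= std::uint32_t(1) << (i % 32);
    }
  }
  return words;
}

static std::vector<std::uint32_t> optimal_edge_bits(const std::size_t edges_num,
                                                    const unsigned threads_num)
{
  assert(threads_num > 0);
  /* One byte per edge, zero-initialized. Not std::vector<bool>, which packs bits. */
  std::unique_ptr<bool[]> flags(new bool[edges_num]());
  bool *data = flags.get();

  std::vector<std::thread> threads;
  for (unsigned t = 0; t < threads_num; t++) {
    /* Interleaved indices: neighboring bytes are written by different threads. */
    threads.emplace_back([data, edges_num, threads_num, t]() {
      for (std::size_t i = t; i < edges_num; i += threads_num) {
        data[i] = edge_is_optimal(i);
      }
    });
  }
  for (std::thread &thread : threads) {
    thread.join();
  }
  /* All writers have joined, so the final copy runs on this thread alone. */
  return pack_bools(data, edges_num);
}
```

Calling `optimal_edge_bits(100, 4)` should give four 32-bit words in which bit `i` is set exactly when `edge_is_optimal(i)` is true. The temporary array costs one byte per edge, but it only lives until the pack step.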
Hans Goudey added this to the 3.5 milestone 2023-02-23 22:51:48 +01:00
Hans Goudey added the Module Modeling label 2023-02-23 22:51:48 +01:00
Hans Goudey added 1 commit 2023-02-23 22:52:01 +01:00
de78a588af Fix #104915: Race condition writing subsurf optimal display edges
Hans Goudey added this to the Modeling project 2023-02-23 22:52:50 +01:00
Hans Goudey requested review from Jeroen Bakker 2023-02-23 22:53:08 +01:00
Hans Goudey requested review from Sergey Sharybin 2023-02-23 22:53:08 +01:00
Hans Goudey requested review from Jacques Lucke 2023-02-23 22:53:21 +01:00
Author
Member

A few reviewers here: Jacques since I'm using the new bitmap classes, Sergey because subdiv is his area, and Jeroen because he already investigated the bug. I don't think you all need to look into this, but I want to make it an option at least. Thanks.
Member

Just quickly looked over it. Some thoughts:
Normally a blender::Array<bool> stores a bool in a uint32_t. The standard library does some trickery here, but I'm not sure that we added that as well. If so, this solution just over-allocates memory. It feels like it only makes the race condition less likely to happen.

Considerations:

  • L2 cache isn't shared between the CPU cores (e.g. NUMA). Write-back to L3/RAM happens in whole cache lines.
  • Group the data so that all bits of a cache line are handled by the same core.
  • Use work stealing to ensure fewer faults (not sure how, though).
  • With subdiv, only a known fraction of the edges will have this state set. Store the indices in a per-thread list and combine them later.
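
The last consideration (per-thread index lists, combined afterwards) could look roughly like the sketch below. This is an illustration under assumptions, not what the patch does: `edge_has_state` and `bits_from_thread_lists` are hypothetical names, and plain `std::thread` stands in for Blender's task system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

/* Hypothetical stand-in for "this edge has the state set". */
static bool edge_has_state(const std::size_t edge)
{
  return edge % 4 == 0;
}

static std::vector<std::uint32_t> bits_from_thread_lists(const std::size_t edges_num,
                                                         const unsigned threads_num)
{
  assert(threads_num > 0);
  /* One index list per thread. Each thread appends only to its own list. */
  std::vector<std::vector<std::size_t>> lists(threads_num);

  std::vector<std::thread> threads;
  for (unsigned t = 0; t < threads_num; t++) {
    threads.emplace_back([&lists, edges_num, threads_num, t]() {
      std::vector<std::size_t> &own = lists[t];
      for (std::size_t i = t; i < edges_num; i += threads_num) {
        if (edge_has_state(i)) {
          own.push_back(i);
        }
      }
    });
  }
  for (std::thread &thread : threads) {
    thread.join();
  }

  /* Combine on a single thread: only now is the shared bitmap written. */
  std::vector<std::uint32_t> words((edges_num + 31) / 32, 0u);
  for (const std::vector<std::size_t> &list : lists) {
    for (const std::size_t i : list) {
      words[i / 32] |= std::uint32_t(1) << (i % 32);
    }
  }
  return words;
}
```

Compared with a full boolean array, the temporary memory here scales with the number of set edges rather than with the total edge count, at the cost of a serial combine step.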

@Jeroen-Bakker Isn't it important for the race condition what type of data is read/written, and not how the size is rounded during allocation (uint8_t or uint32_t)?

sizeof(bool) is 1 on any platform/compiler we support, and blender::Array stores it as such. Multiple threads writing to different individual bytes in an array is safe, as they are at separate memory locations and the C++ standard guarantees that works.

Performance may not be great, but there should be no race condition.
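
To make the distinction concrete, the following sketch replays one unlucky interleaving on a single thread, so the outcome is deterministic. It is only a simulation written for illustration (the function names are hypothetical): setting a bit is a load, an OR, and a store on the whole word, while setting a bool is a store to its own byte.

```cpp
#include <cassert>
#include <cstdint>

/* Simulates threads A and B setting different bits of one shared word,
 * with both loads happening before either store. */
static std::uint32_t set_two_bits_interleaved()
{
  std::uint32_t word = 0;
  const std::uint32_t a = word; /* Thread A loads the word. */
  const std::uint32_t b = word; /* Thread B loads the same word. */
  assert(a == 0 && b == 0);     /* Neither has seen the other's bit. */
  word = a | (1u << 0);         /* Thread A stores bit 0. */
  word = b | (1u << 1);         /* Thread B stores bit 1 and drops bit 0. */
  return word;
}

/* The same interleaving with one bool per flag: each store touches only
 * its own byte, so nothing is lost. */
static bool set_two_bytes_interleaved()
{
  bool flags[2] = {false, false};
  flags[0] = true; /* Thread A. */
  flags[1] = true; /* Thread B. */
  return flags[0] && flags[1];
}
```

Here `set_two_bits_interleaved()` returns `2`, meaning bit 0 was lost, while `set_two_bytes_interleaved()` returns `true`.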
Sergey Sharybin approved these changes 2023-02-24 09:25:41 +01:00
Jeroen Bakker approved these changes 2023-02-24 10:06:53 +01:00
Jacques Lucke approved these changes 2023-02-24 14:31:51 +01:00
Hans Goudey merged commit 3db246a3ce into blender-v3.5-release 2023-02-26 23:59:13 +01:00
Hans Goudey deleted branch fix-subdiv-optimal-edges-threadsafety 2023-02-26 23:59:16 +01:00
Reference: blender/blender#105156