Segfault on Linux when running third party Python library function with multithreading enabled #99900
Reference: blender/blender#99900
System Information
Operating system: Gentoo Linux
Graphics card: 25:00.0 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] (rev a1)
Blender Version
Broken: a02992f131
https://builder.blender.org/download/daily/archive/blender-3.3.0-alpha+master.a02992f13138-linux.x86_64-release.tar.xz
Worked: c0e4532331
https://builder.blender.org/download/daily/archive/blender-3.3.0-alpha+master.c0e453233132-linux.x86_64-release.tar.xz
Short description of error
I'm the primary developer of the BlenderBIM Add-on (https://blenderbim.org/), which provides import / export support for the .ifc format for architects and engineers in Blender. The core functionality of the add-on revolves around the "IfcOpenShell" C++ library with Python bindings. One of the functions of IfcOpenShell is to process geometry with OpenCascade using multithreading, and return verts / edges / faces so users can load geometry into Blender.
This functionality has worked for a very long time, since Blender 2.5, but with recent builds it has started segfaulting. The IfcOpenShell library I am loading is unchanged.
I understand this is not core Blender functionality, so I apologise if this is the wrong channel, but maybe it is a symptom of other issues in Blender. There are also lots of users of the add-on who depend on this (there is no other way to load building data into Blender) so I hope I have helped unearth an issue.
This is the segfault I get, I'm not sure how useful this backtrace is though.
It works on c0e4532331 and segfaults in a02992f131, so I assume a change made in the 12 hours between those builds is the cause of the issue. Hope that narrows it down.
It does not segfault on Windows. It seems Linux specific. I have not tested on Mac.
Exact steps for others to reproduce the error
Based on the default startup or an attached .blend file (as simple as possible).
Here is the test.ifc file you can download; adjust the path in the script below: test.ifc
For convenience, here is the GitHub link to the time range between the two commits: https://github.com/blender/blender/commits/master?after=087f27a52f7857887e90754d87a7a73715ebc3fb+489&branch=master&qualified_name=refs%2Fheads%2Fmaster. To my untrained eye, I don't see anything that jumps out as being significant.
Also posted here to the primary developer of IfcOpenShell: https://github.com/IfcOpenShell/IfcOpenShell/issues/2309
Added subscriber: @Moult
Added subscriber: @erik85
Added subscriber: @OmarEmaraDev
I can reproduce the issue with the stack. Not sure what is going on here though, investigating.
I don't know if this is relevant at all, but I noticed a02992f131 added these lines to blender.map
and the stack trace above mentions call_once as well.
Added subscriber: @Carlos-Villagrasa
Hi, I've tried in Ubuntu 22.04 and somehow it doesn't segfault:
Very sorry!!! I called the wrong Blender executable. I can indeed reproduce the crash with Ubuntu 22.04:
Changed status from 'Needs Triage' to: 'Needs Developer To Reproduce'
Added subscribers: @xavierh, @Sergey, @brecht
@erik85 Good catch. Any idea if this is an issue that might be caused by the change in the linker version script, as mentioned by Erik above? @xavierh @brecht @Sergey
The change in blender.map was made to solve this crash issue from Intel drivers when loaded from Blender:
Maybe more of std::call_once should be hidden to solve the issue here, either by listing which symbols are loaded from Blender (using LD_DEBUG=all to list them precisely) or with D14971. Can someone trigger a build with D14971 included for testing?
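For anyone who wants to try the LD_DEBUG route, here is a minimal sketch of how the loader output could be filtered for call_once-related bindings. It uses LD_DEBUG=bindings, a narrower category than the LD_DEBUG=all mentioned above, and assumes the line format that glibc's dynamic loader prints for bindings, which may differ between glibc versions. The function names and the "once" substring filter are illustrative choices, not anything taken from Blender or IfcOpenShell.

```python
import os
import re
import subprocess

# Assumed format of one LD_DEBUG=bindings line printed by glibc's loader:
#   12345: binding file LIB [0] to TARGET [0]: normal symbol `NAME' [VERSION]
BINDING_RE = re.compile(
    r"binding file (?P<src>.+?) \[\d+\] to (?P<dst>.+?) \[\d+\]: "
    r"\w+ symbol `(?P<symbol>[^']+)'"
)


def parse_bindings(lines, needle="once"):
    """Return (src, dst, symbol) for each binding whose symbol contains needle."""
    found = []
    for line in lines:
        match = BINDING_RE.search(line)
        if match and needle in match.group("symbol"):
            found.append(
                (match.group("src"), match.group("dst"), match.group("symbol"))
            )
    return found


def trace_bindings(command, needle="once"):
    """Run command with LD_DEBUG=bindings and filter the loader output.

    The loader writes its debug output to stderr unless LD_DEBUG_OUTPUT is set.
    """
    env = dict(os.environ, LD_DEBUG="bindings")
    result = subprocess.run(
        command, env=env, stderr=subprocess.PIPE, text=True, errors="replace"
    )
    return parse_bindings(result.stderr.splitlines(), needle)
```

Something like `trace_bindings(["./blender", "--background", "--python", "repro.py"])` (with `repro.py` standing in for the reproduction script) would then show which library each matching symbol was resolved from. A binding from the third-party library to the blender executable itself would be the interesting case.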
@xavierh Triggered the build: https://builder.blender.org/admin/#/builders/18/builds/546
Also, if it helps, another developer (https://github.com/IfcOpenShell/IfcOpenShell/issues/2309#issuecomment-1192418069) has been able to compile IfcOpenShell locally on Fedora and does not replicate the segfault, whereas the IfcOpenShell built by the IfcOpenShell build system does have the segfault. This also suggests a difference in build environment. He's posted some details about the build environment he is using in that link. Did the Blender build environment change between those two builds?
Thanks Sergey. @Moult can you try https://builder.blender.org/download/patch/blender-3.3.0-alpha+master-D14971.7dd88c80ab18-linux.x86_64-release.tar.xz ?
Since changing the IfcOpenShell build environment seems to fix the issue introduced here, that fits well with the theory of symbol incompatibilities.
For Blender it's CentOS 7 / GCC 9.3.1 / GLIBC 2.17 and it hasn't changed between builds lately. Do you know what the IfcOpenShell build system environment is?
I tried with D14971 in Ubuntu 22.04 with the IfcOpenShell build linked in Dion's first comment. No segfault this time.
@xavierh That's great! I can confirm that the D14971 build fixes the segfault. It is incredible how fast this could be resolved (though I assume D14971 is not yet merged, so there are a few steps to go). Thank you so much everyone for the amazing response!
I'm not sure what the IfcOpenShell build system environment is, I've asked here.
I've rechecked D14971 and it currently breaks in Intel drivers when using oneAPI :
so it can't be applied just yet. In the meantime
may be the quickest way to go.
Never mind my previous message: it's fixed with newer drivers, so maybe D14971 should be the way to go.
@xavierh cheers, anything I can do to help?
This is the build environment for IfcOpenShell cross posted from the reply from Thomas Krijnen, the primary developer of IfcOpenShell:
"We compile on Ubuntu focal (20.04) with GCC 9.3.0.3 and GLIBC 2.31 (I think, I just searched the ubuntu package website)"
Changed status from 'Needs Developer To Reproduce' to: 'Confirmed'
I'll try to get D14971: Build: hide all symbols except a few required ones on Linux finished as a solution, it needs additional testing to ensure it doesn't break other things.
Will mark as high priority so we don't forget it for the 3.3 release.
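To check which symbols a given build actually exports, a rough sketch like the following could be used before and after applying a symbol-hiding change. It shells out to GNU `nm -D --defined-only` and assumes the usual three-column `ADDRESS TYPE NAME` output; the function names and the "once" substring filter are illustrative assumptions, not part of the Blender build system.

```python
import subprocess


def parse_nm_dynamic(lines, needle):
    """Return defined dynamic symbol names containing needle.

    Expects lines in the `ADDRESS TYPE NAME` form printed by
    `nm -D --defined-only`; other lines are skipped.
    """
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        # Drop a trailing @VERSION or @@VERSION suffix if nm printed one.
        name = parts[2].split("@")[0]
        if needle in name:
            symbols.append(name)
    return symbols


def exported_symbols(binary, needle="once"):
    """List dynamic symbols exported by binary whose name contains needle."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", binary],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_nm_dynamic(result.stdout.splitlines(), needle)
```

If `exported_symbols("./blender")` returns an empty list for a patched build but not for an unpatched one, that would be consistent with the call_once-related symbols no longer being visible to third-party libraries.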
This issue was referenced by cfd16c04f8
Changed status from 'Confirmed' to: 'Resolved'