Blender hangs forever when starting with particular userpref.blend config file (SDL and Pulse audio issue?) #126661
Reference: blender/blender#126661
Operating system: Linux-6.8.0-41-generic-x86_64-with-glibc2.39 64 Bits, X11 UI
Graphics card: NVIDIA GeForce GTX 1060 6GB/PCIe/SSE2 NVIDIA Corporation 4.6.0 NVIDIA 535.183.01
Broken: version: 4.2.1 LTS, branch: blender-v4.2-release, commit date: 2024-08-19 11:21, hash: 396f546c9d82
(as well as earlier versions)
Attached userpref.blend file causes blender to hang indefinitely on startup. This happens for 3.x versions of blender as well.
I don't know how/if my userpref.blend got "corrupted" (if that is accurate) but I can't start blender with it in place. (Regardless of the state/quality of the userpref.blend file, it seems like a bug that blender would hang on startup with a bad userpref.blend, so I file this bug.)
To reproduce:
...blender then hangs indefinitely*. No CPU usage, no windows open, etc.
*I have waited several minutes to see if it starts, but not longer than that.
Hi, thanks for the report. Attached userpref.blend works fine with 4.2 configs.
I think the freeze on your end is likely due to the add-ons (blenderkit, MACHIN3tools, etc.). Do you have them in the /scripts folder?
Thanks -- the hang happens as described in the step-by-step above -- if I have a fresh config directory (e.g. no 4.2 directory exists, I start 4.2, I do not import anything, then exit 4.2, there is now a 4.2 config directory) and then only copy in the userpref.blend file (no addons, etc) and try to start, the hang happens.
If instead I start 4.2 without a config directory, and choose to "import Blender 4.0 Preferences", it hangs immediately (presumably because it is restarting in some sense). I kill it, and in that case all the add-ons are present in the scripts directory, and if I try to run it again, the hang is the same.
Maybe due to the asset library? (path is /home/casey/Documents/Blender/Assets)
Hmmm... no ~/Documents/Blender directory is present.
@lichtwerk hi, can you replicate on linux?
Yes can confirm (hangs here as well).
Also in 4.1.1
4.3 seems fine though
If I use the file in 4.3, I am seeing the following
But I have disabled them all, resaved the prefs and it is still an issue.
Might have a look when that got "fixed"
@chconnor : can you confirm that using those prefs for 4.3 from https://builder.blender.org/download/daily/ is not an issue anymore?
Hold on, it seems the buildbot build is also not working... but my local build does...
Since I can only repro with buildbot builds, I am not even sure how to debug this further, maybe @mont29 or @ideasman42 have an idea?
Would this be Core module responsibility @mont29 ? (will set this as a placeholder module for now...)
I... have absolutely no idea what to do with this one... On linux, the blender process seems to enter some sort of deadlock, or maybe infinite wait trying to read/access some remote non-existent data? It does not react to signals at least, and needs to be forcefully killed.
Here is a backtrace of all threads from gdb, in the locked situation (using official 4.2.1 build), using gdb --args ./blender -t 1 to limit the amount of active threads:
Actually it seems to be the SDL audio backend. Once I switch to e.g. None, it seems that there are no more issues.
From the backtrace above it looks like both SDL and Pulse are trying to run?
Title changed from "Blender hangs forever when starting with particular userpref.blend config file" to "Blender hangs forever when starting with particular userpref.blend config file (SDL and Pulse audio issue?)"
@lichtwerk --
Hmm, no: I downloaded from here: https://builder.blender.org/download/daily/ (reference 6bd515e0d2), started it, did not import configs, then copied the userpref.blend into the ~/.config/blender/4.3/config/. directory, and it hangs on startup.
Let me know if there is anything else I can do to help!
This issue happens on my system. I tried to get to the bottom of it but only managed to narrow it down to a conflict between SDL & USD.
Here are some findings:
... (WITH_SDL_DYNLOAD=OFF).
... (WITH_USD=OFF).
... WITH_PULSEAUDIO=OFF (where pulse-audio is only activated via SDL).
Details:
Checking the library symbols, it doesn't seem as if there are conflicts between libusd_ms.so & libSDL.so.
Hanging occurs when linking with libusd_ms.so (bundled libraries and arch-linux's usd package).
Even with WITH_USD=OFF the hang occurs when manually linking libusd_ms.so:
The issue occurs with the libSDL.so from Arch Linux as well as a build (SDL's SDL2 branch 4eac44bed446f1ce9083b765dd9744b8ca81497e).
From adding break-points to pulse-audio's initialization functions, it doesn't seem that linking libusd_ms.so causes additional function calls; linking it for some reason causes the hang.
When linking libusd, initializing SDL & pulseaudio hangs. The second thread may be related.
Update
Some additional tests (without success).
... libusd_ms.so still has the problem:
... libusd_ms.so defines doesn't resolve the issue:
Generate a list of non C++ names:
Run with attached patchelf.map.
.This also didn't solve the problem, so it seems likely the problem is caused by logic that executes when the library is loaded instead of being a symbol conflict (SDL and Pulse are C only so C++ symbols shouldn't conflict).
@deadpin @makowalski would you know if libusd_ms.so interacts with the sound systems (SDL and/or pulse) in any way?
Not that I'm aware, but I don't know for sure. For what it's worth, there is no obvious dependency that I can see in the USD library code base, but I haven't done an exhaustive search. I'll report back if anything occurs to me.
I also don't believe USD will use or touch (directly) SDL or Pulse. But USD does have quite a bit of code they execute on library load.
Everything that is defined as an ARCH_CONSTRUCTOR in their source is potentially executed. On inspection I don't see anything that would obviously impact external entities though.
If I had to guess at where to start, maybe try commenting out the code inside the following 2 locations to see if it can at least get past the deadlock (USD would not function correctly but hopefully it gets past the deadlock to know for sure):
<usd path>/pxr/base/arch/initConfig.cpp -- the code inside ARCH_CONSTRUCTOR(Arch_InitConfig, 2, void) (need to keep at least the call to Arch_InitTmpDir though)
<usd path>/pxr/base/plug/initConfig.cpp -- the code inside ARCH_CONSTRUCTOR(Plug_InitConfig, 2, void)
@deadpin
Tried early returning from every ARCH_CONSTRUCTOR, but the issue remains.
Tried building non-monolithic libraries, the issue remains with the following libraries.
libtbb_debug.so
libtbbmalloc_debug.so
libtbbmalloc_proxy_debug.so
libtbbmalloc_proxy.so
libtbbmalloc.so
libtbb.so
libusd_arch.so
libusd_ar.so
libusd_gf.so
libusd_js.so
libusd_kind.so
libusd_ndr.so
libusd_pcp.so
libusd_pegtl.so
libusd_plug.so
libusd_sdf.so
libusd_sdr.so
libusd_tf.so
libusd_trace.so
libusd_ts.so
libusd_usdGeom.so
libusd_usdHydra.so
libusd_usdLux.so
libusd_usdMedia.so
libusd_usdPhysics.so
libusd_usdProc.so
libusd_usdRender.so
libusd_usdRi.so
libusd_usdShade.so
libusd_usdSkel.so
libusd_usd.so
libusd_usdUI.so
libusd_usdUtils.so
libusd_usdVol.so
libusd_vt.so
libusd_work.so
The issue seems to be that SDL tries and fails to create a thread with a stack size of 256 * 1024 bytes in PULSEAUDIO_DetectDevices; that thread is then supposed to signal the semaphore on which SDL hangs. The thread creation fails because the requested stack is too small to fit the thread descriptor. It doesn't fit because USD has a large amount of thread local storage, which is stored on-stack in the thread descriptor for non-main threads.
The total required TLS space is 290,560 bytes, of which 278,968 is from libusd_ms.so, per readelf -Wl libusd_ms.so:
Increasing the stack size in PULSEAUDIO_DetectDevices from 256 * 1024 to 512 * 1024 allows Blender to launch, although maybe USD shouldn't use so much TLS space.
Excellent find there Jorn. Thank you for investigating the root cause.
I suppose there's a few follow ups then:
... Prim path cache which is like 256kb big. They are really adamant about using such things but I can ask if they're willing to heap alloc instead of using the stack [2]. I haven't filed the issue for them yet though.
[1] https://github.com/libsdl-org/SDL/issues/10806
[2] https://github.com/PixarAnimationStudios/OpenUSD/blob/release/pxr/usd/sdf/path.cpp#L819
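The figures reported above can be checked with some quick arithmetic. The following is a minimal sketch in Python, not glibc's actual check: it only compares a requested stack size against the static TLS total quoted in the thread, and the helper name stack_fits_tls is made up for illustration. The real test in glibc also has to leave room for more than the TLS block alone, so the true threshold is higher than what this model shows.

```python
# Figures quoted in the thread, in bytes.
TLS_TOTAL = 290_560      # total static TLS required by the process
TLS_FROM_USD = 278_968   # portion contributed by libusd_ms.so

OLD_STACK = 256 * 1024   # stack size requested in PULSEAUDIO_DetectDevices
NEW_STACK = 512 * 1024   # stack size that allowed Blender to launch


def stack_fits_tls(stack_size: int, tls_size: int) -> bool:
    """Hypothetical simplification: the stack must at least hold the static TLS block."""
    return stack_size > tls_size


if __name__ == "__main__":
    for label, size in (("256 KiB", OLD_STACK), ("512 KiB", NEW_STACK)):
        fits = stack_fits_tls(size, TLS_TOTAL)
        print(f"{label}: {size} bytes, fits static TLS: {fits}")
    print(f"USD share of static TLS: {TLS_FROM_USD / TLS_TOTAL:.1%}")
    print(f"Static TLS without USD: {TLS_TOTAL - TLS_FROM_USD} bytes")
```

Under this simplified model the 256 * 1024 request (262,144 bytes) is smaller than the 290,560 bytes of TLS, while the 512 * 1024 request (524,288 bytes) is not, which is consistent with the observation that doubling the stack size lets Blender start.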
@jorn interesting, out of curiosity - how did you manage to find this was caused by an issue with the stack size?
@deadpin nice to see the issue has been resolved in SDL2 already.
Suggest to file an issue with USD to heap allocate Prim path cache since this seems like an issue that could bite us again in some other unrelated situations.
@ideasman42 The backtraces showed that the main thread was stuck on a semaphore in SDL. Putting a breakpoint at the start of the function that's supposed to signal the semaphore, HotplugThread, showed that it was not getting called. I then stepped into SDL_CreateThreadInternal in PULSEAUDIO_DetectDevices to find out why, where in the pthread implementation of SDL_SYS_CreateThread it returned an error after calling pthread_create, which returned EINVAL:
After a few tries, stepping into pthread_create with some instruction stepping showed it returning EINVAL in allocate_stack here:
This check failed because tls_static_size_for_stack was very large, with the requested stack size, size, being too small to fit everything. I then found A Deep dive into (implicit) Thread Local Storage which explained that 'static' TLS was allocated on the stack for non-main threads. Using readelf confirmed it was primarily USD that made tls_static_size_for_stack so big.
The reason that the hang doesn't happen with WITH_SDL_DYNLOAD=OFF is that the version of SDL2 in lib (2.28.2) is from before the semaphore in PULSEAUDIO_DetectDevices was added (2.30.0, commit), but HotplugThread still fails to run for the same reason as above. See repology for which distros have this version or newer.
@jorn thanks for the detailed explanation, it may help when investigating similar issues in the future.
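The failure mode described above (a caller blocked on a semaphore that only a never-started thread would signal) can be modelled in a few lines. This is a simplified Python sketch, not SDL's code: detect_devices, start_ok and start_fails are hypothetical names, the thread-creation failure is simulated by raising an exception, and a timeout stands in for the indefinite hang. It assumes the waiter does not act on the creation failure, which matches the observed behaviour.

```python
import threading


def start_ok(target):
    """Start the worker thread normally."""
    threading.Thread(target=target, daemon=True).start()


def start_fails(target):
    """Simulate thread creation failing (pthread_create returning EINVAL in the report)."""
    raise RuntimeError("thread creation failed")


def detect_devices(start_thread, check_result, timeout=0.2):
    """Return True if the worker signalled readiness, False otherwise.

    With check_result=False the creation error is ignored and the caller
    waits anyway; the timeout stands in for a wait that would never end.
    """
    ready = threading.Semaphore(0)
    try:
        start_thread(ready.release)  # the worker's only job is to signal
    except RuntimeError:
        if check_result:
            return False             # bail out instead of waiting
    return ready.acquire(timeout=timeout)


if __name__ == "__main__":
    print("thread starts:", detect_devices(start_ok, check_result=False))
    print("creation fails, unchecked:", detect_devices(start_fails, check_result=False))
    print("creation fails, checked:", detect_devices(start_fails, check_result=True))
```

In the unchecked case the wait only ends because of the timeout; without it the caller would block forever, as the main thread did in the backtraces. The checked variant returns as soon as creation fails.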
Dynamic SDL loading has been removed & SDL disabled for official releases, making this particular bug no longer applicable: c6afb0e270.
See https://devtalk.blender.org/t/sdl-support-to-be-disabled-in-release-builds/36564
Closing.
Blender is the best. :-) I take it the next official release should be free of this issue, then?
Thanks everyone!
@clepsydrae yes, however this was done by removing SDL support as we didn't have a compelling reason to keep it - given the alternatives we support.