TBBMalloc_proxy is failing on windows 10. #88813

Closed
opened 2021-06-03 21:27:03 +02:00 by Ray molenkamp · 13 comments
Member

tbb's allocator hooks are failing to attach to the CRT on Windows 10; Windows 8.1 (and, for 2.83, Win 7) work as expected. This problem exists in 2.83 (tbb2019), 2.93 (tbb2020) and 3.0 (tbb2020), leading to a performance regression. I don't have any numbers on hand, but @erik85 may be able to provide those.

The root cause seems to be some kind of special treatment of ucrtbase.dll by Windows 10.

Here's the setup on Win 7/8.1

  • We ship the CRT in a blender.crt folder and use a manifest to redirect any searches for the DLLs in question to this folder
  • Blender loads and tries loading the CRT
  • The manifest gets read
  • The redirects are respected and the DLLs are found in the blender.crt folder
  • TBBMalloc_proxy asks for the module handle to ucrtbase.dll (note: filename only, no path); GetModuleHandle queries the manifest, sees the redirect and returns the handle to blender.crt/ucrtbase.dll
  • TBBMalloc_proxy hooks the DLL
  • Life is good

Here's the setup on Win 10

  • We ship the CRT in a blender.crt folder and use a manifest to redirect any searches for the DLLs in question to this folder
  • Blender loads and tries loading the CRT
  • The manifest gets read
  • The OS goes "redirect for ucrtbase.dll? LOL! NO!" and loads ucrtbase.dll from the system32 folder (UCRT ships with the OS on Win10, no redist is needed anymore)
  • TBBMalloc_proxy asks for the module handle to ucrtbase.dll (note: filename only, no path); GetModuleHandle queries the manifest, sees the redirect, checks whether blender.crt/ucrtbase.dll is loaded, which it is not, and returns NULL
  • TBBMalloc_proxy can't find any CRT to hook into and just silently does nothing
  • Life looks good, but performance is not as good as it could have been
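
The difference between the two walkthroughs above could be checked from inside the process. What follows is only a minimal diagnostic sketch, assuming a Windows build: `loaded_module_path` and `describe_lookup` are hypothetical helper names (not part of Blender or TBB), and the non-Windows branch is a stub that exists only so the snippet compiles elsewhere.

```cpp
#include <cassert>
#include <string>

#ifdef _WIN32
#  include <windows.h>
#endif

// Hypothetical helper: look a module up by bare file name (no path), and
// return the full path of the copy that was found.
// Returns an empty string when the lookup fails.
std::string loaded_module_path(const char *dll_name)
{
    assert(dll_name != nullptr);
#ifdef _WIN32
    HMODULE module = GetModuleHandleA(dll_name);
    if (!module) {
        return std::string();
    }
    char path[MAX_PATH];
    DWORD len = GetModuleFileNameA(module, path, MAX_PATH);
    if (len == 0 || len >= MAX_PATH) {
        return std::string();  // query failed or the path was truncated
    }
    return std::string(path, len);
#else
    return std::string();  // stub: there is no Windows loader to query
#endif
}

// Hypothetical helper: turn the lookup result into a one-line report.
std::string describe_lookup(const std::string &dll_name, const std::string &path)
{
    if (path.empty()) {
        return dll_name + ": not found by name, the proxy would have nothing to hook";
    }
    return dll_name + ": found at " + path;
}
```

If the behavior is as described, printing `describe_lookup("ucrtbase.dll", loaded_module_path("ucrtbase.dll"))` early in startup should report a path inside blender.crt on Win 7/8.1 and a failed lookup on Win 10.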

Who's at fault?

tbb is in the clear; they did nothing wrong. Beyond that, it's a toss-up between us packaging the CRT in a non-standard way and Win10 having some odd behavior here.

Solutions

1- Revert the DLL redirection

We remove the manifest altogether and move the 50 DLL files back into the main blender folder. Users have complained about the "mess" in the blender folder, though.

2- We take ucrtbase.dll out of blender.crt and the manifest

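This leads to one extra DLL in the blender folder. Windows 10 will still ignore that DLL and load the copy from System32, so no harm there. With no redirect for ucrtbase.dll left in the manifest, GetModuleHandle should resolve the bare filename to the loaded System32 copy, so the hooks can attach.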

3- We patch TBB

We can apply a small patch to tbb so that, if GetModuleHandle fails, it looks in the system32 folder with a full path specified, by sticking these 10 lines over here (https://github.com/oneapi-src/oneTBB/blob/master/src/tbbmalloc_proxy/proxy.cpp#L656):

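    if (!module)
    {
        /* Not found by name: retry with the full path to the system32 copy. */
        char sys_path[MAX_PATH];
        char dll_path[MAX_PATH];
        GetSystemDirectoryA(sys_path, MAX_PATH);
        /* snprintf bounds the write; plain sprintf could overflow dll_path. */
        snprintf(dll_path, MAX_PATH, "%s\\%s", sys_path, dllName);
        module = GetModuleHandleA(dll_path);
        if (!module)
        {
            return false;
        }
    }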

4- We stop shipping the CRT

This would cause issues for Win 8.1 users, who would then have to install the ucrt-redist. We'll have to support 8.1 till early 2023, but beyond that "shipping the CRT" with blender should no longer be needed.

5- Do nothing

Doing nothing is always an option. The standard allocator isn't super great in multithreaded environments; we added tbbmalloc to help with the Cycles BVH build, which was heavy on many small allocations, but that has since moved to Embree, so it no longer seems to have that issue. The new GMP-based boolean code and Mantaflow will suffer performance issues until we stop shipping the CRT in 2023.

Option 2 seems to be the least intrusive one that we can easily roll out in all 3 branches, followed by 1. Patching tbb (3) will be more work since it involves rebuilding tbb twice (tbb2019/2020). 4 and 5 are off the table for me personally, but if we decide that's what we'll do I won't throw a fit about it :)

On top of solving the issue, adding a test that tbbmalloc is functioning properly may not be the worst idea.
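
One possible shape for such a test, as a rough sketch only: the TBB proxy header declares `TBB_malloc_replacement_log`, which on Windows returns 0 when all routines were replaced and fills in a NULL-terminated log of per-function results. The helper names below are hypothetical, the `WITH_TBB_MALLOC` guard is an assumed build define, and the assumption that failed entries start with "Fail" comes from the TBB documentation's example output rather than from this issue.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Assumption: WITH_TBB_MALLOC is defined when the proxy is linked in.
#if defined(_WIN32) && defined(WITH_TBB_MALLOC)
#  include <tbb/tbbmalloc_proxy.h>
#endif

// Hypothetical helper: count log entries that report a failed replacement.
// `log` is a NULL-terminated array of strings; entries are assumed to start
// with "Success" or "Fail".
int count_failed_replacements(const char *const *log)
{
    int failed = 0;
    for (; log && *log; ++log) {
        if (std::strncmp(*log, "Fail", 4) == 0) {
            ++failed;
        }
    }
    return failed;
}

// Hypothetical helper: true only when TBB reports that every routine was
// replaced. Without the proxy (or off Windows) this reports false.
bool tbbmalloc_proxy_is_active()
{
#if defined(_WIN32) && defined(WITH_TBB_MALLOC)
    char **log = nullptr;
    int status = TBB_malloc_replacement_log(&log);
    // Print each entry so a failing run shows which function was not hooked.
    for (char **entry = log; entry && *entry; ++entry) {
        std::printf("%s\n", *entry);
    }
    assert(status == 0 || log == nullptr || count_failed_replacements(log) >= 0);
    return status == 0;
#else
    return false;
#endif
}
```

A test could then simply fail when `tbbmalloc_proxy_is_active()` returns false on a Windows build that is supposed to ship with the proxy, which would have turned the silent no-op described above into a visible failure.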

@brecht : thoughts?

Ray molenkamp self-assigned this 2021-06-03 21:27:03 +02:00
Author
Member

Changed status from 'Needs Triage' to: 'Confirmed'

Author
Member

Added subscribers: @erik85, @brecht, @LazyDodo


Added subscriber: @rjg


These are numbers from a test blend-file I have, baking a Mantaflow mesh. Total CPU time in all threads is 518s vs 481s. Of course this all depends on how memory intensive the operation is.

Attachments: bild.png (https://archive.blender.org/developer/F10156635/bild.png), bild.png (https://archive.blender.org/developer/F10156638/bild.png)

This comment was removed by @erik85

Member

Added subscriber: @EAW

Member

To be clear, #2 works for Windows 10 because there is no line in the manifest directing GetModuleHandle to point TBBMalloc_proxy inside of the blender.crt directory. Instead, GetModuleHandle points TBBMalloc_proxy to the loaded ucrtbase.dll inside of the system32 folder, right?

Upon first read, Option 2 came across as “we make this change, and nothing happens.” If the confusion is only on my end, I would like to say that it is 3am here, in my defense. 😴


Option 2 seems totally reasonable. One extra dll in the folder is no big deal.


This issue was referenced by 531e4fcf3ef5b9ab3cd3a04f2d8a24d967abfce5

This issue was referenced by db909281f2816063828019d47c22d731f7ee94b8

This issue was referenced by bfaf09b5bc97897eecf96cbc1b7aa46e6b38b4da
Author
Member

Changed status from 'Confirmed' to: 'Resolved'


> 1- Revert the DLL redirection
>
> We remove the manifest all together and move the 50 dll files back into the main blender folder, users have complained about the "mess" in the blender folder though.

Why not just put the executables and DLLs in a `bin` folder instead of the root?
Reference: blender/blender#88813