Cycles hybrid rendering + OIDN not detecting correct number of threads #85779

Closed
opened 2021-02-19 01:28:14 +01:00 by Daniel Salazar · 22 comments
Member

System Information
Operating system: Linux-5.8.0-7642-generic-x86_64-with-glibc2.32 64 Bits
Graphics card: GeForce GTX 1080/PCIe/SSE2 NVIDIA Corporation 4.5.0 NVIDIA 460.39

Blender Version
Broken: version: 2.93.0 Alpha, branch: master, commit date: 2021-02-18 19:59, hash: 27fd066baf

Short description of error
When using Cycles hybrid rendering (CPU+GPU) with Open Image Denoise, the number of rendering buckets should equal the number of CPU threads: n-1 for the CPU, plus 1 thread reserved for the GPU (confirmed by Lord @brecht).

Instead of using the number of available threads, only the number of available cores is used.

For example, in a 6 Core/12 Thread i7 + GPU configuration, we get 6 rendering buckets instead of 12 (11 CPU + 1 GPU).
Or, in a 4 Core/8 Thread i7 + GPU configuration, we get 4 rendering buckets instead of 8 (7 CPU + 1 GPU)
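
The expected split in the two examples above can be written out as a small calculation. This is only a sketch of the arithmetic described in this report; `expected_buckets` is a made-up name for illustration and is not a Blender or Cycles function.

```python
def expected_buckets(logical_threads: int, use_gpu: bool = True) -> dict:
    """Expected bucket split for hybrid rendering, as described in this report.

    Assumption: one CPU thread is reserved to feed the GPU, and the GPU
    itself accounts for one bucket.
    """
    if logical_threads < 1:
        raise ValueError("logical_threads must be at least 1")
    if not use_gpu:
        return {"cpu": logical_threads, "gpu": 0, "total": logical_threads}
    cpu = logical_threads - 1
    return {"cpu": cpu, "gpu": 1, "total": cpu + 1}


# The two machines from the report: (physical cores, logical threads).
for cores, threads in [(6, 12), (4, 8)]:
    split = expected_buckets(threads)
    print(f"{cores}C/{threads}T: expected {split['total']} buckets "
          f"({split['cpu']} CPU + {split['gpu']} GPU), observed {cores}")
```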

Disable OIDN and it goes back to normal.

Tested using CUDA on Linux and Windows

Author
Member

Added subscribers: @brecht, @zanqdo

Member

Added subscriber: @Alaska

Member

Changed status from 'Needs Triage' to: 'Needs User Info'

Member

Just wanted to add that I was personally unable to replicate this issue with:

System Information:
Operating system: Linux-5.10.0-3-amd64-x86_64-with-glibc2.31 64 Bits
Graphics card: GeForce RTX 3070/PCIe/SSE2 NVIDIA Corporation 4.5.0 NVIDIA 460.39
Central Processing Unit: Ryzen 9 5950X

Blender Version:
Broken: version: 2.93.0 Alpha, branch: master, commit date: 2021-02-18 19:59, hash: 27fd066baf

with CUDA or OptiX.

@zanqdo, it's possible something may be wrong with the build of Blender you have (or there may be an "issue" with the build of Blender I have). Are you able to test with a Blender version downloaded from https://builder.blender.org/download/ if you haven't already?

It's also possible you may have some add-ons installed that are affecting this? Maybe try disabling all add-ons by loading factory defaults (Located in the menu File -> Defaults -> Load Factory Defaults), then configuring your GPU+CPU setup in settings, then rendering and seeing if that fixes it?

Another possible contributor could be the .blend file. Inspect the Performance tab in the Render properties panel and make sure the Threads Mode is set to Auto.
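
The same check can also be done from Blender's Python console. The sketch below assumes the render settings object exposes `threads_mode` ('AUTO' or 'FIXED') and `threads`, as `bpy.context.scene.render` does; the helper name `check_threads_mode` is made up for illustration, and a stand-in object is used so the snippet also runs outside Blender.

```python
from types import SimpleNamespace


def check_threads_mode(render) -> str:
    """Describe the thread settings of a render-settings-like object.

    `render` is expected to look like bpy.context.scene.render, i.e. to
    have `threads_mode` and `threads` attributes.
    """
    if render.threads_mode == 'AUTO':
        return "Threads Mode is Auto"
    return f"Threads Mode is Fixed at {render.threads} threads"


# Inside Blender you would call: check_threads_mode(bpy.context.scene.render)
# Stand-in object (an assumption, only for running outside Blender):
fake_render = SimpleNamespace(threads_mode='FIXED', threads=6)
print(check_threads_mode(fake_render))
```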

And I know this is a huge stretch, but it could be possible that you have hyperthreading disabled in the BIOS? It's also possible some other external application like "Thread Lasso" is affecting things, however that doesn't explain why it occurs on multiple OS.
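
One rough way to see whether the operating system exposes more logical CPUs than physical cores is sketched below. It assumes Python is available on the machine; the physical core count relies on the optional third-party `psutil` package and is reported as `None` if that package is not installed. `cpu_counts` is a hypothetical helper name.

```python
import os


def cpu_counts() -> dict:
    """Return logical and (if available) physical CPU counts."""
    logical = os.cpu_count()  # logical CPUs; may be None if undetermined
    physical = None
    try:
        import psutil  # optional third-party dependency
        physical = psutil.cpu_count(logical=False)  # may also be None
    except ImportError:
        pass
    return {"logical": logical, "physical": physical}


counts = cpu_counts()
print(counts)
if counts["logical"] and counts["physical"]:
    if counts["logical"] > counts["physical"]:
        print("More logical CPUs than cores: SMT/hyperthreading looks enabled")
    else:
        print("Logical CPUs equal cores: SMT/hyperthreading may be disabled")
```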

I'm sorry, I'm just throwing ideas at the wall for why you may be experiencing this issue and seeing if any of them help.

Author
Member

The windows build is indeed downloaded from the builder.

Also Auto detection is set and it detects correctly: 12 in the Linux box and 8 in the Windows box.

When I switch to CPU mode, it will use all available threads.

I've also tested on factory defaults :)

Maybe it's an Intel only problem?

Member

In #85779#1115131, @zanqdo wrote:
Maybe it's an Intel only problem?

I really have no clue, and sadly I don't have an Intel CPU to test with. Will probably have to wait for others to report in on the issue.

In the meantime, are you able to run Blender with the debugger for Cycles? Here's how to do it on Linux:

  1. In the terminal run the command "/path/to/blender" --debug-cycles then go to a scene and render with CPU+GPU.
  2. After the scene has rendered for a few seconds, go back to the terminal and copy the output to a text file and upload it here. The most important part is probably the beginning, here's the beginning of mine:
I0219 22:26:26.839707 13939 blender_python.cpp:194] Debug flags initialized to:
CPU flags:
  AVX2       : True
  AVX        : True
  SSE4.1     : True
  SSE3       : True
  SSE2       : True
  BVH layout : EMBREE
  Split      : False
CUDA flags:
  Adaptive Compile : False
OptiX flags:
  CUDA streams : 1
OpenCL flags:
  Device type    : ALL
  Debug          : False
  Memory limit   : 0
I0219 22:26:33.289114 13939 device_cuda.cpp:41] CUEW initialization succeeded
I0219 22:26:33.289153 13939 device_cuda.cpp:43] Found precompiled kernels
I0219 22:26:33.360615 13939 device_cuda.cpp:176] Device has compute preemption or is not used for display.
I0219 22:26:33.360638 13939 device_cuda.cpp:179] Added device "GeForce RTX 3070" with id "CUDA_GeForce RTX 3070_0000:0a:00".
I0219 22:26:47.106143 14027 device.cpp:637] CPU render threads reduced from 32 to 31, to dedicate to GPU.
I0219 22:26:47.106189 14027 device_cpu.cpp:137] Will be using AVX2 kernels.
I0219 22:26:47.106487 14027 device_cuda_impl.cpp:714] Mapped host memory limit set to 29,382,348,800 bytes. (27.36G)
I0219 22:26:47.238039 14027 device.cpp:637] CPU render threads reduced from 32 to 31, to dedicate to GPU.
I0219 22:26:47.238111 14027 device.cpp:637] CPU render threads reduced from 32 to 31, to dedicate to GPU.
I0219 22:26:47.238687 14027 blender_sync.cpp:254] Total time spent synchronizing data: 0.000267982

In here you can see messages saying CPU render threads reduced from 32 to 31, to dedicate to GPU. Your Blender may be dedicating more threads to GPU than expected? Not sure.
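
To compare outputs between machines, the relevant lines can be pulled out of a saved log with a short script. This is a minimal sketch that assumes the message wording matches the output shown above; `thread_reductions` is a hypothetical helper, not part of Blender.

```python
import re

# Matches the message format seen in the --debug-cycles output above.
PATTERN = re.compile(r"CPU render threads reduced from (\d+) to (\d+)")


def thread_reductions(log_text: str) -> list:
    """Return (before, after) thread counts for each matching log line."""
    return [(int(a), int(b)) for a, b in PATTERN.findall(log_text)]


sample = (
    "I0219 22:26:47.106143 14027 device.cpp:637] CPU render threads "
    "reduced from 32 to 31, to dedicate to GPU.\n"
    "I0219 22:26:47.106189 14027 device_cpu.cpp:137] Will be using AVX2 kernels.\n"
)
print(thread_reductions(sample))
```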

I'm sorry, I'm not a developer and I don't know enough about Blender to properly look into this. I'm just trying to gather information so that when a developer does look into it, they can quickly rule out some things.

Member

It's also implied that you are building Blender from source code on your Linux machine. If so, it may be possible to bisect the exact commit that causes the issue for you.
But first, you should make sure that it is caused by a recent change in Blender. The easiest way is to download Blender 2.91.2 or 2.83.X and test with that. https://www.blender.org/download/

If those versions of Blender aren't working, then it's possibly something wrong with your system? Have a look into doing a clean OS install and such?

If those versions of Blender are working as expected, then you can go onto finding the commit that's causing the issue for you. This can be done by bisecting.

Simple instructions are given for bisecting in the triaging playbook (https://wiki.blender.org/wiki/Process/Bug_Reports/Triaging_Playbook):

git bisect start

git bisect bad HEAD

git bisect good "GOOD-COMMIT"
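
For anyone unfamiliar with bisecting, the idea behind these commands is a binary search over the commit history. The sketch below illustrates that idea only; it is not how git is implemented, and `first_bad_commit` and the commit names are made up for the example.

```python
def first_bad_commit(commits, is_bad):
    """Find the first bad commit by binary search.

    `commits` is ordered oldest to newest; commits[0] is assumed to be
    known good and commits[-1] known bad, mirroring `git bisect good`
    and `git bisect bad`.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):  # corresponds to building and testing
            bad = mid
        else:
            good = mid
    return commits[bad]


# Hypothetical history of 10 commits where the bug appears at "c6".
commits = [f"c{i}" for i in range(10)]
print(first_bad_commit(commits, lambda c: int(c[1:]) >= 6))
```

Each step halves the remaining range, so only a handful of builds are needed even across hundreds of commits.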

For questions and assistance with that, check #blender-coders on blender.chat and point anyone to this reply.

Or I may be able to offer assistance.


Added subscriber: @ephraimpauli


Do you render with Open Image Denoise?
That's what I had the problem with. It rendered with only half as many threads as possible. But with CPU only, everything worked. My solution was the Denoise node in the Compositor.

Member

I can confirm what @ephraimpauli said, enabling Open Image Denoise in the render settings limits threads to half that of the CPU when rendering with GPU compute or hybrid rendering.

@zanqdo can you confirm this?

If this is the issue you're experiencing, then this seems to be expected? In this file (https://developer.blender.org/diffusion/B/browse/master/intern/cycles/blender/blender_sync.cpp$904) there is a comment that explains that Blender should "Add additional denoising devices if we are rendering and denoising with different devices." That probably includes dedicating half the thread count to denoising when rendering with a different device (e.g. GPU compute, even if it includes the CPU via hybrid rendering).

Author
Member

In #85779#1115472, @ephraimpauli wrote:
Do you render with Open Image Denoise?
That's what I had the problem with. It rendered with only half as many threads as possible. But with CPU only, everything worked. My solution was the Denoise node in the Compositor.

OMG That was it! Is this intentional?

This totally kills the performance without the user having any clue of the cause.

Daniel Salazar changed title from Cycles hybrid rendering not detecting correct number of threads to Cycles hybrid rendering + OID not detecting correct number of threads 2021-02-19 21:11:03 +01:00
Evan Wilson changed title from Cycles hybrid rendering + OID not detecting correct number of threads to Cycles hybrid rendering + OIDN not detecting correct number of threads 2021-02-22 20:22:39 +01:00
Member

Added subscriber: @lichtwerk

Member

Changed status from 'Needs User Info' to: 'Confirmed'

Member

Also seeing this.
Not totally sure if this has an underlying reason, but if so, should be reflected in the UI somehow I guess.


Added subscriber: @DanielSilva-1


Getting the same problem. Tried forcing more threads manually, but it didn't work. Same behaviour: it uses the number of cores and not the number of threads.

Member

Added subscriber: @ChengduLittleA

Member

With OpenImageDenoise on it's the same problem. The UI shows 8 threads but only 4 were used during render.


This issue was referenced by blender/cycles@94f375d56c1d0fd4184115f56cbf8626820dfb37

This issue was referenced by abc3128011b484c270701211b40831d11c8ac44b

Changed status from 'Confirmed' to: 'Resolved'

Brecht Van Lommel self-assigned this 2021-10-19 11:43:39 +02:00

In #85779#1115623, @Alaska wrote:
If this is the issue you're experiencing, then this seems to be expected? In this file (https://developer.blender.org/diffusion/B/browse/master/intern/cycles/blender/blender_sync.cpp$904) there is a comment that explains that Blender should Add additional denoising devices if we are rendering and denoising with different devices. Which probably includes adding half the thread count to denoising when rendering with a different device (E.G. GPU compute, even if it includes CPU via hybrid rendering).

To be clear, this analysis is wrong; there's no logic in Cycles to dedicate half the threads to denoising.

Reference: blender/blender#85779