Cycles: multi-device rendering performance #89833
Reference: blender/blender#89833
Initial work on multi-device rendering has been done, bringing all the required building blocks into place. However, the actual balancing of the amount of work across the devices still needs to be improved.
There are two main issues with the current logic:
Tweaking headless scheduling is simple, but from my own experiments it is really important to schedule more or less equally complex pixels to the devices; otherwise the balance will not converge fast enough (or will never converge).
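To illustrate why equally complex pixels matter for convergence, here is a toy model of throughput-based rebalancing. It is only a sketch under stated assumptions, not the logic Cycles uses: the image is split into contiguous row slices, each device's time is its slice cost divided by a made-up speed, and the new weight is proportional to old weight divided by time. All function names are hypothetical.

```python
def split_rows(weights, num_rows):
    """Convert fractional weights into contiguous [start, end) row ranges."""
    bounds = [0]
    acc = 0.0
    for w in weights[:-1]:
        acc += w
        bounds.append(round(acc * num_rows))
    bounds.append(num_rows)
    return list(zip(bounds[:-1], bounds[1:]))


def slice_times(weights, row_cost, speeds):
    """Time per device: summed row cost of its slice divided by device speed."""
    ranges = split_rows(weights, len(row_cost))
    return [sum(row_cost[a:b]) / s for (a, b), s in zip(ranges, speeds)]


def rebalance(weights, times):
    """Assumed update rule: new weight proportional to old weight / time taken."""
    raw = [w / t if t > 0 else w for w, t in zip(weights, times)]
    total = sum(raw)
    return [r / total for r in raw]


def run(row_cost, speeds, steps):
    """Start from an even split and rebalance after every simulated sample."""
    weights = [1.0 / len(speeds)] * len(speeds)
    history = []
    for _ in range(steps):
        times = slice_times(weights, row_cost, speeds)
        history.append(times)
        weights = rebalance(weights, times)
    return weights, history


if __name__ == "__main__":
    # Uniform image, one device three times faster than the other.
    print(run([1.0] * 1000, [1.0, 3.0], 3)[1])
    # Two equal devices, but the bottom half of the image is 9x more expensive.
    print(run([1.0] * 500 + [9.0] * 500, [1.0, 1.0], 6)[1])
```

In this model the uniform image is balanced after a single update, while the image with uneven complexity overshoots and needs several more updates, because the measured throughput of a slice stops being a good predictor once the slice boundary moves into pixels of different cost.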
There are a few ideas to try:
The best scene to demonstrate the shortcomings of the current approach is pabellon.blend in an F12 render (it behaves OK in headless mode, because a half-decent balance after the first sample is good enough, and subsequent re-balancing does not happen often and hence does not hurt performance).
Various open related topics:
Added subscriber: @Sergey
This issue was referenced by 289a173d938f26114a837bb34742c4ee6792d0ff
Added subscriber: @SteffenD
Some of the logic has been tweaked in 289a173d93, which helped a lot to avoid a huge speed regression in the Pabellon scene when a fast GPU is used together with a slow CPU.
There is still some performance penalty for such an unbalanced configuration. Not sure what the proper solution for that would be. In a way, tile-based rendering dealt with such configurations better (smaller schedule units combined with work stealing), but it also did not keep the GPUs really busy. So there is a tradeoff between reacting quickly enough to an unbalanced configuration and keeping the devices always occupied.
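The tradeoff can be illustrated with a small simulation. This is a simplified sketch, not the old tile scheduler: devices pull equal-sized tiles from one shared queue (a stand-in for work stealing), and a fixed per-tile overhead stands in for the cost of not keeping a device saturated. The function name, speeds, and overhead values are all hypothetical.

```python
import heapq


def simulate_shared_queue(total_work, num_tiles, speeds, overhead):
    """Return the finish time when devices greedily pull tiles from one queue.

    Each tile costs tile_work / speed plus a fixed per-tile overhead.
    """
    tile_work = total_work / num_tiles
    # Min-heap of (time at which the device becomes free, device index).
    free_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(num_tiles):
        t, i = heapq.heappop(free_at)
        t += tile_work / speeds[i] + overhead
        finish = max(finish, t)
        heapq.heappush(free_at, (t, i))
    return finish


if __name__ == "__main__":
    speeds = [1.0, 9.0]  # assumed slow CPU and fast GPU
    for tiles in (2, 100):
        for overhead in (0.0, 1.0):
            print(tiles, overhead,
                  simulate_shared_queue(100.0, tiles, speeds, overhead))
```

With zero overhead, many small tiles adapt to the speed difference far better than two big ones. Once the per-tile overhead is significant, the small tiles become the slower option, which is the same tension described above.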
It would be interesting to test whether the observed slowdown is more of a constant time (a time penalty for balancing things out during the first samples of the render, which means the percentage penalty goes down as more samples are added) or a constant percentage of the overall render time (which means the percentage penalty does not go down as more samples are added).
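One possible way to tell the two cases apart is to measure the penalty (multi-device time minus the expected ideal time) at several sample counts and fit a straight line. The sketch below assumes such measurements exist; the numbers in it are synthetic placeholders generated by formula, not results, and `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    samples = [64, 128, 256, 512]
    # Synthetic case A: penalty is a fixed startup cost (seconds).
    constant_penalty = [5.0 for _ in samples]
    # Synthetic case B: penalty grows with the sample count.
    proportional_penalty = [0.1 * s for s in samples]
    print(fit_line(samples, constant_penalty))
    print(fit_line(samples, proportional_penalty))
```

A slope near zero with a positive intercept would point to a one-off balancing cost, while a clearly positive slope would point to a penalty that scales with render time.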
Added subscriber: @ParallelMayhem
Added subscriber: @ericspeer70
Thanks for all the hard work everyone!
Now that we are doing 'Multi-device rendering' again...
Can we please include an option for animations to do one frame per device?
I think this will improve performance as a faster device can do multiple frames while the slower devices plod along with their single frames.
Thanks for taking time to listen. Take care and stay safe, E:)
That's an interesting idea, but it is outside the scope of the current Cycles X development. It is also something that is usually implemented as part of render farm software.
Okay Sergey thanks for the reply.
I have raised the issue with the folks at CrowdRender.
But they are only able to do Tile Sharing, which, as was previously mentioned, is precarious due to the variation in complexity from one frame to the next.
A lot of wasted potential.
The reason for my initial post was because previously Blender had its own Network Render ability built in and I was hoping that might be the case soon again.
Thanks again for all the talent and inspiration, E:)
Added subscriber: @brecht
From the latest meeting notes:
https://devtalk.blender.org/t/2021-09-14-blender-rendering-meeting/20469
The cause of this is likely different than the issues mentioned in this task, but we have to find out what it is exactly.
A quick way to check whether it is a different problem or not is to disable rebalancing (should be as easy as an early return in PathTrace::rebalance) and render a uniform image (so that the complexity of the slices is roughly the same).
It could also be a non-ideal tile size calculated for a narrow slice in tile_calculate_best_size. The easiest test, I think, would be to render a higher-resolution image to force the slices to be taller.
It could also be something outside of our control, like a lock in the driver which we hit much more often with all the micro-kernel enqueuing.
Added subscriber: @Raimund58
Added subscriber: @2046411367
Added subscriber: @MilanJaros
This comment was removed by @MilanJaros
Added subscriber: @Sayak-Biswas
Added subscriber: @easythrees
Title changed from "Cycles X - Multi-device rendering performance" to "Cycles: multi-device rendering performance"
Added subscriber: @lrevardel
Added subscriber: @Garek
Added subscriber: @AndrewPrice
Any updates on this?
Some of us spent 20K buying 4 3090s and would love to use it in a single Blender instance 😅
There is some work-in-progress development in D14014 and D14083 (with the corresponding task for the latter being #95687).
Both have some shortcomings, and finding the best solution is not trivial and takes time.
Thanks for sharing the WIPs. I'll track the progress there.
Added subscriber: @Eki-Oshri