Utilization of 2 or more GPUs/GTX750ti #40027

Closed
opened 2014-05-04 23:03:02 +02:00 by Rolf · 17 comments

System Information
Windows 7 64bit
Intel i7-920 3.5 GHz, 16 GB, 2x GTX 750 Ti

Blender Version
Broken: 2.70.xx, tested on fd80ac4
Worked: 2.69.11 eb4f2b4

Short description of error
When using more than one GPU, the utilization of both cards is poor.
gpu_load.jpg (https://archive.blender.org/developer/F86936/gpu_load.jpg)
In 2.69.11 eb4f2b4, I get a speedup of ~90%-97%.
In 2.70.5 fd80ac4, only ~33%-60%.
I think it has to do with this commit: 1d016758

I ran tests on Blenderartists using mib2berlin's Cornellbox 2.7x benchmark.
http://blenderartists.org/forum/showthread.php?327909-Cycles-NVidia-MAXWELL-Benchmarks&p=2639500&viewfull=1#post2639500

Exact steps for others to reproduce the error
Render with 2 GPUs and look at the utilization of both cards.
www.blenderartists.org/forum/attachment.php?attachmentid=297482&d=1395468047

Author

Changed status to: 'Open'

Martijn Berger was assigned by Rolf 2014-05-04 23:03:02 +02:00
Author

Added subscriber: @RolfJawarsch

Added subscriber: @ThomasDinges

Member

Are both cards dedicated to rendering, or is one driving the display?

We use different CUDA commands to try to keep the card as busy as possible, depending on this.

How big is your tile size, and how fast does one tile complete?

The asynchronous path (2nd card and beyond) depends on scheduling just enough work that the GPU comes back about once every second. For the 1st card (only if it drives the display), one CPU core is tasked with busy-waiting for the card to finish.
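
To illustrate the "just enough work so the GPU comes back about every 1 second" idea, here is a minimal sketch in Python. It is not the Cycles code: the function names, the doubling fallback, and the simulated timing are all assumptions made for illustration. It only shows how the amount of work per launch could be rescaled from the measured time of the previous launch.

```python
def next_batch_size(current_batch, elapsed_seconds, target_seconds=1.0,
                    min_batch=1, max_batch=1 << 20):
    """Rescale the work per launch so one launch takes about target_seconds.

    Hypothetical helper: current_batch is the number of work items in the
    last launch and elapsed_seconds is how long that launch took.
    """
    if elapsed_seconds <= 0:
        # No usable timing yet, so grow cautiously instead of dividing by zero.
        return min(max_batch, current_batch * 2)
    scaled = round(current_batch * target_seconds / elapsed_seconds)
    return max(min_batch, min(max_batch, scaled))


def simulate(cost_per_item, launches=5, start_batch=16):
    """Pretend each work item costs cost_per_item seconds on the device."""
    batch = start_batch
    history = []
    for _ in range(launches):
        elapsed = batch * cost_per_item  # stand-in for a measured launch time
        history.append((batch, elapsed))
        batch = next_batch_size(batch, elapsed)
    return history
```

With a simulated cost of 0.25 s per item, `simulate(0.25)` starts at 16 items (4 s per launch) and settles at 4 items per launch, which is the 1 second target.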

Author

Yes, one card is driving the display.

The tile size in this case was 256x256, but it also happens with 128x128, 512x256, etc.
The larger the tile size, the worse the utilization.
The smaller the tile size, the longer the overall render time.

With 2x GPU plus the busy-wait commit, each tile's render time differs from one render attempt to the next, so it is hard to pin down, but I've tested it.
Mib's Cornellbox for 2.7
2.70.5 fd80ac4

Rendertimes 2:30-2:42 4 Tiles@256x256 2xGPU
1.Top right: ~1:02 GPU1
2.Bottom right: ~1:19 GPU2
3.Top left: ~1:23 GPU1
4.Bottom left: ~1:22 GPU2

Rendertimes 3:44 4 Tiles@256x256 1xGPU
1.Top right: ~0:52
2.Bottom right: ~0:55
3.Top left: ~0:56
4.Bottom left: ~1:02

2.70.5 fd80ac4 without busywait commit
Rendertimes 1:58 4 Tiles@256x256 2xGPU
1.Top right: ~0:55 GPU1
2.Bottom right: ~0:56 GPU2
3.Top left: ~0:55 GPU1
4.Bottom left: ~1:03 GPU2

Rendertimes 3:52 4 Tiles@256x256 1xGPU
1.Top right: ~0:56
2.Bottom right: ~0:56
3.Top left: ~0:57
4.Bottom left: ~1:03
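
For anyone comparing these numbers, here is a small Python sketch that turns the total render times quoted above into "extra speedup from the second GPU" percentages. The helper names are hypothetical, and it assumes the percentages in the report mean time gained relative to one GPU (100% would be exactly twice as fast).

```python
def to_seconds(mmss):
    """Convert a 'm:ss' render time string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def extra_speedup_percent(one_gpu, two_gpu):
    """Percent gained by adding the second GPU; 100 means twice as fast."""
    return (to_seconds(one_gpu) / to_seconds(two_gpu) - 1.0) * 100.0


# Total render times from the lists above (4 tiles @ 256x256).
with_busywait = [extra_speedup_percent("3:44", t) for t in ("2:30", "2:42")]
without_busywait = extra_speedup_percent("3:52", "1:58")
```

Under that assumption, the build with the busy-wait commit comes out at roughly 38% to 49%, and the build without it at about 97%.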

Member

Added subscriber: @brecht

Member

I am not getting this behavior, but thanks for the detailed info.

@ThomasDinges I think I need to talk with @brecht to see if we can further reduce the sync events, and/or whether driving one GPU with the sync API and another with the async API might be the cause.

Member

Added subscriber: @plasmasolutions

Added subscriber: @Tibi-4

Hi,

I’ve tested with two 750Ti cards not connected to any display, blender-2.70-30361a7-win64-vc12 and I get the following times for the cornell_bench_27:
One card 4’19"
Two cards 2’11"

Cheers!

Member

It looks to me like mixing the synchronous and the asynchronous API has some additional downside.
This also implies that the current state is not really a working one.

Added subscriber: @MartijnBerger

@MartijnBerger: do you think reverting 39bfde674c could solve the problem?

I'm not sure what else we can do here besides reverting the entire async system; I don't have the hardware to test this properly. And there are other complaints about updates taking too long, for which a more advanced system is probably needed.

Member

@brecht: Yes, I think we need a more advanced system. But the current situation is worse than just doing busy-waiting. Mixing both seems to have very undesirable effects, and doing a proper fix requires more time and testing than I currently have.

I would just rip out the whole async handling for now. I'll re-add it to my long-term to-do list.

This issue was referenced by 3b53fffb7788e19a0d05b8549aadbadf49279ca2

Changed status from 'Open' to: 'Resolved'

Closed by commit 3b53fffb77.

Reference: blender/blender#40027