Blender 2.79 vs 2.8 - Selection is significantly slower in production scenes #62511

Closed
opened 2019-03-12 22:26:57 +01:00 by Carlo Andreacchio · 27 comments

System Information
Operating system: Ubuntu 18.04
Graphics card: GTX 1080

Blender Version
Broken: 41cb565880

Short description of error
Viewport Selection is significantly slower in production scenes

Exact steps for others to reproduce the error

  1. Open the attached blend file in Blender 2.79.
  2. Select the different cubes.
  3. Feel how snappy it is.
  4. Switch to Blender 2.8.
  5. Select the different cubes.
  6. Notice how it lags.

[slowdown-2.blend](https://archive.blender.org/developer/F6811442/slowdown-2.blend)

Added subscriber: @candreacchio

Added subscriber: @ZedDB

Clément Foucault was assigned by Sebastian Parborg 2019-03-13 13:56:38 +01:00

I'm guessing that 2.79 is better at instancing particle geometry. (2.79 also only reports the number of faces on the original particle object while 2.8 reports that the scene has 3,369,273,280 tris).

I should note that the tri count in this scene is only part of the actual production scene we had, which is over 100,000,000,000 tris. This is not uncommon for us, as most of those tris are heavily instanced, and the amount of GPU RAM required to render the scene was around 2.5 GB.

We use Blender for infrastructure visualisation, such as roads and bridges, which span a very large area. This report and #62512 were both created to help make Blender 2.8 usable for our workflow.

Clément Foucault was unassigned by Brecht Van Lommel 2019-03-26 14:02:12 +01:00
Jeroen Bakker was assigned by Brecht Van Lommel 2019-03-26 14:02:12 +01:00

Added subscriber: @fclem

Added subscriber: @GavinScott

Jeroen Bakker removed their assignment 2019-04-03 12:30:00 +02:00
Clément Foucault was assigned by Jeroen Bakker 2019-04-03 12:30:00 +02:00

Added subscriber: @Jeroen-Bakker

It seems most time is spent drawing the face wires (face_wireframe_pass). There is a huge performance difference between regular drawing and select drawing for the face_wireframe_pass.
Also, select drawing uses a geometry shader, which is slow. But even with the geometry shader disabled there is a huge performance difference (4x slower).
@fclem do you have an idea?

The major bottleneck is the use of ALGO_GL_PICK instead of ALGO_GL_QUERY. I don't know why it was changed (and even removed from the option menu?) but it leads to a major performance drop.

After that there is still a performance difference in this case, because the scene is still more than 4 times slower to draw in 2.8 and (in my tests) selecting with ALGO_GL_QUERY is more or less 4 times slower in 2.8.

Added subscriber: @brecht

OpenGL Depth Picking was enabled by default, see #59155. The option to turn it off is still there.

Basically everyone at the studio here was already using it; selection without it is too unreliable.

Maybe there's a way to make it faster? I'm not familiar with the implementation details.

Basically it uses glReadPixels for every draw call, which creates a GPU-CPU bubble. This scene has many instances and they are treated as individual objects (hence the slowdown).

Ok, that's pretty terrible. There must be a better algorithm we can find that takes into account depth without doing that for every draw call.

Added subscriber: @Astiero

Added subscriber: @ideasman42

Quickly checked on this and it looks like culling isn't working properly.

There is only a very small region so it should only be reading the buffer for objects directly under the cursor.

Looking in `draw_shgroup`, it's not culling objects, so all objects in the 3D view cause buffer reads, even if they aren't near the mouse cursor.
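
To illustrate the culling point, here is a minimal sketch in C. It is a 2D screen-space simplification, not Blender's actual culling code: the `ScreenRect` type and both function names are hypothetical, and it assumes each object's projected bounds are already known. Only objects whose bounds overlap the small pick region around the cursor would need a buffer read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical screen-space rectangle: min is inclusive, max is exclusive. */
typedef struct {
    int xmin, ymin, xmax, ymax;
} ScreenRect;

/* Returns 1 when the two rectangles share at least one pixel. */
int screen_rect_overlaps(const ScreenRect *a, const ScreenRect *b)
{
    return a->xmin < b->xmax && b->xmin < a->xmax &&
           a->ymin < b->ymax && b->ymin < a->ymax;
}

/* Counts the objects that would still need a buffer read once everything
 * outside the pick region is culled. */
size_t count_objects_needing_read(const ScreenRect *pick,
                                  const ScreenRect *bounds,
                                  size_t len)
{
    assert(pick != NULL && (bounds != NULL || len == 0));
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (screen_rect_overlaps(pick, &bounds[i])) {
            count++;
        }
    }
    return count;
}
```

With a pick region of a few pixels, the count should stay close to the number of objects actually under the cursor, regardless of how many objects are visible in the view.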


@brecht suggests that picking could draw color IDs; clicks after the first hit would skip drawing the previously selected objects, which would allow cycling through objects behind the first hit.

This would be much faster but would not work for showing objects in the menu, which could still use depth picking.
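
The color ID idea can be sketched roughly as follows. This is a CPU-only simulation under stated assumptions, not an implementation from Blender: all function names are hypothetical, id 0 is assumed to be reserved for "no hit", and the objects under the cursor are assumed to be given nearest first. Each id is packed into one RGBA pixel, and objects hit on earlier clicks are skipped, so the nearest remaining object wins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packs a 32-bit selection id into the four 8-bit channels of one pixel. */
void id_to_rgba(uint32_t id, uint8_t rgba[4])
{
    rgba[0] = (uint8_t)(id & 0xFF);
    rgba[1] = (uint8_t)((id >> 8) & 0xFF);
    rgba[2] = (uint8_t)((id >> 16) & 0xFF);
    rgba[3] = (uint8_t)((id >> 24) & 0xFF);
}

/* Recovers the selection id from a pixel written by id_to_rgba(). */
uint32_t rgba_to_id(const uint8_t rgba[4])
{
    return (uint32_t)rgba[0] | ((uint32_t)rgba[1] << 8) |
           ((uint32_t)rgba[2] << 16) | ((uint32_t)rgba[3] << 24);
}

/* Simulates one click. `under_cursor` lists ids nearest first; ids in
 * `skip` were hit by earlier clicks and are not drawn this time.
 * Returns 0 (assumed to mean "no hit") when nothing is left. */
uint32_t pick_skipping(const uint32_t *under_cursor, size_t len,
                       const uint32_t *skip, size_t skip_len)
{
    assert(under_cursor != NULL || len == 0);
    assert(skip != NULL || skip_len == 0);
    for (size_t i = 0; i < len; i++) {
        int skipped = 0;
        for (size_t j = 0; j < skip_len; j++) {
            if (under_cursor[i] == skip[j]) {
                skipped = 1;
                break;
            }
        }
        if (!skipped) {
            /* Nearest drawn object: its color is what a single pixel
             * read under the cursor would return. */
            uint8_t pixel[4];
            id_to_rgba(under_cursor[i], pixel);
            return rgba_to_id(pixel);
        }
    }
    return 0;
}
```

Repeated clicks at the same spot would grow the skip list by one id each time, which gives the cycling behaviour described above with a single pixel read per click.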

@ideasman42 I propose this workaround, but I'm not 100% sure it will have no impact on any of the selection modes.

diff --git a/source/blender/gpu/intern/gpu_select_pick.c b/source/blender/gpu/intern/gpu_select_pick.c
index 6c3e05912b0..fc9d0eb6af3 100644
--- a/source/blender/gpu/intern/gpu_select_pick.c
+++ b/source/blender/gpu/intern/gpu_select_pick.c
@@ -479,7 +479,14 @@ static void gpu_select_load_id_pass_nearest(const DepthBufCache *rect_prev,
 bool gpu_select_pick_load_id(uint id)
 {
   GPUPickState *ps = &g_pick_state;
+
   if (ps->gl.is_init) {
+    if (id == ps->gl.prev_id) {
+      /* No need to read if we are still drawing for the same id
+       * since all these depth will be merged / deduplicated in the end. */
+      return true;
+    }
+
     const uint rect_len = ps->src.rect_len;
     glReadPixels(UNPACK4(ps->gl.clip_readpixels),
                  GL_DEPTH_COMPONENT,

That said, I fixed the culling issue in 905f2d84. The other culling issue in this scene is that the instances don't seem to get culled.

So just to be clear, this is an optimization for the instancing case, where you can have multiple objects with the same selection id drawn one after another. It looks correct to me.

I guess it's a workaround in the sense that it's a good optimization for the current algorithm, but the entire algorithm would have to be different to support efficient selection of many individual objects. No reason not to commit it now I think.
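
The effect of the `prev_id` check can be illustrated with a small simulation. This is a sketch under stated assumptions, not the code from `gpu_select_pick.c`: the `PickSim` struct and both functions are hypothetical, the readback is replaced by a counter, and the depth comparison the real function performs is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the pick state. The expensive glReadPixels()
 * call is replaced by a counter. */
typedef struct {
    int is_init;       /* set once the first id has been loaded */
    uint32_t prev_id;  /* selection id of the previous draw call */
    size_t read_count; /* number of simulated readbacks */
} PickSim;

/* Follows the shape of the patched function: when the id is unchanged
 * since the previous call, the readback is skipped. */
void pick_sim_load_id(PickSim *ps, uint32_t id, int skip_repeats)
{
    assert(ps != NULL);
    if (ps->is_init) {
        if (skip_repeats && id == ps->prev_id) {
            return; /* same id: no new readback needed */
        }
        ps->read_count++; /* simulated readback */
    }
    ps->prev_id = id;
    ps->is_init = 1;
}

/* Replays a sequence of draw-call ids and returns the readback count. */
size_t count_reads(const uint32_t *ids, size_t len, int skip_repeats)
{
    assert(ids != NULL || len == 0);
    PickSim ps = {0, 0, 0};
    for (size_t i = 0; i < len; i++) {
        pick_sim_load_id(&ps, ids[i], skip_repeats);
    }
    return ps.read_count;
}
```

For a run of instances sharing one id, the check reduces the readbacks for that run to one. It does not help when every draw call carries a distinct id, which matches the point that many individual objects would need a different algorithm.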

This issue was referenced by 86914e7133

Changed status from 'Open' to: 'Resolved'

Much better! But it's still something like 1,000 times slower than 2.79 in the example scene provided, resulting in ten seconds of 100% GPU utilization to switch selection between the two cubes. Grabbing one and moving it around is similarly slow, whereas these operations are almost instantaneous in 2.79.

I wouldn't call this issue completely resolved.

In the examples I've seen the issue has always been in scenes with something like an instanced particle system that results in on the order of 10,000 "objects" in the scene. Is there an opportunity to optionally filter such things out in selection, like adding them to the Object Types Visibility filters somehow?

Changed status from 'Resolved' to: 'Open'

Thanks for all the developer work so far. It has improved, but not to the level that it should be.

I would like you to test with the slowdown scene attached and see if this scene is acceptable by your standards for artists to work with. Compare this to how 2.79 interacts.

This scene was based on a production scene and isolated to just the particle system. The full scene had about 3-5x the complexity.

I am sorry that I have reopened the bug, but given the high impact it has on our workflow, it is not appropriate to call this done.

@GavinScott reported on Blender chat that recent changes to the draw manager did improve the situation in his case. @candreacchio can you confirm it has?

Added subscriber: @SteffenD

Changed status from 'Open' to: 'Resolved'

Can confirm the fix on my machine, seems like it is back to 2.79 levels. Thanks for all your help!

Reference: blender/blender#62511