EEVEE Next: Blender freezing when render mode is selected #114597
Reference: blender/blender#114597
System Information
CPU: i3-8130U, with 12 GB RAM
Laptop: Aspire A315-53
Operating system: Win 11 Home
Blender Version
Broken: version: 4.1.0 Alpha, branch: main, commit date: 2023-11-07 11:49, hash: 869372ffc335
Worked: version: 4.1.0 Alpha, commit date: 2023-10-08 17:52, hash: a3831fe7af80
Short description of error
If I select EEVEE Next, and select the Render mode, even the default cube scene "almost" freezes. Blender responds to even the smallest input only after a few seconds.
(e.g. panning, zooming, etc.)
Video, with throbber (to indicate the freeze) and sound included:
Exact steps for others to reproduce the error
That's it.
Title changed from "Blender slows down to almost freezing when EEVEE Next (now renamed to "EEVEE") is selected and Render mode is selected" to "EEVEE Next: Blender freezing when render mode is selected"
Hi, thanks for the report. I'm able to confirm this occasionally, but this may not be a bug. I suspect either shaders are compiling or render initialization caused it.
cc @Jeroen-Bakker
IMHO there is no delay in (a) compiling shaders or (b) render initialization, because the status bar does not reflect that some task is in progress. Secondly, I waited for a few minutes after starting Blender to let all background processes finish, but that did not help.
I have restarted the laptop and even reinstalled Blender, but the issue is consistent.
Multiple versions of 4.1 Alpha have the same problem (I check twice a week).
So this is unlikely to be due to a compilation error.
I had closed the bug by mistake (clicked on the wrong button).
Reopened.
I can reproduce on my system. EEVEE-Next is not optimized for these kinds of devices. We are looking into solutions to ensure it will be workable even on these low-end devices. I got around 1 fps when using EEVEE-Next. Using Linux stalls Blender.
Would need to do more research in order to find the real cause and potential knobs to turn. I expect that the iGPUs are not very fast in compute tasks.
Great!
So, if there are any debug versions to be run, please do let me know.
Also, I can provide data such as resources needed for a thread.
Thanks!
The original description of this issue stated that Blender "responded slowly" and took a few seconds. The way the author seems to have meant it, it "almost" freezes in the sense that it is very laggy. Jeroen wrote above that he gets 1 FPS and stalls on Linux.
In the issue I have, Blender responds immediately as shown in the video when playing sound, has 0 FPS, it produces no new image whatsoever, waiting does not produce something new, and it does not stall on Linux, it just flat out doesn't happen on Linux and everything works fine (at least on my Ubuntu).
Actually, the report at the top is NOT my original version.
I did not post that video with a "throbber".
(Interesting idea! I'd love to know how to do it.)
In my own words, it would be too charitable to describe it as "laggy performance".
I'd describe it as "Blender comes to a grinding halt!"
Even for a minor input like panning and zooming, it freezes for 10-20 seconds.
I cannot even think of doing anything simple like adding loop cuts.
No one is going to stare at the screen for hours and wait for something to break loose!
Anyway, in any bug's life, the original author's description hardly matters.
The investigators' remarks are more important.
In this case, Jeroen has already spotted the root cause: EEVEE is not optimized for basic hardware+OS.
He has already written that developers are working on optimizing EEVEE for low-end hardware.
My laptop is basic, with Windows 11 and a basic GPU chipset (Intel 620 UHD).
(In my original report, I had mentioned this prominently, but that info has vanished.)
People with Linux and better GPUs would have a better experience.
But we should stay focused on the [Windows+low-end hardware] combination.
[OT] In the response trail, some people have mentioned FPS and throbber.
How do we use these diagnostic tools?
It would be great if an online guide is available, so that we can use these tools effectively!
My issue was closed and bits of it inserted into yours. This is likely because on the surface they appear similar (both have "freezing"). They are completely different issues. You are adding more information about how they are different. I was hoping Germano would take a second look at this issue, especially since I can't seem to mention him.
My issue is not a performance issue. As I said, for me everything works fine on Linux, and on Windows Blender does NOT perform (particularly) poorly. It responds to input within the same frame of the video, as you can see from the mouse overlay and simultaneously hear. And my issue has specific conditions to trigger it, it needs a specific scene, it needs a specific viewport size without resizing. There is no 1FPS big lag when those conditions are not met, like making a small viewport then resizing it to be bigger. Yours is a performance issue, you say Blender takes seconds to respond, and Jeroen explicitly notes low performance.
The A770 is not low-end hardware. I'm sure anyone would do a double take if they were putting a Vega 5 and RX 6700 XT in the same issue about having 1 FPS. Let's do a double take about putting the A770 next to the UHD 620.
A throbber is the looping animation that signifies loading or some such action, without indicating progress. It can be a spinning circle, or something that throbs a bit more literally, like three dots that shrink and grow in size. https://en.wikipedia.org/wiki/Throbber
I have included a throbber because the whole OS freezes for a few seconds, once. Not 1 FPS, not "it responds after 15-20 seconds" or anything like that. A video from the PC might not be the most accurate way to tell, but I'm not a liar at a murder trial. I say the OS freezes and it does.
This is because there was already an issue similar to this (OS freezing, not strictly low performance) which I did not report here (https://github.com/IGCIT/Intel-GPU-Community-Issue-Tracker-IGCIT/issues/414) and which was nonetheless fixed (#113447).
@Viktor_smg I agree your issue is not related and should have been kept open. I will reopen it
@brecht @fclem I will change the priority of this issue to high, but IMO it is a road-blocker for EEVEE-Next enablement in 4.1.
At the time the issue was raised there were multiple performance bottlenecks in EEVEE. Most of them have been fixed, and as normally happens, bottlenecks shift. Last week I came to the conclusion that these kinds of devices do not perform well when using compute shaders. Although compute shaders were added to OpenGL to improve performance, it seems like iGPU vendors expected the workloads to be mainly done on the CPU (perhaps related to the performance we saw when using OpenCL on those devices).
Most time is currently spent in horizon scan and reflection tracing. Figures measured on an AMD iGPU: HorizonScan takes 27ms*2, Trace.Reflection 11ms, HorizonScan.denoise 10ms. These figures are per sample. Figures for Intel iGPUs still need to be collected. I also need to validate whether this is also the case for Metal (I expect it is), but due to issues in the Metal backend I haven't been able to validate it. (#116216, #116414, #116128)
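For a rough sense of scale, the per-sample figures above can be added up. This is only a back-of-the-envelope sketch: it assumes that "27ms*2" means the horizon scan pass runs twice per sample and that the passes run one after another, neither of which is confirmed in this thread. The function names are made up for illustration.

```python
# Per-sample GPU timings quoted above for an AMD iGPU, in milliseconds.
# Assumption: "27ms*2" means the horizon scan pass runs twice per sample.
PASS_TIMES_MS = {
    "HorizonScan": 27.0 * 2,
    "Trace.Reflection": 11.0,
    "HorizonScan.denoise": 10.0,
}


def per_sample_cost_ms(pass_times):
    """Sum the listed pass timings for one sample (assumes serial execution)."""
    return sum(pass_times.values())


def max_samples_per_second(total_ms):
    """Upper bound on samples per second if only these passes were running."""
    return 1000.0 / total_ms


total_ms = per_sample_cost_ms(PASS_TIMES_MS)
print(f"{total_ms:.0f} ms per sample, "
      f"at most {max_samples_per_second(total_ms):.1f} samples/s")
```

Under those assumptions the three passes alone would cost about 75 ms per sample, which caps the viewport well below interactive rates before any other work is counted.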
I do think this is a show blocker, as it is a common platform for people who are just being introduced to the world of CGI. Raising the bar to discrete GPUs would limit this to richer countries and families who have money to spare. EEVEE is very often the reason why people stick around and perhaps choose a career around CGI. Limiting the use of EEVEE solely to discrete GPUs will make the gap bigger. Many schools around the world are currently teaching Blender using similar devices.
I do think that we should look for a solution.
@fclem do you have a work-around in mind? I can reproduce the issue locally and could help with the implementation
I'd welcome options 2 and 3.
In option#2, if the old EEVEE is renamed "EEVEE-legacy", people will not be confused.
A few suggestions:
In the same vein, is it possible to find a combination of settings that allows EEVEE to work with low-end hardware?
Instead, why not combine all of them and provide them as a preset?
The idea is similar to how Blender offers "Fast" and "Best" options in some algorithms.
A knowledgeable user can always override those default settings to tweak the performance.
A newbie has no clue how to tweak various parameters based on his hardware capability. So he gets frustrated.
So an "auto-adjust" feature would make his life much easier!
@raindrops thanks for thinking along, but in this case there aren't any options that change the performance as it isn't related to user focused features, but the used technology.
@Jeroen-Bakker, thanks for the explanation.
To be clear, you are not saying these compute shaders are getting emulated on the CPU, right? Just that the hardware and/or drivers are not optimized for compute shaders?
I'm not familiar with the details of how compute shaders are implemented differently than fragment shaders. So please correct me where I'm wrong. I would expect that compute shaders run on the same cores at the same performance as fragment shaders, when doing equivalent operations.
Where I would expect performance issues is things like:
Is the impression that the equivalent code implemented as a fragment shader (if such a thing is possible) would run quickly, and that for whatever unknown reason there is a difference with compute? Or is it that these algorithms take too many clock cycles and memory accesses regardless, and the only way to get it fast would be some lower quality algorithm?
Are there any good profiling tools that can give us insight into these things? We have contacts at Intel and AMD as well, maybe they can give insights.
@Jeroen-Bakker Thanks for the clarification!
BTW, I just noticed this: At the beginning, this issue notes that EEVEE Next works with version: 4.1.0 Alpha, commit date: 2023-10-08 17:52.
Just to clarify, I have not found EEVEE Next working in any version of Blender.
Neither had I mentioned this in my original bug.
On the other hand, if a working version is found, that's great news!
I'd like to try it and report if it really works with my system.
Please let me have the download link of that version.
Thanks in advance!
@brecht
Yes, compute shaders are run on the iGPU, and the performance is related to the items in your list. A fragment shader can be used and would utilize more vendor-specific optimizations, but adds some overhead and limitations due to being a graphics pipeline, and the solution might require separate input/output buffers.
I would say atomics would be slow by design (limited cache, larger memory latency). Register spilling is something we take into account during implementation of the shaders. I can review these specific shaders and look into some utilizations. Different work group sizes (smaller, larger) don't change the performance (validated). Memory access patterns are vendor specific, but could be a cause.
Compute pipelines are synchronous; running the same shader multiple times would give an indication of whether this is actually the case.
At this time nothing points to driver bugs.
One thing we should also check: due to limited physical space the instruction set is often minimized, so perhaps int-based instructions require specific alignment.
I will today check in more detail and ask Intel/AMD for advice.
Results of today: the horizon scan diffuse shader requires 64/256 vector registers and 94/106 scalar registers in the inner loop.
OpenGL tooling is limited, so there are some assumptions in these figures. (I needed to use the Vulkan toolchain for this.)
I will continue with this after the holidays. Anyone is free to continue during my absence.
Thanks. Assuming you are talking about AMD RDNA1 architecture, it seems 64 vector registers (VGPRs) would give full occupancy? Using all 16 wave slots. It was not so clear to me what the scalar registers mean, if they also affect occupancy.
Thinking about it, I would expect register pressure and occupancy to be the same for integrated and discrete GPUs of the same architecture. If it's an issue on one it should also be on the other.
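The occupancy reasoning above can be sketched as a small calculation. This is a simplified model, not vendor data: the register budget of 1024 per SIMD and the limit of 16 wave slots are illustrative assumptions chosen to match the "64 VGPRs, 16 wave slots" figure quoted in this comment, and real hardware also allocates registers in granules and is limited by scalar registers and local memory. The function name is hypothetical.

```python
def waves_per_simd(vgprs_per_wave, vgpr_budget=1024, max_wave_slots=16):
    """Estimate how many waves fit on one SIMD when limited only by VGPRs.

    vgpr_budget and max_wave_slots are illustrative assumptions,
    not figures taken from vendor documentation.
    """
    if vgprs_per_wave <= 0:
        raise ValueError("vgprs_per_wave must be positive")
    return min(max_wave_slots, vgpr_budget // vgprs_per_wave)


for vgprs in (32, 64, 128, 256):
    print(f"{vgprs:3d} VGPRs per wave -> {waves_per_simd(vgprs):2d} waves")
```

In this model a shader using 64 vector registers still fills every wave slot, while heavier register use reduces the number of waves in flight and with it the ability to hide memory latency.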
This was an RDNA3 iGPU with 8 compute units. The units seem similar to discrete CUs and use scalar registers to reduce pressure on vector registers and reduce alignment issues. From what I found so far, this is also a shared resource inside a CU.
The numbers are high and we need to look at the code eventually. But before that I wanted to analyse on Intel, and I was still installing their toolset.
Local storage wasn't used in this specific case, so I assume wavefronts are reduced.
This would also be a bottleneck for discrete GPUs, but with more CUs the bottleneck would not be visible.
Jeroen Bakker referenced this issue 2024-03-07 10:22:32 +01:00
Over the last few months we have been improving the performance of EEVEE-Next. There are still 2 tasks open before, IMO, we can close this issue. Any further optimizations will happen as regular development.
The main difference in performance characteristics between EEVEE and EEVEE-Next is that the startup overhead of EEVEE-Next is bigger, but the rendering time per object is smaller. For low-end iGPUs there is always a visible performance penalty; however, changing the pixel size for a lower-resolution rendering lets users trade quality for performance.
@raindrops what frame rate do you currently get when displaying the default cube on a 1920x1080(-ish) display?
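To illustrate the pixel size trade-off mentioned above, here is a small sketch of how much work a larger pixel size removes. It assumes that a pixel size of N means each rendered pixel covers N x N screen pixels and that shading cost scales roughly with the number of rendered pixels; both are simplifications, and the function names are made up for illustration.

```python
def render_resolution(width, height, pixel_size):
    """Resolution actually rendered when one rendered pixel covers
    pixel_size x pixel_size screen pixels (simplified model)."""
    if pixel_size < 1:
        raise ValueError("pixel_size must be at least 1")
    return max(1, width // pixel_size), max(1, height // pixel_size)


def relative_pixel_count(pixel_size):
    """Fraction of pixels rendered compared to pixel size 1."""
    return 1.0 / (pixel_size * pixel_size)


for size in (1, 2, 4):
    w, h = render_resolution(1920, 1080, size)
    print(f"pixel size {size}x: {w}x{h}, "
          f"{relative_pixel_count(size):.2%} of the pixels")
```

On a 1920x1080 viewport, doubling the pixel size in this model leaves a quarter of the pixels to shade, which is why it is a useful knob on low-end iGPUs.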
Which version of Blender should I install to get the latest code?
Please guide me. (Preferably, please share the link of the download page).
Thanks in advance!
Also, where is the frame rate displayed in the Blender UI?
Currently I get 24 frames per second using:
Mesa Intel(R) UHD Graphics 620 (KBL GT2) Intel 4.6 (Core Profile) Mesa 23.2.1-1ubuntu3.1~22.04.2
This is regular main without the improvements mentioned above. For regular EEVEE I get 45 fps, which is around what I would expect the performance to be without the patches applied.
The latest daily build can be downloaded from https://builder.blender.org/download/daily/ (make sure you download 4.2.0 alpha). When starting Blender, switch to EEVEE-Next and press space. An fps counter will appear in the top left corner.
If you get similar results I believe we are on the right track! Thanks for testing
I installed 4.2.0 Alpha, build: 2024-03-08, 01:57:50
My system is as follows:
Both EEVEE and EEVEE Next give 25 +/- 1 fps
For both render engines, Blender remains very responsive in Render mode.
No lag at all!
Just to understand this better, is there a benchmark value of fps?
(Which fps value is "good"?)
FPS is platform dependent so what we normally do is that we compare it with a previous version of Blender to find out the differences. Or here compare it to EEVEE vs EEVEE-Next.
To show higher FPS in Blender, we normally change the Frame Rate setting.
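The comparison described above can be expressed as a small helper that turns two fps readings into frame times and a relative change. This is only a sketch; the function names are made up, and the sample values are the fps readings reported earlier in this thread.

```python
def frame_time_ms(fps):
    """Convert frames per second into milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def relative_change(baseline_fps, candidate_fps):
    """Relative fps difference of the candidate versus the baseline.

    Negative values mean the candidate is slower than the baseline.
    """
    return (candidate_fps - baseline_fps) / baseline_fps


# fps readings reported earlier in this thread (EEVEE, EEVEE-Next).
readings = {
    "UHD 620 on Linux, main without patches": (45.0, 24.0),
    "4.2.0 Alpha build 2024-03-08": (25.0, 25.0),
}

for label, (eevee_fps, next_fps) in readings.items():
    print(f"{label}: {frame_time_ms(eevee_fps):.1f} ms -> "
          f"{frame_time_ms(next_fps):.1f} ms per frame "
          f"({relative_change(eevee_fps, next_fps):+.0%})")
```

Comparing against a baseline on the same machine sidesteps the question of which absolute fps value is "good", since the answer depends on the platform.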
@Jeroen-Bakker, can this be closed now?
Ah yes, all issues have been resolved.
And performance is comparable with EEVEE-classic, sometimes even better.