EEVEE Next: Blender freezing when render mode is selected #114597

Closed
opened 2023-11-07 19:59:26 +01:00 by Narayan Aras · 26 comments

System Information

| Operating system | Graphics card | Owner | Report |
| -- | -- | -- | -- |
| Windows-10-10.0.22621-SP0 64 Bits | Intel UHD Graphics 620 Intel 4.5.0 - Build 25.20.100.6444 | @raindrops | #114597 |
| Windows-10-10.0.22621-SP0 | Intel Arc A770 16GB 4.6.0 - Build 31.0.101.4972 | @Viktor_smg | #115942 |
| Linux, Windows | AMD Integrated GPUs | @Jeroen-Bakker | |

CPU: i3-8130U, with 12 GB RAM
Laptop: Aspire A315-53
Operating system: Win 11 Home

Blender Version
Broken: version: 4.1.0 Alpha, branch: main, commit date: 2023-11-07 11:49, hash: 869372ffc335
Worked: version: 4.1.0 Alpha, commit date: 2023-10-08 17:52, hash: a3831fe7af80

Short description of error
If I select EEVEE Next and then select Rendered mode, even the default cube scene "almost" freezes: Blender responds to even the smallest input (e.g. panning, zooming) only after a few seconds.

Video, with throbber (to indicate the freeze) and sound included:

Exact steps for others to reproduce the error

  1. Launch Blender or open the attached eeveenexttest.blend file
  2. Select EEVEE (earlier named "EEVEE Next")
  3. Select Rendered mode.

That's it.

Narayan Aras added the
Status
Needs Triage
Priority
Normal
Type
Report
labels 2023-11-07 19:59:26 +01:00
Pratik Borhade changed title from Blender slows down to almost freezing when EEVEE Next (now renamed to "EEVEE") is selected and Render mode is selected to EEVEE Next: Blender freezing when render mode is selected 2023-11-08 05:41:10 +01:00
Member

Hi, thanks for the report. I'm able to confirm this occasionally but this may not be a bug. I suspect either shaders are compiling or render initialization caused it.
cc @Jeroen-Bakker

Author

IMHO there is no delay in (a) compiling shaders or (b) render initialization, because the status bar does not reflect that some task is in progress. Secondly, I waited for a few minutes after starting Blender to let all background processes finish, but that did not help.

I have restarted the laptop and even reinstalled Blender, but the issue is consistent.

Multiple versions of 4.1 Alpha have the same problem (I check twice a week).
So this is unlikely to be due to a compilation error.

Blender Bot added
Status
Archived
and removed
Status
Needs Info from Developers
labels 2023-11-08 07:21:33 +01:00
Blender Bot added
Status
Needs Triage
and removed
Status
Archived
labels 2023-11-08 07:22:09 +01:00
Author

I had closed the bug by mistake (clicked on the wrong button).
Reopened.

Pratik Borhade added
Status
Needs Info from Developers
and removed
Status
Needs Triage
labels 2023-11-08 07:25:21 +01:00
Member

I can reproduce on my system. EEVEE-Next is not optimized for these kinds of devices. We are looking into solutions to ensure it will be workable even on these low-end devices. I got around 1 fps when using EEVEE-Next. On Linux, Blender stalls.

Would need to do more research in order to find the real cause and potential knobs to turn. I expect that the iGPUs are not very fast in compute tasks.

Jeroen Bakker added
Status
Confirmed
and removed
Status
Needs Info from Developers
labels 2023-11-08 10:07:20 +01:00
Jeroen Bakker added this to the EEVEE & Viewport project 2023-11-08 10:07:29 +01:00
Author

Great!

So, if there are any debug versions to be run, please do let me know.
Also, I can provide data such as resources needed for a thread.

Thanks!


The original description of this issue stated that Blender "responded slowly" and took a few seconds. The way the author seems to have meant it, it "almost" freezes in the sense that it is very laggy. Jeroen wrote above that he gets 1 FPS and stalls on Linux.

In the issue I have, Blender responds immediately (as you can see in the video when sound plays), but renders at 0 FPS: it produces no new image whatsoever, and waiting does not change that. It also does not stall on Linux; the problem simply does not occur there, and everything works fine (at least on my Ubuntu).

Author

Actually, the report at the top is NOT my original version.
I did not post that video with a "throbber".
(Interesting idea! I'd love to know how to do it.)

In my own words, it would be too charitable to describe it as "laggy performance".
I'd describe it as "Blender comes to a grinding halt!"
Even for a minor input like panning or zooming, it freezes for 10-20 seconds.
I cannot even think of doing anything simple like adding loop cuts.

No one is going to stare at the screen for hours and wait for something to break loose!

Anyway, in any bug's life, the original author's description hardly matters.
The investigators' remarks are more important.

In this case, Jeroen has already spotted the root cause: EEVEE is not optimized for basic hardware+OS.
He has already written that developers are working on optimizing EEVEE for low-end hardware.


My laptop is basic, with Windows 11 and a basic GPU chipset (Intel 620 UHD).
(In my original report, I had mentioned this prominently, but that info has vanished.)

People with Linux and better GPUs would have a better experience.
But we should stay focused on the [Windows+low-end hardware] combination.


[OT] In the response trail, some people have mentioned FPS and throbbers.
How do we use these diagnostic tools?
It would be great if an online guide were available, so that we can use these tools effectively!


My issue was closed and bits of it inserted into yours. This is likely because on the surface they appear similar (both have "freezing"). They are completely different issues. You are adding more information about how they are different. I was hoping Germano would take a second look at this issue, especially since I can't seem to mention him.

My issue is not a performance issue. As I said, for me everything works fine on Linux, and on Windows Blender does NOT perform (particularly) poorly: it responds to input within the same frame of the video, as you can see from the mouse overlay and simultaneously hear. My issue also has specific conditions to trigger it: a specific scene and a specific viewport size, without resizing. There is no big 1 FPS lag when those conditions are not met, e.g. when making a small viewport and then resizing it to be bigger. Yours is a performance issue: you say Blender takes seconds to respond, and Jeroen explicitly notes low performance.

The A770 is not low-end hardware. I'm sure anyone would do a double take if they were putting a Vega 5 and RX 6700 XT in the same issue about having 1 FPS. Let's do a double take about putting the A770 next to the UHD 620.

A throbber is the looping animation that signifies loading or some such action, without indicating progress. It can be a spinning circle, or something that throbs a bit more literally, like three dots that shrink and grow in size. https://en.wikipedia.org/wiki/Throbber
I have included a throbber because the whole OS freezes for a few seconds, once. Not 1 FPS, not "it responds after 15-20 seconds" or anything like that. A video from the PC might not be the most accurate way to tell, but I'm not a liar at a murder trial. I say the OS freezes and it does.
This is because there was already an issue similar to this (OS freezing, not strictly low performance) which I did not report here (https://github.com/IGCIT/Intel-GPU-Community-Issue-Tracker-IGCIT/issues/414) and which was nonetheless fixed (#113447).

Member

@Viktor_smg I agree your issue is not related and should have been kept open. I will reopen it.

Member

@brecht @fclem I will change the priority of this issue to high, as IMO it is a road-blocker for EEVEE-Next enablement in 4.1.

At the time the issue was raised there were multiple performance bottlenecks in EEVEE. Most of them have been fixed, and, as normally happens, the bottlenecks shifted. Last week I came to the conclusion that these kinds of devices do not perform well when using compute shaders. Although compute shaders were added to OpenGL to improve performance, it seems iGPU vendors expected such workloads to be done mainly on the CPU (perhaps related to the performance we saw when using OpenCL on those devices).

Most time is currently spent in horizon scan and reflection tracing. Figures measured on an AMD iGPU: HorizonScan takes 2 × 27 ms, Trace.Reflection 11 ms, HorizonScan.Denoise 10 ms. These figures are per sample. Figures for Intel iGPUs still need to be collected. I also need to validate whether this is also the case for Metal (I expect it is), but due to issues in the Metal backend I haven't been able to. (#116216, #116414, #116128)
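For context, the pass timings quoted above already bound the achievable frame rate on that iGPU. A back-of-the-envelope sum (my arithmetic, not an additional measurement):

```python
# Sum of the per-sample pass timings quoted above (AMD iGPU figures).
# All other render passes are ignored, so this is an upper bound on fps.
pass_times_ms = {
    "HorizonScan (x2)": 2 * 27,   # 54 ms
    "Trace.Reflection": 11,
    "HorizonScan.Denoise": 10,
}
total_ms = sum(pass_times_ms.values())   # 75 ms per sample
fps_upper_bound = 1000 / total_ms        # ~13 fps from these passes alone
print(f"{total_ms} ms/sample, <= {fps_upper_bound:.1f} fps")
```

At roughly 75 ms per sample for these three passes alone, the viewport cannot exceed about 13 fps, which helps explain the very low frame rates reported earlier once the remaining passes and multiple samples are included.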

I do think this is a show-stopper, as this is a common platform for people who are just being introduced to the world of CGI. Raising the bar to a discrete GPU would limit this to richer countries and families with money to spare. EEVEE is very often the reason why people stick around and perhaps choose a career in CGI. Limiting EEVEE to discrete GPUs would make that gap bigger; many schools around the world are currently teaching Blender on similar devices.

I do think that we should look for a solution.

  1. Don't support these devices (I won't vote for this due to the reasons I mentioned. I think this is important for Blender)
  2. Keep EEVEE-Legacy around until we have a solution (will confuse users, so also not ideal)
  3. Find a work-around for these devices (preferable, but there is a timing issue if we are able to find and implement one).
  4. Postpone EEVEE-Next to give more time to find a work-around.
  5. ...

@fclem do you have a work-around in mind? I can reproduce the issue locally and could help with the implementation

Author

I'd welcome options 2 and 3.

In option #2, if the old EEVEE is renamed "EEVEE-Legacy", people will not be confused.

A few suggestions:

  1. There are a lot of YouTube videos where experts advise how to speed up Cycles and EEVEE by changing some settings.
    In the same vein, is it possible to find a combination of settings that allows EEVEE to work with low-end hardware?
  2. In such solutions, we have to open multiple tabs and change the values of several parameters.
    Instead, why not combine all of them and provide them as a preset?
    The idea is similar to how Blender offers "Fast" and "Best" options in some algorithms.
  3. An even better solution is to let Blender automatically optimize these settings based on system resources.
    A knowledgeable user can always override those default settings to tweak the performance.
    A newbie has no clue how to tweak various parameters based on his hardware capability. So he gets frustrated.
    So an "auto-adjust" feature would make his life much easier!
Blender Bot added
Status
Archived
and removed
Status
Confirmed
labels 2023-12-21 11:10:44 +01:00
Blender Bot added
Status
Needs Triage
and removed
Status
Archived
labels 2023-12-21 11:11:18 +01:00
Member

@raindrops thanks for thinking along, but in this case there aren't any options that change the performance, as it isn't related to user-facing features but to the underlying technology.

Jeroen Bakker added
Priority
High
Status
Confirmed
and removed
Priority
Normal
Status
Needs Triage
labels 2023-12-21 11:27:27 +01:00
Jeroen Bakker added this to the 4.1 milestone 2023-12-21 11:27:32 +01:00

@Jeroen-Bakker, thanks for the explanation.

> Although Compute shaders were added to OpenGL to improve performance, it seems like vendors of iGPU expected the workloads to be mainly done on the CPU. (perhaps related to the performance we saw when using OpenCL on those devices).

To be clear, you are not saying these compute shaders are getting emulated on the CPU, right? Just that the hardware and/or drivers are not optimized for compute shaders?

I'm not familiar with the details of how compute shaders are implemented differently than fragment shaders. So please correct me where I'm wrong. I would expect that compute shaders run on the same cores at the same performance as fragment shaders, when doing equivalent operations.

Where I would expect performance issues is things like:

  • Slow atomics
  • High register pressure causing spilling
  • Poor occupancy if e.g. workgroup sizes are not fitted properly
  • Memory access patterns where coalesced memory access fails
  • Switching between compute and graphics mode not being efficient
  • Driver bugs

Is the impression that the equivalent code implemented as a fragment shader (if such a thing is possible) would run quickly, and that for whatever unknown reason there is a difference with compute? Or is it that these algorithms take too many clock cycles and memory accesses regardless, and the only way to get it fast would be some lower quality algorithm?

Are there any good profiling tools that can give us insight into these things? We have contacts at Intel and AMD as well, maybe they can give insights.

Author

@Jeroen-Bakker Thanks for the clarification!

BTW, I just noticed this: At the beginning, this issue notes that EEVEE Next works with version: 4.1.0 Alpha, commit date: 2023-10-08 17:52.

Just to clarify, I have not found EEVEE Next working in any version of Blender.
Nor had I mentioned this in my original bug.

On the other hand, if a working version is found, that's great news!
I'd like to try it and report if it really works with my system.
Please let me have the download link of that version.

Thanks in advance!

Member

@brecht

Yes, compute shaders run on the iGPU, and the performance is related to the items on your list. A fragment shader could be used and would benefit from more vendor-specific optimizations, but it adds some overhead and limitations due to being a graphics pipeline, and the solution might require separate input/output buffers.

I would say atomics are slow by design (limited cache, larger memory latency). Register spilling is something we take into account when implementing the shaders; I can review these specific shaders and look into their register utilization. Different work group sizes (smaller, larger) don't change the performance (validated). Memory access patterns are vendor-specific, but could be a cause.

Compute pipelines are synchronous; running the same shader multiple times would give an indication of whether this is actually the case.
At this time nothing points to driver bugs.

One thing we should also check: because of limited die area the instruction set is often minimized, so perhaps int-based instructions require specific alignment.

I will check in more detail today and ask Intel/AMD for advice.

Member

Results of today: the horizon scan diffuse shader requires 64/256 vector registers and 94/106 scalar registers in the inner loop.

OpenGL tooling is limited, so there are some assumptions in these figures. (I needed to use the Vulkan toolchain for this.)

I will continue with this after the holidays. Anyone is free to continue during my absence.


Thanks. Assuming you are talking about the AMD RDNA1 architecture, it seems 64 vector registers (VGPRs) would give full occupancy, using all 16 wave slots? It was not so clear to me what the scalar registers mean, and whether they also affect occupancy.

Thinking about it, I would expect register pressure and occupancy to be the same for integrated and discrete GPUs of the same architecture. If it's an issue on one it should also be on the other.

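The occupancy arithmetic being discussed can be sketched as follows. This assumes 1024 allocatable VGPRs per SIMD and a 16-wave-slot limit, consistent with the figures mentioned above, but both numbers vary by chip generation and wave size, and real drivers allocate registers in granules:

```python
# Hedged estimate of resident waves per SIMD from per-wave VGPR use.
# Ignores allocation granularity and other limiters (LDS, workgroup size).
def waves_per_simd(vgprs_per_wave, vgpr_file=1024, max_waves=16):
    return min(max_waves, vgpr_file // max(vgprs_per_wave, 1))

print(waves_per_simd(64))   # 16: full occupancy, matching the question above
print(waves_per_simd(256))  # 4: heavily register-limited
```

Under these assumptions a 64-VGPR shader does fill all 16 wave slots, while the 256-VGPR ceiling would cut occupancy to a quarter of that.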
Member

This was an RDNA3 iGPU with 8 compute units. The units seem similar to discrete CUs, and use scalar registers to reduce pressure on the vector registers and to reduce alignment issues. From what I found so far, these are also a shared resource inside a CU.

The numbers are high and we eventually need to look into the code. But first I wanted to analyse on Intel; I was still installing their toolset.

Local storage wasn't used in this specific case, so I assume the number of wavefronts is reduced.

This would also be a bottleneck for discrete GPUs, but with more CUs the bottleneck would not become visible.

Hans Goudey added
Type
Bug
and removed
Type
Report
labels 2024-01-15 14:07:30 +01:00
Brecht Van Lommel modified the milestone from 4.1 to 4.2 LTS 2024-01-24 20:22:55 +01:00
Member

Over the last few months we have been improving the performance of EEVEE-Next. There are still 2 open tasks before, IMO, we can close this issue. Any further optimizations will happen as regular development.

  • #118924 - Horizon scan optimization that is currently being developed.
  • #118903 - Allowing users to change the pixel size which is in review.

The main performance characteristic difference between EEVEE and EEVEE-Next is that the startup overhead of EEVEE-Next is bigger, but the rendering time per object is smaller. For low-end iGPUs there is always a visible performance penalty; however, increasing the pixel size for lower-resolution rendering lets users trade quality for performance.
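The pixel size trade-off is easy to quantify: render cost scales roughly with the number of shaded pixels. Illustrative arithmetic only, not a measurement:

```python
# Rough cost model: shading work is proportional to the number of shaded
# pixels, which shrinks quadratically with the pixel size setting.
def shaded_pixels(width, height, pixel_size):
    return (width // pixel_size) * (height // pixel_size)

full = shaded_pixels(1920, 1080, 1)  # 2,073,600 pixels
half = shaded_pixels(1920, 1080, 2)  # 518,400 pixels
print(full // half)                  # 4x fewer pixels to shade
```

So a pixel size of 2 gives roughly a 4x reduction in shading work on a 1080p viewport, at the cost of a visibly softer image.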

@raindrops what frame rate do you currently get when displaying the default cube on a 1920x1080(-ish) display?

Author

Which version of Blender should I install to get the latest code?
Please guide me. (Preferably, please share the link of the download page).

Thanks in advance!

Also, where is the frame rate displayed in the Blender UI?

Member

Currently I get 24 frames per second using: `Mesa Intel(R) UHD Graphics 620 (KBL GT2) Intel 4.6 (Core Profile) Mesa 23.2.1-1ubuntu3.1~22.04.2`. This is regular main without the improvements mentioned above. For regular EEVEE I get 45 fps, which is around what I would expect the performance to be without the patches applied.

The latest daily build can be downloaded from https://builder.blender.org/download/daily/; make sure you download 4.2.0 alpha. When starting Blender, switch to EEVEE-Next and press space. An fps counter will appear in the top left corner.

If you get similar results, I believe we are on the right track! Thanks for testing.
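As a side note, comparing the two engines by frame time rather than FPS makes the per-frame cost difference additive and easier to reason about. Using the figures reported above for the UHD 620 (EEVEE 45 fps, EEVEE-Next 24 fps on main):

```python
# Convert FPS to per-frame cost in milliseconds, using the figures
# reported in this thread for a UHD 620 (EEVEE 45 fps, EEVEE-Next 24 fps).

def frame_time_ms(fps):
    """Milliseconds spent on one frame at the given frame rate."""
    return 1000.0 / fps

eevee = frame_time_ms(45)       # ~22.2 ms per frame
eevee_next = frame_time_ms(24)  # ~41.7 ms per frame
print(round(eevee_next - eevee, 1))  # -> 19.4 (extra ms per frame)
```

So on this GPU, EEVEE-Next spends roughly 19 ms more per frame than classic EEVEE, which is the gap the two open optimization tasks aim to close.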

Author

I installed 4.2.0 Alpha, build: 2024-03-08, 01:57:50

My system is as follows:

  • Win 11 Home running on 256 GB SSD
  • Acer Aspire 3 A315-53
  • CPU: i3 8230U
  • Intel UHD Graphics 620
  • RAM: 12 GB DDR4

Both EEVEE and EEVEE Next give 25 +/- 1 fps

For both render engines, Blender remains very responsive in Render mode.
No lag at all!

Author

Just to understand this better, is there a benchmark value for fps?
(Which fps value is "good"?)

Member

FPS is platform-dependent, so what we normally do is compare it with a previous version of Blender to find the differences, or, as here, compare EEVEE vs EEVEE-Next.

To show higher FPS in Blender, we normally change the Frame Rate setting.
image


@Jeroen-Bakker, can this be closed now?

Member

Ah yes, all issues have been resolved.
And performance is comparable with EEVEE-classic, sometimes even better.

Blender Bot added
Status
Archived
and removed
Status
Confirmed
labels 2024-03-21 19:10:13 +01:00
Reference: blender/blender#114597