Reliable GPU hang on common Intel cards (Linux) #80458
Reference: blender/blender#80458
System Information
Operating system: Linux Fedora 32
Graphics card: UHD Graphics 620 (rev 07) Intel
Blender Version
Broken: 2.80.x, 2.90.0-x
Short description of error
Blender freezes the whole computer; the dmesg command shows a GPU hang error.
Exact steps for others to reproduce the error
Only with Intel GPU (Intel Corporation UHD Graphics 620 (rev 07) here)
Open the given blend file F8892221 (Thanks @HDMaster84 ) and activate lookdev or render view - the graphic server hangs.
I followed the dmesg output that says to create an issue on freedesktop gitlab: https://gitlab.freedesktop.org/drm/intel/-/issues/2422 (you can see my dmesg and gpu error) - anyway, this problem only happens with Blender.
The problem started with 2.80 if I'm not wrong; it did not happen with earlier versions.
Added subscriber: @Metal3d
#87229 was marked as duplicate of this issue
#76529 was marked as duplicate of this issue
#79330 was marked as duplicate of this issue
#77466 was marked as duplicate of this issue
#77923 was marked as duplicate of this issue
Added subscriber: @HDMaster84
Changed status from 'Needs Triage' to: 'Needs User Info'
Can you test if this file causes a hang for you? (just activate lookdev or rendered view)
gpuhangtest.blend
For me it reliably hangs my laptop. More info:
System Information
Operating system: Linux-5.8.0-1-amd64-x86_64-with-debian-bullseye-sid 64 Bits
Graphics card: Mesa Intel(R) UHD Graphics 620 (WHL GT2) Intel 4.6 (Core Profile) Mesa 20.1.7
Blender Version
Broken: version: 2.91.0 Alpha, branch: master, commit date: 2020-09-17 11:33, hash: 393b5a231f
Hi, sorry for the delay
Yes, it crashes my laptop too. I haven't tried with my own build yet; I have only tried the official build so far.
And it also crashes with my own build (Blender 2.90 and Blender 2.91-alpha), so I will edit the report.
Thanks for the example blend file!
Changed title from 'Official binary on Fedora 32 makes GPU hang' to 'Fedora 32 makes GPU hang'
Changed status from 'Needs User Info' to: 'Confirmed'
Ok so I can confirm that this is a valid issue not just with my system. It's also not special to the common UHD Graphics 620, because I had other Intel cards hanging as well in the same way (I will provide Information about them when I get a chance to do it). No problems with Nvidia though.
What I don't know yet is whether it is related to EEVEE or whether it also happens in normal viewport shading mode. I have a feeling it also happens with only viewport shading, but more rarely and less controllably.
EDIT: It totally does crash with viewport shading if you just enter xray mode in the test file I provided. But you need to increase the array count to about 10000 first.
btw did you mean to write 2.83.5 instead of 2.85?
Also a better title would now be something like "Reliable GPU hang on common Intel cards (Linux)"
Added subscriber: @lichtwerk
Got no issues with Material Preview nor XRay
In Rendered Viewport Shading mode, I can confirm a crash (at least in a Debug build), even without the Array modifier on the plane:
Changed title from 'Fedora 32 makes GPU hang' to 'GPU freeze/crash [Too many arguments to macro node_bsdf_principled_dielectric]'
Ah, that shader compilation error is because of b248ec9776 [so that doesn't really have to do with the hang/freeze I guess, because b248ec9776 is a very recent commit from yesterday].
So my assumption is that it has to do with the Intel cards.
Changed title from 'GPU freeze/crash [Too many arguments to macro node_bsdf_principled_dielectric]' to 'GPU freeze/crash'
Added subscriber: @Jeroen-Bakker
This issue should be fixed by 940ef1a4e8
I tried the file on one more system and it did also hang in exactly the same way (hang is the correct term).
It also had KDE just like my first system (may or may not be related?).
System Information
Operating system: Linux-5.7.0-3-amd64-x86_64-with-glibc2.29 64 Bits
Graphics card: Mesa DRI Intel(R) HD Graphics 3000 (SNB GT2) Intel Open Source Technology Center 3.3 (Core Profile) Mesa 20.1.7
Blender Version
Broken: version: 2.83.5 (official debian version)
To be clear the issue from that comment was fixed, not the issue of this bug report.
Yes, you're right, the mentioned commit does not fix this issue.
The issue seems to be Linux specific. I tested it on Windows on following system and it didn't hang the system (although it froze blender in some instances and I had to stop it from the Task Manager, which is fine though)
System Information
Operating system: Windows-10-10.0.17763-SP0 64 Bits
Graphics card: Intel(R) UHD Graphics 620 Intel 4.5.0 - Build 26.20.100.6999
Blender Version
Broken: version: 2.90.0, branch: master, commit date: 2020-08-31 11:26, hash: 0330d1af29
I have got crashes and glitches on viewport when I do some animations...
Note that I haven't had the previous message in the dmesg output for several days (maybe an update fixed something), which you can see here https://gitlab.freedesktop.org/drm/intel/-/issues/2422 - now I get a kernel error (I can see it by going to a TTY to check).
Note: to avoid rebooting, on a TTY, run "sudo init 3", then return to the TTY and run "sudo init 5" to restart the DM... unsaved work is lost, but it's faster to get back to my desktop manager...
Checking the https://gitlab.freedesktop.org/drm/intel/-/issues/2380#note_611887 comment, it seems that someone found a shader line that is the starting point of the crash.
It seems (I haven't checked yet) that the commit https://developer.blender.org/rBd712f1f83af881be536ec0d183b7d3025c172684 is where it starts to fail.
So https://developer.blender.org/rBd0ff3434cffa2e056e4f191ead21226f32ea8c15 is the last "good" commit that works.
First of all, thanks for the "sudo init" trick. I also get a different message in dmesg; it might be the update to kernel version 5.8.0. The problem is unchanged though.
I am currently checking the good commit.
It's kind of hard to check that commit since it is so old and all the libraries have been updated, so it won't compile. A revert seems like a lot of work. I will move on to other things for now.
Also I wanted to mention #77466 here. This is not strictly a duplicate since this is about viewport and eevee while that one is about cycles, but the problem seems really similar.
I tested 2.80 though and it still had the same issue with this particular file, so the file might lead to a slightly different problem than the regular random crashes.
I also have kernel version 5.8.8-200.fc32.x86_64. I haven't tried an older Blender version for a while, so you might be right; maybe I'm wrong in saying that it works with them. I will download version 2.80 and check again.
It can also be a problem with Mesa or the Intel drivers... as I mentioned in my bug report to freedesktop, it can be an issue in something outside Blender, even if Blender is the only program that crashes the system (for example, Blender may be using something that is not used by other software).
See https://gitlab.freedesktop.org/drm/intel/-/issues/2380 where a lot of comments seem to describe the same problem.
OK... tried with 2.80rc3... same problem. I see the texture rendering and then the GPU hangs. It's a bit better (2.90 crashes without any texture preview), but it still crashes.
So, that should mean that intel drivers or mesa has a problem... Or... Blender is using something that only crashes on that GPU for any reason.
Also reported here: https://gitlab.freedesktop.org/drm/intel/-/issues/2192
So we're not alone... And unfortunately it seems that we will not be able to use Blender for a while on our laptops...
#80073 seems to report the same problem
Changed title from 'GPU freeze/crash' to 'Reliable GPU hang on common Intel cards (Linux)'
OK, I tried something to be sure.
I've installed the Pantheon desktop with
sudo dnf install "pantheon desktop"
and tried to open the F8892221 .blend file. It's very slow, but no crash. I can switch to another file, use my terminal, and no GPU hang.
So, I suspect something like an OOM on the GPU while Gnome is using the GPU.
I continue to test, maybe I'm wrong but it seems that I can use Blender using that Desktop Manager. I'm typing this message with F8892221 file opened, with material preview in progress - 10 minutes, no hang.
Forget my last comment: it hangs with Pantheon too, much later than with Gnome, but it hangs.
Added subscriber: @AnsonX10
Are you able to restore functionality of the computer by switching to tty3 or something and restarting your display manager? It may be the same problem I'm having.
I have a HD 630 and it freezes my display manager seemingly randomly when navigating in a 3d viewport. The GPUhangtest.blend file doesn't immediately crash my display manager, but I guarantee it eventually would (it happens on all files I've tried).
It seems to happen much sooner when using X11 and takes much longer to happen when using Wayland (tested on Plasma+SDDM and Gnome+GDM) although it's likely not a time-based difference as much as a "chances of occurring" issue.
Interestingly, I've never had it happen when using an Nvidia GPU (using Nvidia drivers and X11)
Here are my specs:
Linux-5.8.10-arch1-1-x86_64-with-arch 64 Bits
Mesa Intel(R) HD Graphics 630 (KBL GT2) Intel 4.6 (Core Profile) Mesa 20.1.8
Blender 2.91.0 Alpha, branch: master, commit date: 2020-09-22 18:36, hash: 358a8e00bd
@AnsonX10 yes, as I said earlier, that's the case. I go to tty3 and do "init 3", then back one more time to tty3 and "init 5". This resets card's memory, so I can go back to DM (gdm here) but, of course, I lose my non saved work...
It happens less often with the Pantheon desktop (I switched to it from Gnome several days ago; it uses Wayland, but it seems to be less heavy on GPU memory). So I can (with crossed fingers, several lucky charms and incantations) use 2.90.1 for more than 30 minutes (for now). But I just downloaded Blender 2.90.1 and haven't done a lot of tests (I haven't opened the example blend file).
It seems that it is Mesa or the Intel driver that is faulty. But, just in case, I'm leaving this issue open in case a developer finds something.
I will of course report back about Pantheon + Blender 2.91.1 if something goes wrong again.
Thanks for your feedback.
@Metal3d Oh ok. I missed that message, sorry. (I restart it differently)
At least in my case (SDDM), you actually still have full control of your computer. You just can't see it. So you can control-S to save your work, and even close all your programs properly before restarting your desktop manager. If you can still see your mouse cursor, you're able to click on things too, if needed.
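The runlevel trick discussed above ("sudo init 3", then "sudo init 5") can be wrapped in a small helper. This is only a sketch: the function name restart_dm and the DRY_RUN variable are made up for illustration, and it assumes the init command is available as on the Fedora system from the report. By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch: bounce the display manager from a text TTY after a GPU hang.
# restart_dm and DRY_RUN are hypothetical names, not part of any tool.
# DRY_RUN defaults to 1, so the commands are printed instead of executed.
restart_dm() {
    for level in 3 5; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sudo init $level"
        else
            # Runlevel 3 stops the graphical session, runlevel 5 starts it again.
            sudo init "$level" || return 1
        fi
    done
}
```

Calling restart_dm with DRY_RUN=0 on a TTY would actually switch runlevels; as noted in the thread, anything not saved beforehand is lost.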
I can confirm this same issue occurs in other programs too. So it's (probably) not a Blender-specific issue per se, but we shouldn't necessarily avoid looking into mitigations.
@AnsonX10 can you be a little bit more specific? I didn't have issues in other programs yet, but I don't use many 3D apps.
@HDMaster84 Well just now I was using FreeCAD on the same Arch system and it crashed my display manager in an identical manner. This led me to believe it could be a Mesa driver problem for Intel graphics. To be honest, I don't know much about how drivers work, so I could be placing the blame in the wrong place. It's even still possible that the causes are different.
Added subscribers: @Tooniis, @iss, @Newtron, @Heathcliff, @mano-wii, @AcidRain0, @ankitm
Added subscribers: @mattffly, @Yquux, @kursadk, @Cowhead, @fho, @TimothyDrewFilms, @CepheidVariable, @Tuxgirl, @luc2, @rafaeldomi, @hussam, @dniku, @jhidding
Added subscribers: @Lendo, @Ravidk, @aialt, @dfelinto, @Mainframed, @HooglyBoogly
Added subscribers: @BlastedPupil, @MasterF1, @Euan, @GAPsWorld, @Alaska
Removed subscriber: @mattffly
I have my passthrough GPU working on my hybrid laptop (Intel i7 CPU with the 620 GPU, plus an AMD R7 GPU). Although the GPU does not show up as selectable in the OpenCL system properties, it works when I am using EEVEE, Cycles GPU and LuxCore GPU on any Blender from v2.80 to v2.92. What worked was the following:
Fedora 32 with AMD Mesa drivers (volcanic islands)
Edit /etc/default/grub to have the following line:
GRUB_CMDLINE_LINUX="rhgb quiet splash amdgpu.dc=0 radeon.cik_support=0 amdgpu.cik_support=1"
The (.cik) is used above instead of (.cis) because mine is part of the volcanic islands.
Make sure the permissions on the folder give both root and your_user_profile read+write access.
Then I tend to log in on a command console and run "DRI_PRIME=1 /home/User_profile/blender-2.92.0/blender".
this works for me
Removed subscriber: @Alaska
Added subscriber: @dp-1
If you want only Blender to hang and not Xorg completely, turn off acceleration in the Xorg config:
/usr/share/X11/xorg.conf.d/20-intel.conf
This could break other programs (for me Google Earth doesn't show the map entirely)
Added subscriber: @Tha_Hobbist
Added subscriber: @verblendet
me too:
gpuhangtest.blend hangs my laptop; Blender frequently crashes after minutes of working (shaders = high risk, no crash while rendering yet)
Linux 5.10.0-4-amd64 #1 SMP Debian 5.10.19-1 (2021-03-02) x86_64 GNU/Linux Gnome 3.38.4 X11
Mesa Intel® HD Graphics 530 (SKL GT2)
NVIDIA Corporation GM107GLM [Quadro M2000M] (rev a2)
Bumblebee 3.2.1
No other software has provoked a graphics hang yet
definitely try from the terminal, either as user or with sudo -i
DRI_PRIME=1 /home/User_profile/blender-2.92.0/blender
or
DRI_PRIME=0 /home/User_profile/blender-2.92.0/blender
owning an optimus laptop I use as workaround primusrun blender with no more hangs!! after 30hours of heavy testing. gpuhangtest no problem either :-)
Removed subscriber: @fho
Added subscriber: @melonmousegames
Removed subscriber: @Ravidk
Added subscriber: @vignette
In my past experience, I have gotten a lot of GPU hangs in dmesg, freezing the screen and forcing me to either restart or kill the desktop environment, both causing me to lose my unsaved work. After a bit of experimenting it seems (or at least in my case) Blender is unstable on the iris GPU driver. I have been running Blender with the i965 driver forced and it hasn't crashed my entire GPU even once in the past 4 months of using this workaround.
Running Blender with MESA_LOADER_DRIVER_OVERRIDE=i965 lets me run gpuhangtest.blend with no issues. This can be a bit worrisome, since if I recall correctly, there was a proposal to drop the i965 driver from Mesa's mainline.
Just for the record, I'm dropping system-info reports for both the iris and i965 drivers:
system-info-iris.txt
system-info-i965.txt
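For anyone wanting to try the driver override described above without exporting the variable globally, a small launcher function might look like the following. It is a sketch under assumptions: blender_i965 and BLENDER_BIN are hypothetical names introduced here, the installed Mesa must still ship the i965 driver, and blender must be on PATH unless BLENDER_BIN points elsewhere.

```shell
#!/bin/sh
# Sketch: start Blender with Mesa forced onto the i965 driver.
# blender_i965 and BLENDER_BIN are hypothetical names for illustration.
# MESA_LOADER_DRIVER_OVERRIDE is set only for this one process,
# so other applications keep using the default driver.
blender_i965() {
    MESA_LOADER_DRIVER_OVERRIDE=i965 "${BLENDER_BIN:-blender}" "$@"
}
```

Usage could be as simple as "blender_i965 gpuhangtest.blend"; keeping the override per-process avoids changing the driver for the whole desktop session.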
Added subscriber: @TooMuchFun
Just a heads up. I have a laptop with intel UHD 630 and a dedicated Nvidia GeForce 1650 GPU on Windows 10, V. 10.01.19041.844 and I am having a similar issue.
Oh, and this has been happening with versions 2.90.0, 2.92.0 and including the latest nightly build: 2.93.0 (April/09/2021)
I try to open a file through the open file dialog and the application immediately hangs. This is very annoying. It will recover a few minutes later but if I attempt to do anything else it will crash.
Does anyone know of a fix??
@TooMuchFun You are describing something completely different. Please open a new bug report with all the required information (you need to specify a working version for this kind of report).
Thank you for the direction, Henrik ;D
Removed subscriber: @Mainframed
Added subscriber: @ChristopherWebber
I am running:
and Blender version 2.92.0.
I can confirm that the test file crashes blender, and indeed my entire desktop, over here.
I've also had weird desktop-destroying issues while doing editing, etc. I haven't figured out if this is separate from issues in the above file.
I initially thought that the error was because of this bug, but it appears that the fix is indeed in 2.92.0 so I no longer think that's the source of the issue. (In fact, the entire file relevant to that patch seems to have been removed and replaced with some new infrastructure, so I dunno.)
Note that others in Guix-land have reported similar issues... I think people running intel cards over there is probably more common (due to them being some of the few cards with reasonably working FOSS drivers).
Added subscriber: @saturniidae
with i915 driver on Blender 2.93.0. I can replicate consistently with the test file, but I first encountered the problem on a significantly simpler model. I wasn't even sure if it was Blender at first until I checked the journal, since there was pretty much no lag until everything suddenly froze up.
From the looks of it, it's hanging, tries to recover, but then hangs again instantly, as I tend to get a few repetitions of these lines.
I've attached my best recreation of the file I initially froze on, though I haven't been able to recreate the freeze with it. The only real difference in this one should be the positioning of the corners, as I was adjusting them on the X and Y axes and don't know the exact positions they were in, though this is fairly close. screwhole_test_freeze.blend
Added subscriber: @ekaitz
I just replicated the issue with the test file of this post, right after it happened with the model I was working on. My error messages look a little bit different, but pretty similar:
Added subscriber: @scaled
This command fixes the GPU hang (run as root), but I don't know what the drawbacks of changing this variable are. My GPU is HD Graphics 520.
Added subscriber: @mwoehlke
FYI, I think I'm experiencing the same bug with blender-3.2.1-2.fc36.x86_64. Happens frequently with Blender, can't say I've ever seen it happen with anything else (although Blender is probably by far the hardest my poor Intel Corporation UHD Graphics 620 gets hit, followed by FreeCAD).
I'm pretty sure that they are all emitting this "[drm] GPU HANG" error message. I've had this bug since forever, not only limited to Blender. Likely a Mesa issue, since it never happens when I use Blender on Windows or my other graphics-heavy programs (like FreeCAD, OurPaint, Plasticity etc).
@xavierh I remember talking with you (?) about this a while ago in blender.chat, is there any way I could help debug this further? I'd really like to see this solved.
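To check whether a given freeze really produced the "[drm] GPU HANG" message mentioned in this thread, the kernel log can be filtered for that string. The helper below is a rough sketch; count_gpu_hangs is a made-up name, and it only assumes that the text "GPU HANG" appears somewhere in the relevant log lines.

```shell
#!/bin/sh
# Sketch: count kernel log lines that mention a GPU hang.
# count_gpu_hangs is a hypothetical helper; it reads log text on stdin.
count_gpu_hangs() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when there are none, so "|| true" keeps the function successful.
    grep -c 'GPU HANG' || true
}
```

After recovering the session, something like "dmesg | count_gpu_hangs" or "journalctl -k -b | count_gpu_hangs" would show how many hang messages the current boot has logged (reading the kernel log may need root on some systems).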
@ChengduLittleA :
in case the hang is due to work taking too long, extend the timeouts:
Then for further debugging:
Thanks! @xavierh, in the case of "# after reproducing the hang:", my system typically goes into a complete freeze that requires a forced shutdown. I'll check if it also generates a log this way.
Can you still SSH into your system from another one to debug, or is even the rest of the OS, beyond the graphics, failing?
@xavierh at that time it's not responding to anything, maybe ssh is still running, I'll try tomorrow. I recall audio will also get stuck so it's likely that the entire machine just crashes. (Not sure if that's the case with other UHD620 devices, mine is a surface pro 6)
Hi, I'm facing this odd bug too. I was able to capture error logs as requested by @xavierh above #80458 (comment) . File attached.
I don't know if this is relevant, but whenever I use my laptop running Linux, I can use the Nvidia RTX 4050 GPU once to render, and then all subsequent renders fail with the error
" Illegal address in CUDA queue copy_from_device (integrator_init_from_bake integrator_shade_surface_mnee integrator_sorted_paths_array)
Error: Illegal address in CUDA queue copy_from_device (integrator_init_from_bake integrator_shade_surface_mnee integrator_sorted_paths_array)"
or
"Failed to create CUDA context (Illegal address)"
thereafter.
If I switch back to CPU, do a couple of renders and exit Blender, then on the next startup I can sometimes render on the GPU once before hitting the same wall!
This issue is different. Please create a separate bug report for it.
@xavierh hi! Did you have the chance to look at my debug logs? #80458 (comment)
thanks for collecting it.
It sounds driver/HW related and specific to Linux, debugging it further is a bit out of my expertise, can you open a bug with these logs attached, following https://drm.pages.freedesktop.org/intel-docs/how-to-file-i915-bugs.html ?
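When preparing such a report, it can help to gather the kernel log and the GPU error state into one directory right after a hang. The following is only a sketch and not taken from the linked guide: collect_gpu_logs and I915_ERROR_STATE are hypothetical names, and the default path /sys/class/drm/card0/error is an assumption (the card index may differ, and reading it usually needs root).

```shell
#!/bin/sh
# Sketch: save the kernel log and the GPU error state for a bug report.
# collect_gpu_logs and I915_ERROR_STATE are hypothetical names.
# The default error-state path is an assumption; adjust the card index.
collect_gpu_logs() {
    out_dir=$1
    error_state=${I915_ERROR_STATE:-/sys/class/drm/card0/error}
    mkdir -p "$out_dir" || return 1
    # Kernel messages for the current boot; may be empty without root.
    dmesg > "$out_dir/dmesg.txt" 2>/dev/null || true
    # Copy the error state only if it exists and is readable.
    if [ -r "$error_state" ]; then
        cat "$error_state" > "$out_dir/gpu_error_state.txt"
    fi
}
```

Running "collect_gpu_logs ~/gpu-hang-report" as root after reproducing the hang would leave the files ready to attach; if the machine freezes completely, the same could be tried over SSH from another system.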