oneAPI Embree HWRT crash when using Ashikhmin-Shirley Glossy BSDF node #107356

Closed
opened 2023-04-26 02:14:28 +02:00 by sentharn · 4 comments

System Information
Operating system: Windows-10-10.0.22621-SP0 64 Bits (actually Win11)
Graphics card: Intel(R) Arc(TM) A770 Graphics Intel 4.5.0 - Build 31.0.101.4311

Blender Version
Broken: version: 3.6.0 Alpha, branch: main, commit date: 2023-04-25 19:59, hash: ae57d86d42c7
Worked: (newest version of Blender that worked as expected)

Short description of error
Blender hard-crashes with an unhandled exception from oneAPI with the following file (this asset is from Blenderkit), only if Render denoising with OpenImageDenoise is enabled and Embree GPU support is enabled.

When I was using this in a larger scene, I would sometimes see one sample before the crash; other times it would crash immediately after trying to render Sample 0. The last message I saw with this reduced test case was "Loading Denoising Kernels".

Other types of Glossy BSDF (GGX, etc.) work fine. I couldn't get it to happen when I added a Glossy node outside of the group it is in.

I need someone with an A770 (preferably 16GB edition) to see if they can repro this on a 3.6 alpha build. I don't have another Arc GPU to know if it's my card/environment or not. @xavierh attempted to repro on an A750 and could not cause the crash.

So far I've managed to cause this with Alpha builds from the build bot, an Alpha build with AOT kernels, win11 builds with AOT kernels, alpha builds with JIT kernels, and a build from another machine (win10) with AOT kernels.

Exact steps for others to reproduce the error
Open blend file
Make sure you have Render denoising with OpenImageDenoise enabled
Make sure you have Embree GPU support under the oneAPI tab in System Preferences enabled.
Render.

sentharn added the Priority: Normal, Type: Report, Status: Needs Triage labels 2023-04-26 02:14:29 +02:00
Author

Note that, as per #106266, you may see page faults/access violations when using a debugger. Look closely, as you can "Continue" past many of them. The above exception is unrecoverable.

Occasionally you might even get the display driver to crash and recover (or freeze). I am using driver version 4311.

The thread 0x7ba8 has exited with code 0 (0x0).
Exception thrown at 0x00007FFD06B4FDEC in blender.exe: Microsoft C++ exception: sycl::_V1::runtime_error at memory location 0x00000029099FE3C0.
Exception thrown at 0x00007FFD06B4FDEC in blender.exe: Microsoft C++ exception: sycl::_V1::runtime_error at memory location 0x00000029099FE4F0.
Exception thrown at 0x00007FFD06B4FDEC in blender.exe: Microsoft C++ exception: sycl::_V1::runtime_error at memory location 0x00000029099FEEC0.
Unhandled exception at 0x00007FFD070BF61E (ucrtbase.dll) in blender.exe: Fatal program exit requested.
 	ucrtbase.dll!abort()	Unknown
 	ucrtbase.dll!terminate()	Unknown
>	vcruntime140_1.dll!FindHandler<__FrameHandler4>(EHExceptionRecord * pExcept, unsigned __int64 * pRN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC, FH4::FuncInfo4 * pFuncInfo, unsigned char recursive, int CatchDepth, unsigned __int64 * pMarkerRN) Line 735	C++
 	vcruntime140_1.dll!__InternalCxxFrameHandler<__FrameHandler4>(EHExceptionRecord * pExcept, unsigned __int64 * pRN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC, FH4::FuncInfo4 * pFuncInfo, int CatchDepth, unsigned __int64 * pMarkerRN, unsigned char recursive) Line 399	C++
 	vcruntime140_1.dll!__InternalCxxFrameHandlerWrapper<__FrameHandler4>(EHExceptionRecord * pExcept, unsigned __int64 * pRN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC, FH4::FuncInfo4 * pFuncInfo, int CatchDepth, unsigned __int64 * pMarkerRN, unsigned char recursive) Line 234	C++
 	vcruntime140_1.dll!__CxxFrameHandler4(EHExceptionRecord * pExcept, unsigned __int64 RN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC) Line 306	C++
 	ntdll.dll!RtlpExecuteHandlerForException()	Unknown
 	ntdll.dll!RtlDispatchException()	Unknown
 	ntdll.dll!KiUserExceptionDispatch()	Unknown
 	[External Code]	
 	msvcp140.dll!__ExceptionPtrRethrow(const void * _PtrRaw) Line 536	C++
 	tbb.dll!std::rethrow_exception(std::exception_ptr _Ptr) Line 304	C++
 	tbb.dll!tbb::internal::tbb_exception_ptr::throw_self() Line 339	C++
 	tbb.dll!tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task & parent, tbb::task * child) Line 815	C++
 	[Inline Frame] tbb.dll!tbb::internal::generic_scheduler::local_spawn_root_and_wait(tbb::task *) Line 738	C++
 	tbb.dll!tbb::internal::generic_scheduler::spawn_root_and_wait(tbb::task & first, tbb::task * & next) Line 746	C++
 	[Inline Frame] blender.exe!tbb::task::spawn_root_and_wait(tbb::task &) Line 809	C++
 	[Inline Frame] blender.exe!tbb::interface9::internal::start_for<tbb::blocked_range<int>,tbb::internal::parallel_for_body<`ccl::PathTrace::path_trace'::`2'::<lambda_1>,int>,tbb::auto_partitioner const>::run(const tbb::blocked_range<int> &) Line 95	C++
 	[Inline Frame] blender.exe!tbb::parallel_for(const tbb::blocked_range<int> &) Line 215	C++
 	[Inline Frame] blender.exe!tbb::strict_ppl::parallel_for_impl(int) Line 283	C++
 	[Inline Frame] blender.exe!tbb::strict_ppl::parallel_for(int) Line 316	C++
 	blender.exe!ccl::PathTrace::path_trace(ccl::RenderWork & render_work) Line 388	C++
 	blender.exe!ccl::PathTrace::render_pipeline(ccl::RenderWork render_work) Line 199	C++
 	blender.exe!ccl::PathTrace::render(const ccl::RenderWork & render_work) Line 169	C++
 	blender.exe!ccl::Session::run_main_render_loop() Line 201	C++
 	blender.exe!ccl::Session::thread_render() Line 265	C++
 	blender.exe!ccl::Session::thread_run() Line 240	C++
 	[Inline Frame] blender.exe!std::_Func_class<void>::operator()() Line 874	C++
 	blender.exe!ccl::thread::run(void * arg) Line 39	C++
 	[External Code]	

tbb appears to be rethrowing an exception from within the oneAPI kernel.
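
The rethrow pattern that the call stack suggests can be sketched as follows. This is a minimal, single-threaded illustration, not TBB or Cycles code: `TaskGroup` and `render_samples` are hypothetical names, and `std::runtime_error` stands in for `sycl::_V1::runtime_error`. A task's exception is captured where it is thrown and rethrown at the point where the caller waits. If nothing catches it there, it unwinds out of the thread and ends in `terminate()`/`abort()`, which matches the top frames of the trace.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Single-threaded stand-in for a task scheduler (hypothetical, not TBB's API).
class TaskGroup {
 public:
  // Run a task now; if it throws, keep the exception instead of propagating it.
  void run(const std::function<void()> &task)
  {
    try {
      task();
    }
    catch (...) {
      if (!captured_) {
        captured_ = std::current_exception();
      }
    }
  }

  // Rethrow the first captured exception on the waiting (caller's) side.
  void wait()
  {
    if (captured_) {
      std::exception_ptr e = captured_;
      captured_ = nullptr;
      std::rethrow_exception(e);
    }
  }

 private:
  std::exception_ptr captured_;
};

// Hypothetical caller. Without the try/catch, the rethrown exception would
// propagate further up and, if never caught, end in std::terminate().
std::string render_samples(const std::function<void()> &kernel)
{
  TaskGroup group;
  group.run(kernel);
  try {
    group.wait();
  }
  catch (const std::exception &e) {
    return std::string("render failed: ") + e.what();
  }
  return "ok";
}
```

Calling `render_samples` with a kernel that throws `std::runtime_error("device lost")` returns an error string instead of aborting the process, which is the behavior one would want at the `parallel_for` call site.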


@sentharn I can confirm this crash with my Intel® Arc™ A770 Graphics. For me, however, the crash happens regardless of Embree4/RTHW usage and is related to denoising: without denoising everything works fine, but with denoising enabled the crash appears.
In fact, this problem with denoising enabled is not a oneAPI bug but a general Blender issue: CPU rendering with denoising enabled also crashes in this scene (and works fine without denoising), and CUDA/OptiX rendering gives me an "illegal address" error, again only with denoising enabled.
Still, there is certainly an issue in the oneAPI backend source code, because the code should exit gracefully when a computation on the GPU fails (as CUDA/OptiX do in this same situation) and report the problem in the GUI. The code for this is already present, but it does not seem to work properly in this particular case. I will take a look and try to resolve it.
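
As a rough illustration of what "exit gracefully" could mean here, the sketch below wraps kernel submission so that a failure is stored as an error message rather than escaping as an exception. It is an assumption-laden example, not the Cycles oneAPI backend: `GuardedQueue` and its methods are hypothetical, and `std::exception` stands in for the SYCL exception types.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical wrapper around a device queue; not the actual Cycles class.
class GuardedQueue {
 public:
  // Run `kernel`. On failure, store a message and return false instead of
  // letting the exception escape. After the first failure, later kernels
  // are skipped so the render can stop cleanly.
  bool enqueue(const std::function<void()> &kernel)
  {
    if (has_error()) {
      return false;
    }
    try {
      kernel();
      return true;
    }
    catch (const std::exception &e) {
      error_ = e.what();
    }
    catch (...) {
      error_.clear();
    }
    if (error_.empty()) {
      error_ = "unknown device error";
    }
    return false;
  }

  bool has_error() const
  {
    return !error_.empty();
  }

  // Message that a UI layer could show to the user.
  const std::string &error_message() const
  {
    return error_;
  }

 private:
  std::string error_;
};
```

A render loop built on such a wrapper would check the return value of `enqueue` after each submission and surface `error_message()` in the interface, rather than relying on an exception reaching a handler further up the stack.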

Member

> Yet, there is for sure some issue in oneAPI backend source code, because code should exit gracefully in case of fail computation on GPU (like CUDA/Optix are doing this in this same situation)... I will take a look and will try to resolve it.

Will mark this as confirmed for now.


Nvidia GPUs are reporting this error only in 3.6 (maybe we need a separate report to handle this?)

Pratik Borhade added the Module: Render & Cycles and Status: Confirmed labels and removed the Status: Needs Triage label 2023-05-01 11:41:45 +02:00

@PratikPB2123
> Nvidia GPUs are reporting this error only in 3.6 (maybe we need a separate report to handle this?)
Something is wrong on the Blender 3.6 side (and 3.6 only) with the execution of this scene in general: CPU rendering crashes as soon as you enable denoising here. So a separate report is definitely needed; let's keep this ticket for the oneAPI crash.

Blender Bot added the Status: Resolved label and removed the Status: Confirmed label 2023-05-03 12:06:37 +02:00
Reference: blender/blender#107356