Cycles oneAPI device #96840

Open
opened 2022-03-28 22:51:03 +02:00 by Nikita Sirgienko · 58 comments

Tasks

For release in Blender 3.3:

  • oneAPI host and kernel code
    • NanoVDB support
  • Windows
    • JIT compilation
    • AoT compilation (disabled due to build times in CI)
  • Linux
    • JIT compilation
    • AoT compilation (disabled due to build times in CI)
  • User manual update
  • Benchmark graph for release notes
  • Wiki instructions for building

For later:

  • [x] Hardware ray tracing usage
  • [x] Host memory as fallback when out of VRAM
  • [ ] Hardware texture sampler usage
  • [ ] Memory sharing between devices
  • [ ] Simplify build process:
    • [ ] Ability to use the SDK provided by Intel instead of building our own compiler? To be reconsidered once moving to the new C++11 ABI.
    • [x] Remove need to use shared libraries? Dynamic libraries are still needed, but integration was improved with https://developer.blender.org/rB7eeeaec6da33971ab7805c9a4bfd5f4e186273d1
    • [x] Faster AoT build times: partially addressed by https://developer.blender.org/rBdf29211eeb59f54079123e2bc82578a561431290

Compiler Build Instructions

Additional build steps:
For Windows:

  1. Compile the intel/llvm project (currently recommended version: sycl-nightly/20221019; later versions are also supported):

```
git clone https://github.com/intel/llvm -b sycl-nightly/20221019
python .\llvm\buildbot\configure.py
python .\llvm\buildbot\compile.py
```

Or download prebuilt binaries from the release page: https://github.com/intel/llvm/releases/tag/sycl-nightly%2F20221019

  2. Download the latest dGPU “Intel® Graphics Offline Compiler for OpenCL™ Code” from the standalone components webpage (https://software.intel.com/content/www/us/en/develop/articles/oneapi-standalone-components.html) and extract it to a path of your choice.
  3. Perform the Blender configuration “make …” as usual.
  4. Edit these CMake options (with SYCL_COMPILER_DIR=.\llvm\build\install from step 1) and rebuild; see the example invocation below:

```
WITH_CYCLES_DEVICE_ONEAPI=1
WITH_CYCLES_ONEAPI_BINARIES=1
CYCLES_ONEAPI_SPIR64_GEN_DEVICES=dg2  # available targets can be listed with: ocloc compile --help
SYCL_COMPILER=%SYCL_COMPILER_DIR%\bin\clang++.exe
SYCL_INCLUDE_DIR=%SYCL_COMPILER_DIR%\include\sycl
SYCL_LIBRARY=%SYCL_COMPILER_DIR%\lib\sycl.lib
OCLOC_INSTALL_DIR=<path of your choice>
```
  5. sycl.dll and pi_level_zero.dll from .\llvm\build\install are needed at runtime and will be installed next to the Blender executable.

Step 1 is optional now; it's available from https://svn.blender.org/svnroot/bf-blender/trunk/lib/win64_vc15/
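For reference, a minimal sketch of applying these options to an existing Blender build directory (all paths here are assumed examples; adjust SYCL_COMPILER_DIR and the ocloc location to your setup):

```
:: Run from the Blender build directory that "make" created.
:: C:\src\llvm\build\install and C:\ocloc are hypothetical locations.
set SYCL_COMPILER_DIR=C:\src\llvm\build\install
cmake . ^
  -DWITH_CYCLES_DEVICE_ONEAPI=1 ^
  -DWITH_CYCLES_ONEAPI_BINARIES=1 ^
  -DCYCLES_ONEAPI_SPIR64_GEN_DEVICES=dg2 ^
  -DSYCL_COMPILER=%SYCL_COMPILER_DIR%\bin\clang++.exe ^
  -DSYCL_INCLUDE_DIR=%SYCL_COMPILER_DIR%\include\sycl ^
  -DSYCL_LIBRARY=%SYCL_COMPILER_DIR%\lib\sycl.lib ^
  -DOCLOC_INSTALL_DIR=C:\ocloc
cmake --build .
```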

For Linux:

  1. Compile the intel/llvm project (currently recommended version: sycl-nightly/20221019; later versions are also supported):

```
git clone https://github.com/intel/llvm -b sycl-nightly/20221019
python ./llvm/buildbot/configure.py
python ./llvm/buildbot/compile.py
```
  2. Follow these steps to build the latest release of the Graphics Compiler: https://github.com/intel/intel-graphics-compiler/blob/master/documentation/build_ubuntu.md and install it to ./llvm/build/install/lib/igc.
  3. Adjust the runpath of libsycl.so and libpi_level_zero.so: patchelf --set-rpath '$ORIGIN' *.so
  4. Follow these steps to build the latest release of ocloc: https://github.com/intel/compute-runtime/blob/master/BUILD.md and install it to ./llvm/build/install/lib/ocloc.
  5. Perform the Blender configuration “make …” as usual.
  6. Edit these CMake options (with SYCL_COMPILER_DIR=./llvm/build/install from step 1) and rebuild; see the example invocation below:
```
WITH_CYCLES_DEVICE_ONEAPI=1
WITH_CYCLES_ONEAPI_BINARIES=1
CYCLES_ONEAPI_SPIR64_GEN_DEVICES=dg2  # available targets can be listed with: ocloc compile --help
SYCL_COMPILER=${SYCL_COMPILER_DIR}/bin/clang++
SYCL_INCLUDE_DIR=${SYCL_COMPILER_DIR}/include/sycl
SYCL_LIBRARY=${SYCL_COMPILER_DIR}/lib/libsycl.so
OCLOC_INSTALL_DIR=<path of your choice, if different from ${SYCL_COMPILER_DIR}/lib/ocloc>
```
  7. libsycl.so and libpi_level_zero.so from ./llvm/build/install are needed at runtime and will be installed to ./lib.

Steps 1-4 are optional, since these are now available from https://svn.blender.org/svnroot/bf-blender/trunk/lib/linux_centos7_x86_64/
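For reference, a minimal sketch of the same configuration on Linux, including the runpath fix from step 3 (SYCL_COMPILER_DIR is an assumed example location; adjust to your checkout):

```
# Assumed location of the intel/llvm install from step 1.
export SYCL_COMPILER_DIR=$HOME/src/llvm/build/install

# Step 3: make the runtime libraries resolve relative to their own location.
patchelf --set-rpath '$ORIGIN' "$SYCL_COMPILER_DIR"/lib/libsycl.so "$SYCL_COMPILER_DIR"/lib/libpi_level_zero.so

# Apply the CMake options in an existing Blender build directory, then rebuild.
cmake . \
  -DWITH_CYCLES_DEVICE_ONEAPI=1 \
  -DWITH_CYCLES_ONEAPI_BINARIES=1 \
  -DCYCLES_ONEAPI_SPIR64_GEN_DEVICES=dg2 \
  -DSYCL_COMPILER="$SYCL_COMPILER_DIR/bin/clang++" \
  -DSYCL_INCLUDE_DIR="$SYCL_COMPILER_DIR/include/sycl" \
  -DSYCL_LIBRARY="$SYCL_COMPILER_DIR/lib/libsycl.so" \
  -DOCLOC_INSTALL_DIR="$SYCL_COMPILER_DIR/lib/ocloc"
cmake --build .
```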

Author
Member

Added subscribers: @Sirgienko, @Stefan_Werner

Author
Member

The suggested patch adding this functionality, with a detailed description, is tracked as D14480 (https://archive.blender.org/developer/D14480).

Author
Member

A preview of how this implementation works: https://youtu.be/dyBbSJAW-Js

Member

Added subscriber: @Alaska

Member

Changed status from 'Needs Triage' to: 'Confirmed'

Member

Added subscriber: @xavierh


Added subscriber: @brecht


Can you give an overview of which new files would be included in the Blender install and where they are located?

  2. Download the new oneAPI Level Zero dependency and put it into the "lib\win64_vc15" directory, in a new directory "level-zero".

This should be automated as part of our precompiled libraries builder in build_files/build_environment/CMakeLists.txt.

  5. Replace the oneAPI default compiler with the latest one using these commands (need to be executed from the directory with the downloaded offline compiler):

Am I understanding correctly that the "oneAPI Base toolkit" contains the compiler, but the version it includes on Windows is too old, so we temporarily have to copy a newer version over it? If so, I assume this base toolkit will be updated with the newer version in the future?

  6. Perform the Blender configuration “make …” from the oneAPI console environment.

I don't think this is something we want to be doing; the whole Blender build should not run in a different environment just for oneAPI.

  7. Run the compiled Blender from the oneAPI console environment, or install the oneAPI standalone runtime, or export the oneAPI DLL paths into the system PATH.

This I assume is also a temporary limitation? Since we can't ship Blender like this.

  4. Load the oneAPI environment

Similar comment to the Windows case.

  5. Remove oneTBB paths from the library and header PATHs in order to avoid problems with Blender third-party dependencies, which don’t support oneTBB.

I suppose this comes from running the full Blender build process inside the oneAPI environment, which we should not do.

Author
Member

In #96840#1331713, @brecht wrote:
Can you give an overview of which new files would be included in the Blender install and where they are located?

These files will be new:

  • cycles_kernel_oneapi.dll (~20 MB, but the size depends on the implementation, so it may change, though not dramatically)
  • Source files in 3.2\scripts\addons\cycles\source\kernel\device, similar to other backends (but only if required)
  2. Download the new oneAPI Level Zero dependency and put it into the "lib\win64_vc15" directory, in a new directory "level-zero".

This should be automated as part of our precompiled libraries builder in build_files/build_environment/CMakeLists.txt.

Yes, it is possible; we already have some code for this, but it is not in the patch yet (we plan to add it at the beginning of the Bcon2 phase, if possible).

  5. Replace the oneAPI default compiler with the latest one using these commands (need to be executed from the directory with the downloaded offline compiler):

Am I understanding correctly that the "oneAPI Base toolkit" contains the compiler, but the version it includes on Windows is too old, so we temporarily have to copy a newer version over it? If so, I assume this base toolkit will be updated with the newer version in the future?

Yes, exactly; this workaround is temporary. After a few releases of the "oneAPI Base toolkit", it won't be needed anymore.

  6. Perform the Blender configuration “make …” from the oneAPI console environment.

I don't think this is something we want to be doing; the whole Blender build should not run in a different environment just for oneAPI.

It is possible to change the CMake code so that it finds the required oneAPI files (like the compiler) automatically from the default environment.

  7. Run the compiled Blender from the oneAPI console environment, or install the oneAPI standalone runtime, or export the oneAPI DLL paths into the system PATH.

This I assume is also a temporary limitation? Since we can't ship Blender like this.

True, we are planning to give Blender the ability to build and ship binary files for oneAPI like it has been done for other dependencies (the oneAPI source code is open source).
We plan to first check that everything works fine and then suggest the required change during the Bcon2 phase; then Blender won't need any runtime installation (but will need to ship 4 additional DLLs, ~5 MB in total).

  4. Load the oneAPI environment

Similar comment to the Windows case.

Similar to Windows, this is a temporary solution and will be improved later.

  5. Remove oneTBB paths from the library and header PATHs in order to avoid problems with Blender third-party dependencies, which don’t support oneTBB.

I suppose this comes from running the full Blender build process inside the oneAPI environment, which we should not do.

True: if Blender searches for the oneAPI files without loading the environment, this problem won't appear.

Author
Member

Updated build steps: there is no need to have the oneAPI environment loaded anymore, and the problem with oneTBB on Linux is also gone.


Thanks, but we should not require modifying PATH or LD_LIBRARY_PATH either. We don't require it for any other Blender features, and that sort of thing has a tendency to cause hard-to-track issues.

It should be a CMake variable that you can pass, which is then used to locate the files, or if necessary added to the PATH or LD_LIBRARY_PATH for just the oneAPI kernel compilation commands.
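To illustrate the second option, a sketch (hypothetical file names; -fsycl is the DPC++ single-source flag): the toolchain is prepended to PATH for the single kernel-compilation command only, leaving the global environment untouched:

```
# Hypothetical kernel-compilation step: the PATH change applies only to this
# one command, not to the rest of the build or the user's shell.
env PATH="$SYCL_COMPILER_DIR/bin:$PATH" \
  "$SYCL_COMPILER_DIR/bin/clang++" -fsycl -shared -fPIC \
  kernel_oneapi.cpp -o cycles_kernel_oneapi.so
```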

In #96840#1331738, @Sirgienko wrote:
We plan to first check that everything works fine and then suggest the required change during the Bcon2 phase; then Blender won't need any runtime installation (but will need to ship 4 additional DLLs, ~5 MB in total).

Ok, but note we are not going to enable oneAPI in Blender builds until this is resolved.

Member

Added subscriber: @LazyDodo

Member

Trying to follow along here, just to get a feel for what it would take to get this to build. My initial feedback on the instructions for Windows:

Install latest Intel® oneAPI Base toolkit for Windows

Can you be more specific about which packages are required? Installing 30 GB worth of stuff, some of which will interfere with tools I use on a daily basis (VTune breaks xperf, for instance), isn't great; if we could trim this down to just the bare essentials required for building, that'd be ideal.

Download new oneAPI Level Zero dependency and put it into "lib\win64_vc15" directory in new directory "level-zero".

I put some preliminary libs (Windows only) in SVN, so this step shouldn't be needed anymore. But I admit I haven't quite been able to get the oneAPI stuff building end to end, so I haven't been able to confirm they are correct.

Download latest “Intel® Graphics Offline Compiler for OpenCL™ Code” from the standalone components webpage.

This one seems easy enough; I grabbed the latest (as of now, 101.1404). However, having a minimum required version documented wouldn't be the worst idea here.

Install latest driver for Intel GPU on Windows.

The CI env is likely going to be virtualized; why is there a driver requirement?

Replace the oneAPI default compiler with the latest one using these commands (need to be executed from the directory with the downloaded offline compiler):

Not ideal, but seems easy enough.

and then Blender won't need any runtime installation (but will need to ship 4 additional dlls with size ~5MB in total)

I'm wary here: what are these 4 dlls and where do they come from?


To clarify my expectations:

  • For the initial code merge and 3.2 release, we can accept some build and code complexity and clean it up later, given the importance of this feature.
  • As a minimum for this to be enabled for 3.2 builds, it must be possible for users to use this feature by installing just the Intel GPU driver and nothing else, and for this feature to be gracefully disabled if there is no Intel GPU or driver. It must also be possible for us to build this on our buildbot machines without an Intel GPU driver.
  • Longer term, I really want to see the complexity of this reduced to be more similar to other device backends. Ideally:
    • No host-side dependency on SYCL, but purely a dynamically loaded Level Zero driver library.
    • Bundled binary or SPIR-V kernels that are not DLLs, but more similar to CUDA or HIP binary kernel files that can be loaded and have kernels invoked without single-source compilation mechanisms.
    • A way to compile such binary kernels with a simple command like /path/to/compiler source_file -o binary_file, without external CMake projects, Visual Studio XML files, special environments, etc.
Author
Member

@brecht, can you please clarify, why this task is connected to BF Blender 2.90 project?


Seems it got automatically added when I moved this to Under Development, I'll remove that.

Author
Member

Longer term, I really want to see the complexity of this be reduced to be more similar to other device backends.

I got your point about the expectations, but can you please be more specific about what you mean by "long term" (just to be on the same wavelength here)? I mean, should this be addressed before Blender 3.3, before Blender 3.4, "by next year", or even "by 2023"?

Author
Member

In #96840#1332381, @LazyDodo wrote:

Install latest Intel® oneAPI Base toolkit for Windows

Can you be more specific about which packages are required? Installing 30 GB worth of stuff, some of which will interfere with tools I use on a daily basis (VTune breaks xperf, for instance), isn't great; if we could trim this down to just the bare essentials required for building, that'd be ideal.

Well, VTune is not needed for compilation, yes. You have a good point here: I will check which components exactly are truly required, and I will update the build steps accordingly.

Download the new oneAPI Level Zero dependency and put it into the "lib\win64_vc15" directory, in a new directory "level-zero".
Download latest “Intel® Graphics Offline Compiler for OpenCL™ Code” from the standalone components webpage.

This one seems easy enough; I grabbed the latest (as of now, 101.1404). However, having a minimum required version documented wouldn't be the worst idea here.

The lowest supported version for compilation and execution will be 101.1661, if I am not mistaken about the version number. It is not yet released, but will be publicly available this week.

Install latest driver for Intel GPU on Windows.

The CI env is likely going to be virtualized; why is there a driver requirement?

Well, it is actually needed for execution only, so if the CI won't do any execution, then you don't need a driver on Windows at all. On Linux you will still need to get a few GPU compiler related packages that are distributed along with the driver, but there is still no need to have an actually working driver and HW for compilation either.

and then Blender won't need any runtime installation (but will need to ship 4 additional dlls with size ~5MB in total)

I'm wary here: what are these 4 dlls and where do they come from?

Well, you will need the "sycl.dll", "ze_loader.dll", "pi_opencl.dll" and "pi_level_zero.dll" shared libraries in order to be able to load (and run) oneAPI rendering in Blender (otherwise, the initialization code of the oneAPI backend will find that these libraries are not available and will just safely disable the oneAPI implementation). Right now these libraries are distributed with the Intel® oneAPI Base toolkit or with the Intel® oneAPI DPC++/C++ Compiler Runtime for Windows, and it is also possible to get them by building oneAPI from source (I guess this will be the easiest way for Blender, because then Blender can just distribute them like other third-party Blender dependencies, without any need for the end user to install something).

Author
Member

In #96840#1332365, @brecht wrote:
Thanks, but we should not require modifying PATH or LD_LIBRARY_PATH either. We don't require it for any other Blender features, and that sort of thing has a tendency to cause hard-to-track issues.

It should be a CMake variable that you can pass, which is then used to locate the files, or if necessary added to the PATH or LD_LIBRARY_PATH for just the oneAPI kernel compilation commands.

I will try to make that happen and modify paths only for the part of the build system where the oneAPI kernel compilation happens (because it is needed only there).
Also, it seems the entire idea of the "oneAPI environment" doesn't play very well here, so it is possible that oneAPI itself will address this issue in the future; then we will be able to remove all the CMake code related to this.

Member

and then Blender won't need any runtime installation (but will need to ship 4 additional dlls with size ~5MB in total)

I'm wary here: what are these 4 dlls and where do they come from?

Well, you will need the "sycl.dll", "ze_loader.dll", "pi_opencl.dll" and "pi_level_zero.dll" shared libraries in order to be able to load (and run) oneAPI rendering in Blender (otherwise, the initialization code of the oneAPI backend will find that these libraries are not available and will just safely disable the oneAPI implementation). Right now these libraries are distributed with the Intel® oneAPI Base toolkit or with the Intel® oneAPI DPC++/C++ Compiler Runtime for Windows, and it is also possible to get them by building oneAPI from source (I guess this will be the easiest way for Blender, because then Blender can just distribute them like other third-party Blender dependencies, without any need for the end user to install something).

On Windows, ze_loader.dll comes with the graphics drivers and is installed in System32, so there is no need to distribute it or ask the user to install anything (beyond graphics drivers, of course).
sycl.dll, pi_opencl.dll and pi_level_zero.dll can also be grabbed from the "Intel DPC++/C++ Compiler for Windows" installation (https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html) as redistributables (they're listed in licensing\credist.txt); we should also be able to recompile these from sources.


In #96840#1332541, @Sirgienko wrote:

Longer term, I really want to see the complexity of this be reduced to be more similar to other device backends.

I got your point about the expectations, but can you please be more specific about what you mean by "long term" (just to be on the same wavelength here)? I mean, should this be addressed before Blender 3.3, before Blender 3.4, "by next year", or even "by 2023"?

I don't have specific expectations about timelines here; the sooner the better, but if it takes until next year, so be it. After the initial version is in, we can discuss this in more detail, since I don't know exactly how deep some limitations are, or how exactly it will work for OSL or potential hardware ray tracing and how that will all fit together.


Added subscriber: @Sergey


On Windows, ze_loader.dll comes with graphics drivers and is installed in System32, no need to distribute it or ask the user to install anything (beyond graphics drivers of course).

Is it the same on Linux, or do we need to do something special on this platform?


sycl.dll, pi_opencl.dll and pi_level_zero.dll can also be grabbed from "Intel DPC++/C++ Compiler for Windows" installation as redistributables (they're listed in licensing\credist.txt), we should also be able to recompile these from sources.

According to credist.txt, these libraries are covered by the "Intel End User License Agreement for Developer Tools", which does not seem to be compatible with the GPL. If that's the case, then we cannot include those libraries in the Blender release.

Member

In #96840#1333414, @Sergey wrote:

On Windows, ze_loader.dll comes with graphics drivers and is installed in System32, no need to distribute it or ask the user to install anything (beyond graphics drivers of course).

Is it the same on Linux, or do we need to do something special in this platform?

On Linux, libze_loader.so comes from the level-zero package; we advertise it in our gfx driver installation guides, but I don't know if it tends to be there in practice: https://dgpu-docs.intel.com/installation-guides/index.html

In #96840#1333613, @Sergey wrote:

sycl.dll, pi_opencl.dll and pi_level_zero.dll can also be grabbed from "Intel DPC++/C++ Compiler for Windows" installation as redistributables (they're listed in licensing\credist.txt), we should also be able to recompile these from sources.

According to the credist.txt these libraries are covered with "Intel End User License Agreement for Developer Tools" which does not seem to be compatible with GPL. If that's the case then we can not put those libraries to the Blender release.

Would using binaries you'd compile from https://github.com/intel/llvm be acceptable? The compilation itself is quick; I've tried the 2022-WW13 branch and it "works on my machine". If it's the solution you want to go with, we can refine it to make it more reliable.

```
python buildbot\configure.py
python buildbot\compile.py -t pi_level_zero.dll
python buildbot\compile.py -t pi_opencl.dll
python buildbot\compile.py -t sycl.dll
```

That generates these in build\bin.
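For local testing, one might then copy the generated runtime libraries next to the Blender executable (a sketch; both paths below are assumed examples, not fixed locations):

```
:: Assumed paths; adjust to your llvm checkout and Blender build output.
copy build\bin\sycl.dll C:\blender-git\build\bin\Release\
copy build\bin\pi_level_zero.dll C:\blender-git\build\bin\Release\
```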

Member

While that may generate the DLLs, they are intrinsically linked to the dpcpp version they came out of. If they add a parameter somewhere or change a type in the future, all hell will break loose when people download a new compiler from the Intel website and run it against our 3 libs (also, sycl.dll needs two other DLLs, svml_dispmd.dll and libmmd.dll, so it's really 5). The only ways I could see this work are:

  1. The license in credist.txt changes and we can take them from the installed compiler instance.
  2. We build the whole thing from source and ship it in SVN.

@xavierh, how reliable is this from the point of view of a newer DPC++/SYCL compiler? Those libraries seem to be quite low-level, easy to run out of sync with the compiler.
If we are to compile parts of the compiler, it might be better to compile the entire compiler, avoiding incompatibility surprises. But that's quite an escalation of the scale of the project.
It might be easier, and better for open source in general, if the redistributable libraries from the Intel package retained the code's license (which is Apache 2, I believe, in this case).

Member

We should be able to remove the need for svml_dispmd.dll and libmmd.dll by using their static versions.

The reliability is a question for me; as I said, it's working on my machine, but yes, there may be compatibility issues. That's something I need to get confirmation on.
Building the entire compiler is also an option, but it'd be massive to ship it bundled, and if we don't ship it, it will not solve potential compatibility issues with driver runtimes.

I'll ask about the credist.txt license change; I agree it'd be better.

Member

Compatibility with drivers should be as good as with the proprietary version, but from https://github.com/intel/llvm/blob/sycl/README.md:

None of the branches in the project are stable or rigorously tested for production quality control, so the quality of these releases is expected to be similar to the daily releases.

The Intel(R) oneAPI DPC++ Compiler is considered a downstream project and has a different license.
Is it really unacceptable in your view to have users download its runtime separately, if the other option right now is to use the less tested open source version (which, however, also solves the CentOS 7 compatibility issue until it is addressed downstream)?
In both cases the proprietary ocloc can be used: https://intel.github.io/llvm-docs/GetStartedGuide.html#gpu + https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html


Is it really unacceptable in your view to have users download its runtime separately?

Downloading runtime libraries is against Blender's product vision. No runtime libraries/SDK installation should be needed in order to use Blender.
Drivers are a bit special in this regard, but those are typically pre-installed by an OS.

option right now is to use the less tested open source version

Is it just some special download, or do we need to compile the toolchain ourselves?

Member

Noted!

The open source version is as I've documented a few comments above. It's https://github.com/intel/llvm, and getting the DLLs is a short compilation (the whole project doesn't need to be built). Let us find the "best" commit to use.

```
python buildbot\configure.py
python buildbot\compile.py -t pi_level_zero.dll
python buildbot\compile.py -t pi_opencl.dll
python buildbot\compile.py -t sycl.dll
```

We should be able to remove the need for svml_dispmd.dll and libmmd.dll by using their static versions.

I believe that from a licensing point of view, dynamic or static linking is not that relevant. I do not see how static linking can change the license of a library; it just makes it harder to see that the library is used, but its (partial) content is still in the (derived) product.

The Blender code is GPL2+, and some components (like Cycles) are Apache 2, which renders the final Blender package GPL3+. We cannot change this. Everything that is distributed with the Blender install must be compatible with GPL3+.

The open source version is as I've documented a few comments above. It's https://github.com/intel/llvm, and getting the DLLs is a short compilation (the whole project doesn't need to be built). Let us find the "best" commit to use.

There is added complexity in this approach for developers. Ignoring that, I am not even sure it fully solves the packaging issue, due to svml_dispmd and libmmd.

If a proprietary compiler is to be used for a device, it should not require putting any proprietary component into the package. An example of this is the CUDA integration: a specific compiler needs to be installed on the builder machine, but no extra libraries are packaged with the software, and everything comes from the driver.

If there is a way to use an open-source compiler and libraries, we should use those. Ideally, we would not need to re-compile the entire toolchain from scratch, and the pre-compiled toolchain from oneAPI would be compatible with open source.

I think it is up to Intel to make Intel toolkits usable by open source projects. Those hybrid solutions are always hard to implement in practice and to verify from a licensing perspective.

Moving forward please contact foundation@blender.org to further discuss licensing topics.

Member

I'm looking into ways to avoid using svml_dispmd and libmmd at all, since static linking is for sure not the way. I've just noticed they're covered by the same EULA as the other redistributables; sorry for proposing this.

Member

pi_opencl.dll/libpi_opencl.so aren't needed anymore since 15c146091def58c6f1bd156611cda3efff982687

I've updated the overall build steps in the main task description.
Here is a more detailed script for getting the environment ready for compilation.
For JIT, only the intel/llvm compiler is needed. AoT needs our graphics compiler, which can be downloaded for Windows / built for Linux.

On Windows:

```
git clone https://github.com/intel/llvm -b sycl-nightly/20220501
python .\llvm\buildbot\configure.py
python .\llvm\buildbot\compile.py

# pi_level_zero and sycl libraries are the only runtime dependencies, generated in llvm/build/bin
```

On Linux:

```
INTEL_LLVM_TAG=sycl-nightly/20220501
SYCL_HOME=~/sycl_workspace; mkdir $SYCL_HOME

## Compile the intel/llvm stack
cd $SYCL_HOME
git clone https://github.com/intel/llvm -b ${INTEL_LLVM_TAG}

# CentOS 7 workaround: comment out the ABI replacements in CMakeLists (lines 69-72)
sed -i "69,72 {s/^/#/}" ${SYCL_HOME}/llvm/sycl/source/CMakeLists.txt

# Build DPC++
python $SYCL_HOME/llvm/buildbot/configure.py
python $SYCL_HOME/llvm/buildbot/compile.py

# pi_level_zero and sycl libraries are the only runtime dependencies, generated in llvm/build/bin
```
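As a quick sanity check of the freshly built toolchain (assuming a compatible GPU driver is present, and assuming sycl-ls, the device-listing tool that is part of intel/llvm, was built along with the rest), one can list the devices the runtime sees:

```
# List the SYCL devices visible to the freshly built runtime.
${SYCL_HOME}/llvm/build/bin/sycl-ls
```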

And on top of this, for AoT support, since current compile times are long:

On Windows:

```
# Download from https://software.intel.com/content/www/us/en/develop/articles/oneapi-standalone-components.html
# ("Intel® Graphics Offline Compiler for OpenCL™ Code")
# Note: dg2 not supported yet in 101.1960.
wget https://registrationcenter-download.intel.com/akdlm/irc_nas/18761/ocloc_win_101.1960.zip
7z e ocloc_win_101.1960.zip -ollvm\build\install\lib\ocloc
```

On Linux:

```
# Version set from https://github.com/intel/compute-runtime/releases and https://github.com/intel/intel-graphics-compiler/releases
LLVM_PROJECT_TAG=llvmorg-11.1.0
OCL_CLANG_TAG=ocl-open-110
LLVM_SPIRV_TAG=llvm_release_110
IGC_TAG=igc-1.0.11222
GMMLIB_TAG=intel-gmmlib-22.1.2
NEO_TAG=22.20.23198

SPIRV_TOOLS_TAG=sdk-1.3.204.1
SPIRV_HEADERS_TAG=sdk-1.3.204.1
IGC_HOME=${SYCL_HOME}/igc_workspace; mkdir ${IGC_HOME}

## Compile the Intel Graphics Compiler and its dependencies
cd ${IGC_HOME}
git clone https://github.com/intel/vc-intrinsics vc-intrinsics
git clone -b ${LLVM_PROJECT_TAG} https://github.com/llvm/llvm-project llvm-project
git clone -b ${OCL_CLANG_TAG} https://github.com/intel/opencl-clang llvm-project/llvm/projects/opencl-clang
git clone -b ${LLVM_SPIRV_TAG} https://github.com/KhronosGroup/SPIRV-LLVM-Translator llvm-project/llvm/projects/llvm-spirv
git clone -b ${SPIRV_TOOLS_TAG} https://github.com/KhronosGroup/SPIRV-Tools.git SPIRV-Tools
git clone -b ${SPIRV_HEADERS_TAG} https://github.com/KhronosGroup/SPIRV-Headers.git SPIRV-Headers
git clone -b ${IGC_TAG} https://github.com/intel/intel-graphics-compiler igc

# Compile IGC
cd ${IGC_HOME}
mkdir build; cd build
cmake -DCMAKE_INSTALL_LIBDIR=lib -DCMAKE_INSTALL_PREFIX=${SYCL_HOME}/llvm/build/install/lib/igc ../igc
make -j || true
# CentOS 7 workaround: neutralize the deprecated "register" keyword in the generated lex.CISA.cpp
sed -i "1 i\#define register" ${IGC_HOME}/build/IGC/visa/lex.CISA.cpp
# Continue compilation
make -j
make install

# Compile ocloc
# First, compile GMMlib
cd ${SYCL_HOME}
git clone -b ${GMMLIB_TAG} https://github.com/intel/gmmlib.git; cd gmmlib
mkdir build; cd build
cmake -DCMAKE_INSTALL_LIBDIR=lib -DCMAKE_INSTALL_PREFIX=${SYCL_HOME}/gmmlib/install ..
make -j; make install

# Then compile compute-runtime
cd ${SYCL_HOME}
git clone -b ${NEO_TAG} https://github.com/intel/compute-runtime.git; cd compute-runtime
mkdir build; cd build
cmake -DCMAKE_BUILD_TYPE=Release -DNEO_SKIP_UNIT_TESTS=1 -DNEO_BUILD_WITH_OCL=0 -DBUILD_WITH_L0=0 -DIGC_DIR=${SYCL_HOME}/llvm/build/install/lib/igc -DGMM_DIR=${SYCL_HOME}/gmmlib/install -DCMAKE_INSTALL_LIBDIR=lib -DCMAKE_INSTALL_PREFIX=${SYCL_HOME}/llvm/build/install/lib/ocloc ..
cmake --build offline_compiler
cmake --install offline_compiler
```
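Once ocloc is installed, the available AoT targets for CYCLES_ONEAPI_SPIR64_GEN_DEVICES can be listed from the freshly built binary (install prefix as set above; depending on the install layout, the binary may sit in a bin/ subdirectory):

```
# Lists supported device targets (e.g. dg2) for AoT compilation.
${SYCL_HOME}/llvm/build/install/lib/ocloc/bin/ocloc compile --help
```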
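To sanity-check the resulting ocloc install, something like this should work (a sketch; the exact layout under the install prefix may differ):

```
# smoke test: the offline compiler should print its usage and supported options
${SYCL_HOME}/llvm/build/install/lib/ocloc/bin/ocloc --help
```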
Member

current concerns:

  1. Windows uses `sycl-nightly/20220208`, Linux uses `sycl-nightly/20220501`. Please pick a single version; all platforms should be on the same version, we don't want different bugs on different platforms.
  2. Windows has a binary version of ocloc with no clear license attached. If this is a build-time-only component this could be OK, but it is concerning, and only allowed if we're 100% sure we do not have to distribute any of the files inside that zip.
  3. Windows ocloc: `Note: dg2 not supported yet in 101.1960.` Then what's the point of it? Last time I tried to build on Windows, I had to use an older arch to test, and I killed it after about 6 or 8 hours... not doing that again.

the rest *looks* easy enough to script.

Member
  1. You can use sycl-nightly/20220501 for both Linux and Windows; we'll make sure to recommend the same version on both OSes.
  2. ocloc and the graphics compiler don't have to be redistributed, neither on Windows nor on Linux; this is strictly a build-time component used for AoT binaries.
  3. On your build system, you can use a version I put on our shared drive to support dg2. The point of it is that anybody else can still use the public package and target older versions, and in the near future the public package will also support dg2. There is an issue with compile time when targeting older GPUs, but ~1h should still be enough at the moment.
Member

Scripted sycl (and its deps) for both Windows and Linux should be available in the deps builder of `cycles_oneapi`.

still on my todo list:

- Compile Intel Graphics compiler and dependencies
- ocloc
Member

All the build scripts should be complete; they *should* build with no issues on both Windows and CentOS (CentOS 7 disclaimer: the stock flex is outdated and needs to be replaced with a 2.6.4 build from source, or you'll get build errors). The Blender Windows build with dg2 enabled that I made, @xavierh was able to confirm as working. On Linux all the puzzle pieces are in place, but because my CentOS container was running out of RAM I could not complete a Blender build there.

What is not done yet is the harvesting; I'm not entirely sure how much of this we want to add to SVN. On Windows you can get dpcpp down to < 200M, so that shouldn't be an issue. On Linux, however... my install folders:

```
[root@centos7 Release]# du -h -d1 |grep "dpcpp\|igc\|ocloc"
2.3G    ./dpcpp
1.2M    ./ocloc
1.4G    ./igc
```

Given the place this branch is now at: @brecht / @Sergey, feel free to step in and take over; I've done all I can.

Member

I think you double counted igc in dpcpp. On my system it's closer to 300M.
libigc is indeed very large... gzip compression improves the situation a little bit, but we can make it more reasonable by stripping debug symbols:
```
994M  libigc.so.1.0.1
327M  libigc.so.1.0.1.gz
 85M  libigc.so.1.0.1_stripped-debug-symbols
```
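For reference, the stripping can be done with plain binutils; a minimal sketch (keeping a detached debug file is optional):

```
# keep the debug info aside, then strip it from the library itself
objcopy --only-keep-debug libigc.so.1.0.1 libigc.so.1.0.1.debug
objcopy --strip-debug libigc.so.1.0.1
```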

Member

Soo... some good news, some bad news.

In the good news column: on my Ubuntu-based WSL I managed to generate a kernel, no issues!

The bad news column is... more interesting...

Got some more RAM for the CentOS container; it looks like the problem was *NOT* memory. I hadn't looked too closely and just assumed it was RAM since it was a severely RAM-constrained environment; it turns out ocloc just sits at ~2.1GB for 35 minutes or so before crashing. Crash log in [P2977](https://archive.blender.org/developer/P2977.txt), actual peek with gdb in [P2978](https://archive.blender.org/developer/P2978.txt).
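(For reference, the gdb peek is just attaching to the stuck process; a minimal sketch:)

```
# attach to the running ocloc and dump backtraces for all threads
gdb -p "$(pidof ocloc)" -batch -ex "thread apply all bt"
```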

Then I tried building (dpcpp+igc+ocloc+kernel) on my Ubuntu 20.04.1-based WSL setup, no issues there. Then I figured "maybe it's the container?", so I reinstalled CentOS 7 as a full VM with plenty of RAM... [half a day of compiling stuff later]... same issue as the container! (crash after 35 mins, 2.1GB tops)

Next idea: is it the build or the environment? Transferred dpcpp/igc/ocloc from CentOS 7 to Ubuntu 20.04.1 (needed some old ncurses lib before it ran, but nothing too crazy); the kernel builds just fine (and yes, I've validated that the right binaries were used).

So the build of dpcpp/igc/ocloc on CentOS is fine, but it won't have a happy time running on CentOS.

Next idea: maybe it's the memory allocator? Let's swap it out! When preloading jemalloc: CentOS goes 35→15 mins (but still crashes); Ubuntu stays at 27 mins, no difference.
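The swap itself is just a preload; a minimal sketch, assuming jemalloc's usual install location (the exact .so name/path varies per distro):

```
# preload jemalloc for everything the kernel build spawns
export LD_PRELOAD=/usr/lib64/libjemalloc.so.1
make  # or whatever target runs the AoT kernel compilation
```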

Next idea: it's kinda iffy that it tops out at ~2GB memory usage, maybe a process limit? I wrote a quick app that just allocates tons of RAM and initializes it with some random values; no issues going over 2GB.
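A minimal sketch of that kind of allocation test (hypothetical, not the exact app used here):

```
# alloctest.c: allocate well past 2GB in 256MB chunks and touch every page
cat > alloctest.c <<'EOF'
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
  const size_t chunk = 256UL * 1024 * 1024;   /* 256MB per allocation */
  for (int i = 1; i <= 16; i++) {             /* 16 * 256MB = 4GB total */
    char *p = malloc(chunk);
    if (!p) { printf("allocation failed at %dMB\n", i * 256); return 1; }
    memset(p, i, chunk);                      /* touch pages so they're committed */
    printf("allocated and touched %dMB so far\n", i * 256);
  }
  return 0;
}
EOF
cc alloctest.c -o alloctest && ./alloctest
```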

Doing a debug build of igc now, hoping it'll be chattier about what is upsetting it, but I'm open to trying other stuff you think may be useful.

Member
```
ocloc: /root/blender-git/build_linux/deps/build/igc/src/external_igc/visa/LoopAnalysis.cpp:29: vISA::G4_BB* vISA::ImmDominator::InterSect(vISA::G4_BB*, int, int): Assertion `finger1 == kernel.fg.getEntryBB() || finger2 == kernel.fg.getEntryBB()' failed.
```

Alright, not what I was hoping for, given I have no desire to go debug ocloc, but progress nonetheless.

Member

I checked as well and ran into the same stack trace when building from CentOS 7 end-to-end. More complete version:

```
#2  0x00007ff9bd2521a6 in __assert_fail_base (fmt=0x7ff9bd3adce0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7ff9b65e25a8 "finger1 == kernel.fg.getEntryBB() || finger2 == kernel.fg.getEntryBB()",
    file=file@entry=0x7ff9b65e2560 "/root/shared/sycl_workspace/igc_workspace/igc/visa/LoopAnalysis.cpp", line=line@entry=29,
    function=function@entry=0x7ff9b65e2518 "vISA::G4_BB* vISA::ImmDominator::InterSect(vISA::G4_BB*, int, int)") at assert.c:92
#3  0x00007ff9bd252252 in __GI___assert_fail (assertion=0x7ff9b65e25a8 "finger1 == kernel.fg.getEntryBB() || finger2 == kernel.fg.getEntryBB()",
    file=0x7ff9b65e2560 "/root/shared/sycl_workspace/igc_workspace/igc/visa/LoopAnalysis.cpp", line=29, function=0x7ff9b65e2518 "vISA::G4_BB* vISA::ImmDominator::InterSect(vISA::G4_BB*, int, int)") at assert.c:101
#4  0x00007ff9b33617cd in vISA::ImmDominator::InterSect (this=0x1668d140, bb=0x1e8c6da0, i=0, k=1) at /root/shared/sycl_workspace/igc_workspace/igc/visa/LoopAnalysis.cpp:29
#5  0x00007ff9b33622e3 in vISA::ImmDominator::runIDOM (this=0x1668d140) at /root/shared/sycl_workspace/igc_workspace/igc/visa/LoopAnalysis.cpp:169
#6  0x00007ff9b3362422 in vISA::ImmDominator::run (this=0x1668d140) at /root/shared/sycl_workspace/igc_workspace/igc/visa/LoopAnalysis.cpp:196
#7  0x00007ff9b3362821 in vISA::Analysis::recomputeIfStale (this=0x1668d140) at /root/shared/sycl_workspace/igc_workspace/igc/visa/LoopAnalysis.cpp:269
#8  0x00007ff9b3362453 in vISA::ImmDominator::dominates (this=0x1668d140, bb1=0x1e5e49f0, bb2=0x1e5e4cc0) at /root/shared/sycl_workspace/igc_workspace/igc/visa/LoopAnalysis.cpp:202
#9  0x00007ff9b32fca40 in vISA::G4_BB::dominates (this=0x1e5e49f0, other=0x1e5e4cc0) at /root/shared/sycl_workspace/igc_workspace/igc/visa/G4_BB.cpp:1660
#10 0x00007ff9b35d8a32 in vISA::SpillManager::checkDefUseDomRel (this=0x7ffcff07d170, dst=0x174899d0, defBB=0x1e5e49f0) at /root/shared/sycl_workspace/igc_workspace/igc/visa/SpillCode.cpp:465
#11 0x00007ff9b35d8db3 in vISA::SpillManager::<lambda(vISA::G4_BB*, vISA::G4_Operand*)>::operator()(vISA::G4_BB *, vISA::G4_Operand *) const (__closure=0x7ffcff07d038, bb=0x1e5e49f0, spilledRegion=0x174899d0)
    at /root/shared/sycl_workspace/igc_workspace/igc/visa/SpillCode.cpp:541
#12 0x00007ff9b35d8ec4 in vISA::SpillManager::<lambda(vISA::G4_BB*, vISA::G4_Operand*)>::operator()(vISA::G4_BB *, vISA::G4_Operand *) const (__closure=0x7ffcff07d020, bb=0x1e5e49f0, opnd=0x174899d0)
    at /root/shared/sycl_workspace/igc_workspace/igc/visa/SpillCode.cpp:566
#13 0x00007ff9b35d91bd in vISA::SpillManager::updateRMWNeeded (this=0x7ffcff07d170) at /root/shared/sycl_workspace/igc_workspace/igc/visa/SpillCode.cpp:598
#14 0x00007ff9b35d996d in vISA::SpillManager::insertSpillCode (this=0x7ffcff07d170) at /root/shared/sycl_workspace/igc_workspace/igc/visa/SpillCode.cpp:727
#15 0x00007ff9b352fa28 in vISA::GlobalRA::flagRegAlloc (this=0x7ffcff07e520) at /root/shared/sycl_workspace/igc_workspace/igc/visa/GraphColor.cpp:10019
#16 0x00007ff9b3530eba in vISA::GlobalRA::coloringRegAlloc (this=0x7ffcff07e520) at /root/shared/sycl_workspace/igc_workspace/igc/visa/GraphColor.cpp:10276
#17 0x00007ff9b33ecbc6 in regAlloc (builder=..., regPool=..., kernel=...) at /root/shared/sycl_workspace/igc_workspace/igc/visa/RegAlloc.cpp:3673
#18 0x00007ff9b337d318 in vISA::Optimizer::regAlloc (this=0x7ffcff07ebc0) at /root/shared/sycl_workspace/igc_workspace/igc/visa/Optimizer.cpp:169
#19 0x00007ff9b33825b8 in vISA::Optimizer::runPass (this=0x7ffcff07ebc0, Index=vISA::Optimizer::PI_regAlloc) at /root/shared/sycl_workspace/igc_workspace/igc/visa/Optimizer.cpp:1355
```

I also confirm that compiler binaries built on CentOS 7 and used on Ubuntu show no such issue.
This is very strange and I'll check internally, but it may take time. To keep progressing, I recommend just disabling binary generation for Linux on your CentOS 7 systems until we find a proper fix. JIT is working well, gets cached, and compilation is faster than on Windows (~10 min).

Member

Thanks for confirming my results, always good to know it's not just me :)

Thanks for confirming my results, always good to know it's not just me :)
Member

While building a reproducer for the graphics compiler team (the issue is narrowed down to the intersection kernels), I ran into a great workaround, which is to avoid using -ffast-math on CentOS 7. From my testing on Windows, the impact of not using -ffast-math is minimal (a few %).
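For illustration, the workaround amounts to dropping that one flag from the AoT kernel compile line; a hypothetical DPC++ invocation (file names and device target are placeholders):

```
# with -ffast-math: triggers the vISA assert with the CentOS 7-built toolchain
clang++ -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend "-device dg2" \
        -ffast-math kernel.cpp -o kernel
# without it: compiles fine, renders only a few % slower
clang++ -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend "-device dg2" \
        kernel.cpp -o kernel
```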

Member

Can confirm, I can successfully build a kernel on CentOS now.

Member

@xavierh I see you quietly switched to sycl 20220529.

If you want to switch versions, that's fine, but rather than updating the build instructions above, please update (and test before committing) [build_files/build_environment/cmake/versions.cmake](https://developer.blender.org/diffusion/B/browse/cycles_oneapi/build_files/build_environment/cmake/versions.cmake); this is what we'll be using to build the compilers. Be sure to keep an eye on the preferred versions of dpcpp's deps; they may also need to change when you change the dpcpp version.

Or just mention that you want a version change and I'll do the work (but ideally this shouldn't happen too often; this whole thing is kind of a time vampire).

Member

You didn't give me the time to get loud about it :) The current deps are good with this version of sycl as well; I'll just update the cmake file.


Added subscriber: @manchakkay


This issue was referenced by a02992f131


Added subscriber: @Yuro


Added subscriber: @pawalls

Author
Member

@LazyDodo @Sergey, it makes sense to update the SYCL compiler version we use, in order to get some fixes in JIT caching.
I think this version is worth picking up: https://github.com/intel/llvm/releases/tag/sycl-nightly%2F20220812
I have checked this version and it has worked fine for me on Linux and Windows.
Also, some positive news I think: you can still build it on your own, but at the same time you can just grab prebuilt files for both Linux and Windows from this page.

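For example, grabbing the Linux prebuilt from that release could look like this (the asset name is taken from the release page and may change between nightlies, so verify it first):

```
# download and unpack the prebuilt DPC++ compiler for Linux
wget https://github.com/intel/llvm/releases/download/sycl-nightly%2F20220812/dpcpp-compiler.tar.gz
tar -xzf dpcpp-compiler.tar.gz
```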

@Sirgienko, Got it. Will see what I can do.

Unfortunately, AoT on Windows is not addressed yet. Even after migrating to much faster hardware, it took more than an hour to compile the AoT kernel (I can't say the exact time since the SSH session got closed after 1 hour of "inactivity").

On Linux, on the same hardware, AoT oneAPI compilation takes 30 min, which is not ideal but manageable.

@Sirgienko, @xavierh Is there an update of Windows ocloc which brings its performance closer to Linux? :)

Author
Member

@Sergey, for Windows you can try this version to see if it works faster: https://registrationcenter-download.intel.com/akdlm/irc_nas/18809/ocloc_win_dgpu_101.1743.zip (this version is from the beginning of July).
But this version doesn't seem to support some options, so I will provide a better version later.


@Sirgienko Just gave it a whirl. It doesn't support the `--format` flag: https://builder.blender.org/admin/#/builders/30/builds/6404/steps/7/logs/stdio

Author
Member

@Sergey, yes, I know, and I am on this topic (and we can probably give you a proper GPU compiler version under NDA).
Meanwhile, these timings are not really expected; they should be about half that, I would say. Are you using virtual machines for this? If so, are you sure they are configured properly? Is Turbo Boost enabled, for example?

Author
Member

@Sergey, maybe it makes sense to update the description in the wiki (https://wiki.blender.org/wiki/Building_Blender/GPU_Binaries) now that AoT support has been added (it is mentioned there as not added yet)?


Good point. I gave it a pass to put in more recent information.

Brecht Van Lommel added this to the Render & Cycles project 2023-02-07 19:08:35 +01:00
Philipp Oeser removed the Interest: Render & Cycles label 2023-02-09 14:03:56 +01:00