Buildbot: Upgrade to the latest HIPRT Compiler on Windows #84

Closed
opened 2024-05-30 22:59:37 +02:00 by Bart van der Braak · 5 comments

Current Version:

  • v2.0.3a134c7 (May 2023): hiprtSdk-2.0.3a134c7.zip (https://gpuopen.com/download/hiprt/hiprtSdk-2.0.3a134c7.zip)

New Versions Available:

  • v2.2.0e68f54 (December 2023): hiprtSdk-2.2.0e68f54.zip (https://gpuopen.com/download/hiprt/hiprtSdk-2.2.0e68f54.zip)
  • v2.1.c202dac (November 2023): hiprtSdk-2.1.c202dac.zip (https://gpuopen.com/download/hiprt/hiprtSdk-2.1.c202dac.zip)
  • v2.1.6fc8ff0 (September 2023): hiprtSdk-2.1.6fc8ff0.zip (https://gpuopen.com/download/hiprt/hiprtSdk-2.1.6fc8ff0.zip)

Issue Summary:

We need to upgrade from hiprtsdk-2.0.3a134c7 to one of the newer versions listed above. The newer versions use a different package structure, which may require adjustments to our build process, specifically the find commands in CMake.

Background:

In a recent upgrade (#81), we added a new version of HIP (hip_sdk_5.7.32000) for Windows. However, testing this upgrade with GPU compilation produced errors that are potentially related to the outdated HIPRT version.

The following error was encountered during a pipeline run (build #5126, https://builder.blender.org/admin/#/builders/133/builds/5126/steps/4/logs/stdio):

FAILED: intern/cycles/kernel/kernel_rt_gfx.hipfb 
lld: error: linking module flags 'amdgpu_code_object_version': IDs have conflicting values in 'C:\Users\blender\AppData\Local\Temp\hiprt02000_amd_lib_win-gfx1010-b25407.bc' and 'ld-temp.o'
clang++: error: amdgcn-link command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.

Action Items:

  1. [ ] Upgrade HIPRT to the latest version: Determine the most suitable version from the available options and proceed with the upgrade.
  2. [ ] Adjust CMake configuration: Modify the find commands in CMake to support the new package structure of the updated HIPRT version.
  3. [ ] Resolve compilation errors: Investigate and resolve the linking error encountered during GPU compilation, ensuring compatibility with hip_sdk_5.7.32000.
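When triaging a log like the one quoted in the issue, it can help to pull the conflicting module flag and the two inputs out of the lld message. The following is a minimal sketch, assuming the message keeps the exact wording seen in build #5126; the function name, the `FlagConflict` type, and the regular expression are illustrative and not part of any Buildbot or Blender tooling.

```python
import re
from typing import NamedTuple, Optional


class FlagConflict(NamedTuple):
    """Pieces of an lld 'conflicting module flags' error (hypothetical type)."""
    flag: str
    first_input: str
    second_input: str


# Mirrors the wording of the lld message quoted in the issue. Other lld
# versions may phrase the error differently (assumption).
_CONFLICT_RE = re.compile(
    r"lld: error: linking module flags '(?P<flag>[^']+)': "
    r"IDs have conflicting values in '(?P<first>[^']+)' and '(?P<second>[^']+)'"
)


def parse_flag_conflict(line: str) -> Optional[FlagConflict]:
    """Return the flag and both inputs, or None if the line does not match."""
    match = _CONFLICT_RE.search(line)
    if match is None:
        return None
    return FlagConflict(match["flag"], match["first"], match["second"])


# The error line from the issue, split across literals for readability.
log_line = (
    "lld: error: linking module flags 'amdgpu_code_object_version': "
    "IDs have conflicting values in "
    r"'C:\Users\blender\AppData\Local\Temp\hiprt02000_amd_lib_win-gfx1010-b25407.bc'"
    " and 'ld-temp.o'"
)
conflict = parse_flag_conflict(log_line)
```

Here `conflict.first_input` points at the temporary copy of the HIPRT bitcode, which is the input the issue suspects of being built against an older HIP toolchain.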
Bart van der Braak added the Service Buildbot label 2024-05-30 22:59:44 +02:00
Author
Owner

I don't know exactly how we got the current hiprtsdk-2.0.3a134c7.zip package; however, it seems to follow a different package and naming structure than the ones we can download officially from AMD. This makes updating trickier: either we repackage each new release to match our current layout, which is probably not as future proof, or we change our CMake find commands to be compatible with the new structure.

To illustrate, our current package (hiprtsdk-2.0.3a134c7.zip) has the following structure when unzipped (I excluded the parts of the structure that weren't important):

hiprtsdk-2.0.3a134c7
├── hiprt2.0.3a134c7
│   ├── dist
│   │   └── bin
│   │       └── Release
│   │           ├── embree3.dll
│   │           ├── hiprt0200064.dll
│   │           ├── hiprt0200064.lib
│   │           ├── hiprt02000_amd.hipfb
│   │           ├── hiprt02000_amd_lib_win.bc
│   │           ├── hiprt02000_nv.fatbin
│   │           ├── hiprt02000_nv_lib.fatbin
│   │           ├── oro_compiled_kernels.fatbin
│   │           ├── oro_compiled_kernels.hipfb
│   │           └── tbb12.dll
│   ├── hiprt
│   │   ├── hiprt_common.h
│   │   ├── hiprt_device.h
│   │   ├── hiprt.h
│   │   ├── hiprt_types.h
│   │   └── hiprt_vec.h

In build_files/cmake/Modules/FindHIPRT.cmake we try to find hiprt02000_amd_lib_win.bc (BITCODE).

This lookup needs to be adjusted for the new packages (whose archives are also named differently, for example hiprtSdk-2.2.0e68f54.zip), which look as follows:

hiprtSdk-2.2.0e68f54
├── hiprt
│   ├── buildID_linux.txt
│   ├── buildID_win.txt
│   ├── hiprt_common.h
│   ├── hiprt_device.h
│   ├── hiprtew.h
│   ├── hiprt.h
│   ├── hiprt_types.h
│   ├── hiprt_vec.h
│   ├── linux64
│   │   ├── hiprt02002_5.7_amd.hipfb
│   │   ├── hiprt02002_5.7_amd_lib_linux.bc
│   │   ├── hiprt02002_nv.fatbin
│   │   ├── hiprt02002_nv_lib.fatbin
│   │   ├── libhiprt0200264.so
│   │   ├── oro_compiled_kernels.fatbin
│   │   └── oro_compiled_kernels.hipfb
│   ├── README.md
│   └── win
│       ├── amd_comgr0507.dll
│       ├── hiprt02002_5.7_amd.hipfb
│       ├── hiprt02002_5.7_amd_lib_win.bc
│       ├── hiprt0200264.dll
│       ├── hiprt0200264.lib
│       ├── hiprt02002_nv.fatbin
│       ├── hiprt02002_nv_lib.fatbin
│       ├── hiprtc0507.dll
│       ├── hiprtc-builtins0507.dll
│       ├── oro_compiled_kernels.fatbin
│       └── oro_compiled_kernels.hipfb

As you can see, we go from trying to find:

  • hiprtsdk-2.0.3a134c7/hiprt2.0.3a134c7/dist/bin/Release/hiprt02000_amd_lib_win.bc
    to
  • hiprtSdk-2.2.0e68f54/hiprt/win/hiprt02002_5.7_amd_lib_win.bc

The name of that file now also includes a reference to the HIP version (5.7).
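The two layouts can be handled by one lookup that tries both relative locations and accepts the optional HIP version in the file name. The following is a minimal sketch in Python rather than CMake, assuming only the two paths shown in the listings; `find_windows_bitcode`, `parse_bitcode_name`, and the glob patterns are hypothetical names for illustration and do not reflect what FindHIPRT.cmake actually does.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional, Tuple

# Locations of the Windows bitcode relative to the unzipped SDK root,
# taken from the two listings (old layout first, then new layout).
_CANDIDATE_GLOBS = (
    "hiprt*/dist/bin/Release/hiprt*_amd_lib_win.bc",
    "hiprt/win/hiprt*_amd_lib_win.bc",
)

# Five-digit version tag, then an optional "_<HIP major>.<HIP minor>" part.
_NAME_RE = re.compile(r"^hiprt(\d{5})(?:_(\d+\.\d+))?_amd_lib_win\.bc$")


def find_windows_bitcode(sdk_root: Path) -> Optional[Path]:
    """Return the first bitcode file found in either layout, else None."""
    for pattern in _CANDIDATE_GLOBS:
        matches = sorted(sdk_root.glob(pattern))
        if matches:
            return matches[0]
    return None


def parse_bitcode_name(name: str) -> Optional[Tuple[str, Optional[str]]]:
    """Split a bitcode file name into (version tag, HIP version or None)."""
    match = _NAME_RE.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Demo: recreate only the two bitcode paths in a scratch directory.
with tempfile.TemporaryDirectory() as tmp:
    old_root = Path(tmp) / "hiprtsdk-2.0.3a134c7"
    new_root = Path(tmp) / "hiprtSdk-2.2.0e68f54"
    old_bc = old_root / "hiprt2.0.3a134c7/dist/bin/Release/hiprt02000_amd_lib_win.bc"
    new_bc = new_root / "hiprt/win/hiprt02002_5.7_amd_lib_win.bc"
    for path in (old_bc, new_bc):
        path.parent.mkdir(parents=True)
        path.touch()
    found_old = find_windows_bitcode(old_root)
    found_new = find_windows_bitcode(new_root)
    old_info = parse_bitcode_name(found_old.name)
    new_info = parse_bitcode_name(found_new.name)
```

Trying the old pattern first keeps existing packages working unchanged, and the parsed HIP version could be compared against the installed HIP SDK before the file is used.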

Author
Owner

I am trying to resolve these issues within the following PR, currently only used for testing and debugging:
blender/blender#122393

Author
Owner

I got confirmation that new versions of HIPRT do not need to be deployed to the Buildbot workers.

Author
Owner

I have repackaged the archive AMD sent us to match the layout of our previous versions of HIP-RT, so that we don't have to alter our FIND_HIPRT in CMake.

It's currently deployed to one of our UATEST workers, where I'm running a test release based on PR #123306 (https://projects.blender.org/blender/blender/pulls/123306), a slightly modified version of your STX_Support branch:

  • https://builder-uatest.blender.org/admin/#/builders/64/builds/48

Once this build finishes without issues, I will continue deploying this version to the other workers.
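A repackaging step of this kind could look roughly like the sketch below. It is not the script that was actually used: it assumes the incoming archive follows the hiprt/ and hiprt/win/ layout listed earlier in this thread, copies files without renaming them, and uses a hypothetical function name and `version` parameter.

```python
import shutil
import tempfile
from pathlib import Path


def repackage_to_old_layout(new_sdk: Path, out_root: Path, version: str) -> Path:
    """Copy an SDK in the new layout into the older directory layout.

    new_sdk  : unzipped archive with headers in hiprt/ and binaries in hiprt/win/
    out_root : directory that receives hiprt<version>/...
    version  : suffix for the inner directory name (hypothetical parameter)
    """
    src = new_sdk / "hiprt"
    inner = out_root / f"hiprt{version}"
    release_dir = inner / "dist" / "bin" / "Release"
    header_dir = inner / "hiprt"
    release_dir.mkdir(parents=True, exist_ok=True)
    header_dir.mkdir(parents=True, exist_ok=True)

    # Headers sit directly under hiprt/ in the new layout.
    for header in src.glob("*.h"):
        shutil.copy2(header, header_dir / header.name)

    # Windows DLLs, import libraries, bitcode and fatbins sit under hiprt/win/.
    for binary in (src / "win").iterdir():
        if binary.is_file():
            shutil.copy2(binary, release_dir / binary.name)
    return inner


# Demo on a scratch directory with one header and one binary.
with tempfile.TemporaryDirectory() as tmp:
    new_sdk = Path(tmp) / "hiprtSdk-2.2.0e68f54"
    (new_sdk / "hiprt" / "win").mkdir(parents=True)
    (new_sdk / "hiprt" / "hiprt.h").touch()
    (new_sdk / "hiprt" / "win" / "hiprt0200264.dll").touch()
    inner = repackage_to_old_layout(new_sdk, Path(tmp) / "out", "2.2.0e68f54")
    copied = sorted(
        p.relative_to(inner).as_posix() for p in inner.rglob("*") if p.is_file()
    )
```

Because file names are left untouched, any lookup that expects an exact older name (such as hiprt02000_amd_lib_win.bc) would still need either a rename step or a matching change on the CMake side.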
Bart van der Braak self-assigned this 2024-06-17 14:12:30 +02:00
Author
Owner

hiprtsdk-2.0.e1ff193 was deployed to the latest buildbot workers.

Bart van der Braak added this to the DevOps Progress Board project 2024-07-16 13:00:10 +02:00
Bart van der Braak changed title from Upgrade to the latest HIPRT Compiler on Windows to Buildbot: Upgrade to the latest HIPRT Compiler on Windows 2024-07-17 15:16:05 +02:00