Build: Upgrade DPC++ (5.2) and Embree (4.3.2) #122242

Merged
Xavier Hallade merged 3 commits from xavierh/blender:sycl_update into main 2024-06-04 18:26:25 +02:00
Member

https://github.com/intel/llvm now offers maintained branches corresponding to Intel oneAPI releases.
sycl-rel_5_2_0 corresponds to 2024.1.

Rather than pulling in new dependencies for boost unordered map, I've patched that part out to use the std one.

The older version of Embree doesn't compile out of the box with it, so I'm also upgrading Embree to its latest version, which is compatible.
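
The patch itself is not quoted in this thread, so the following is only a minimal sketch of the kind of substitution described above, assuming the affected code needs nothing beyond the interface the two containers share. The `compat` namespace and the `count_word` helper are hypothetical names used for illustration; they are not taken from DPC++.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical alias: code that previously named the Boost container can
// refer to compat::unordered_map, which resolves to the standard one.
namespace compat {
template <typename Key, typename Value>
using unordered_map = std::unordered_map<Key, Value>;
}  // namespace compat

// Hypothetical caller that only relies on operator[], which both
// containers provide with the same meaning.
inline int count_word(compat::unordered_map<std::string, int> &counts,
                      const std::string &word)
{
    // operator[] value-initializes a missing entry to 0 before the increment.
    return ++counts[word];
}
```

With this shape, no extra Boost component has to be fetched for the container; call sites that use Boost-only extensions would still need individual attention.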

Xavier Hallade added this to the 4.2 LTS milestone 2024-05-24 22:56:55 +02:00
Xavier Hallade force-pushed sycl_update from 2755bb8f74 to 23e0d01230 2024-05-24 23:58:56 +02:00
Xavier Hallade force-pushed sycl_update from 23e0d01230 to 6cbf2aec4a 2024-05-25 00:20:53 +02:00
Xavier Hallade force-pushed sycl_update from 6cbf2aec4a to 2df3be39ed 2024-05-25 10:29:12 +02:00
Xavier Hallade force-pushed sycl_update from 2df3be39ed to 91f2fbc113 2024-05-27 10:39:43 +02:00
Xavier Hallade force-pushed sycl_update from 91f2fbc113 to 570f53e59c 2024-05-27 11:43:41 +02:00
Xavier Hallade changed title from WIP: Build: Upgrade DPC++ compiler to 5.2 release to WIP: Build: Upgrade DPC++ (5.2) and Embree (4.3.2) 2024-05-27 12:44:00 +02:00
Xavier Hallade force-pushed sycl_update from b693f319a2 to c304e3e4ba 2024-05-27 13:52:32 +02:00
Xavier Hallade added this to the Platforms, Builds Tests & Devices project 2024-05-27 18:20:55 +02:00
Xavier Hallade added the Interest: Render & Cycles label 2024-05-27 18:21:05 +02:00
Xavier Hallade changed title from WIP: Build: Upgrade DPC++ (5.2) and Embree (4.3.2) to Build: Upgrade DPC++ (5.2) and Embree (4.3.2) 2024-05-28 08:44:13 +02:00
Xavier Hallade requested review from Ray molenkamp 2024-05-28 08:44:31 +02:00
Member

We're still building with VS 2019 (v16.9.16), and it hits some template errors when building DPC++.

Author
Member

I've validated on my end with VS 2019, but with 16.11.34. Is it possible for you to upgrade?

Member

Rats, I really hoped I wouldn't need to. I can bump to 16.11.26 without disrupting the CI infra; fingers crossed it's close enough.

Anthony Roberts requested changes 2024-06-03 20:03:21 +02:00
Dismissed
Anthony Roberts left a comment
Member

This breaks Windows ARM64 platforms - looks like the version of sse2neon used (1.6.0) is broken (missing https://github.com/DLTcollab/sse2neon/pull/588).

Any chance of embree moving to 1.7.0 (or newer, which has some other fixes)?

If not, I'll have to introduce a patch (that PR, basically) that fixes compilation.

Ray molenkamp requested review from Campbell Barton 2024-06-03 20:03:34 +02:00
Ray molenkamp requested review from Raul Fernandez Hernandez 2024-06-03 20:03:34 +02:00
Ray molenkamp requested review from Anthony Roberts 2024-06-03 20:03:35 +02:00
Ray molenkamp approved these changes 2024-06-03 20:04:10 +02:00
Ray molenkamp left a comment
Member

After upgrading MSVC, no issues. Hold off landing until the other platform devs have taken a look.


Attached replacement diff file if an sse2neon version upgrade is not feasible

Member

@Anthony-Roberts could we do something like this?

```
--- a/build_files/build_environment/cmake/embree_windows_arm.cmake
+++ b/build_files/build_environment/cmake/embree_windows_arm.cmake
@@ -77,7 +77,8 @@ set(EMBREE_PATCH_COMMAND
   COMMAND ${CMAKE_COMMAND} -E copy
     ${BUILD_DIR}/embree_Directory.Build.Props_temp
     ${BUILD_DIR}/embree/src/external_embree-build/Directory.Build.Props &&
-    ${PATCH_CMD} -p 1 -d ${BUILD_DIR}/embree/src/external_embree < ${PATCH_DIR}/embree.diff
+    ${PATCH_CMD} -p 1 -d ${BUILD_DIR}/embree/src/external_embree < ${PATCH_DIR}/embree.diff &&
+    ${CMAKE_COMMAND} -E copy  ${LIBDIR}/sse2neon/theheader.h  ${BUILD_DIR}/where/it/needs/to/be/theheader.h
 )

 # This all only works if we use the VS generator (with `clangcl` toolset), so switch back to that
```

Oh, also add sse2neon in the `add_dependencies` section.


Possibly, if the sse2neon header is vanilla, and included as-is and unchanged by embree (no idea if it is or not)

Raul Fernandez Hernandez approved these changes 2024-06-03 23:26:28 +02:00
Member

Dependencies successfully built and installed on macOS ARM.

Author
Member

@Anthony-Roberts having Embree upgrade sse2neon is doable, but certainly not on a quick enough timeline for bcon3. It's safer to use your new diff that backports the fix.
I have checked the differences between Embree's copy of this header and the vanilla 1.6.0 one. It's not much, but it's still something:

```
diff --git a/sse2neon_1.6.0.h b/sse2neon_1.6.0_embree.h
index 0db4805..b18d41e 100644
--- a/sse2neon_1.6.0.h
+++ b/sse2neon_1.6.0_embree.h
@@ -244,6 +244,14 @@ FORCE_INLINE void _sse2neon_smp_mb(void)
  * argument "a" of mm_shuffle_ps that will be places in fp1 of result.
  * fp0 is the same for fp0 of result.
  */
+#if defined(__aarch64__)
+#define _MN_SHUFFLE(fp3,fp2,fp1,fp0) ( (uint8x16_t){ (((fp3)*4)+0), (((fp3)*4)+1), (((fp3)*4)+2), (((fp3)*4)+3),  (((fp2)*4)+0), (((fp2)*4)+1), (((fp2)*4)+\
+2), (((fp2)*4)+3),  (((fp1)*4)+0), (((fp1)*4)+1), (((fp1)*4)+2), (((fp1)*4)+3),  (((fp0)*4)+0), (((fp0)*4)+1), (((fp0)*4)+2), (((fp0)*4)+3) } )
+#define _MF_SHUFFLE(fp3,fp2,fp1,fp0) ( (uint8x16_t){ (((fp3)*4)+0), (((fp3)*4)+1), (((fp3)*4)+2), (((fp3)*4)+3),  (((fp2)*4)+0), (((fp2)*4)+1), (((fp2)*4)+\
+2), (((fp2)*4)+3),  (((fp1)*4)+16+0), (((fp1)*4)+16+1), (((fp1)*4)+16+2), (((fp1)*4)+16+3),  (((fp0)*4)+16+0), (((fp0)*4)+16+1), (((fp0)*4)+16+2), (((fp0)*\
+4)+16+3) } )
+#endif
+
 #define _MM_SHUFFLE(fp3, fp2, fp1, fp0) \
     (((fp3) << 6) | ((fp2) << 4) | ((fp1) << 2) | ((fp0)))

@@ -2946,7 +2954,7 @@ FORCE_INLINE void _mm_stream_pi(__m64 *p, __m64 a)
 FORCE_INLINE void _mm_stream_ps(float *p, __m128 a)
 {
 #if __has_builtin(__builtin_nontemporal_store)
-    __builtin_nontemporal_store(a, (float32x4_t *) p);
+    __builtin_nontemporal_store(reinterpret_cast<float32x4_t>(a), (float32x4_t *) p);
 #else
     vst1q_f32(p, vreinterpretq_f32_m128(a));
 #endif
@@ -6200,7 +6208,7 @@ FORCE_INLINE void _mm_storeu_si32(void *p, __m128i a)
 FORCE_INLINE void _mm_stream_pd(double *p, __m128d a)
 {
 #if __has_builtin(__builtin_nontemporal_store)
-    __builtin_nontemporal_store(a, (float32x4_t *) p);
+    __builtin_nontemporal_store(reinterpret_cast<float32x4_t>(a), (float32x4_t *) p);
 #elif defined(__aarch64__)
     vst1q_f64(p, vreinterpretq_f64_m128d(a));
 #else
@@ -7549,14 +7557,16 @@ FORCE_INLINE __m64 _mm_sign_pi8(__m64 _a, __m64 _b)
 //                                      __constrange(0,255) int imm)
 #define _mm_blend_epi16(a, b, imm)                                            \
     __extension__({                                                           \
-        const uint16_t _mask[8] = {((imm) & (1 << 0)) ? (uint16_t) -1 : 0x0,  \
-                                   ((imm) & (1 << 1)) ? (uint16_t) -1 : 0x0,  \
-                                   ((imm) & (1 << 2)) ? (uint16_t) -1 : 0x0,  \
-                                   ((imm) & (1 << 3)) ? (uint16_t) -1 : 0x0,  \
-                                   ((imm) & (1 << 4)) ? (uint16_t) -1 : 0x0,  \
-                                   ((imm) & (1 << 5)) ? (uint16_t) -1 : 0x0,  \
-                                   ((imm) & (1 << 6)) ? (uint16_t) -1 : 0x0,  \
-                                   ((imm) & (1 << 7)) ? (uint16_t) -1 : 0x0}; \
+        const uint16_t ones = 0xffff;                                         \
+        const uint16_t zeros = 0x0000;                                        \
+        const uint16_t _mask[8] = {((imm) & (1 << 0)) ? ones : zeros,         \
+                                   ((imm) & (1 << 1)) ? ones : zeros,         \
+                                   ((imm) & (1 << 2)) ? ones : zeros,         \
+                                   ((imm) & (1 << 3)) ? ones : zeros,         \
+                                   ((imm) & (1 << 4)) ? ones : zeros,         \
+                                   ((imm) & (1 << 5)) ? ones : zeros,         \
+                                   ((imm) & (1 << 6)) ? ones : zeros,         \
+                                   ((imm) & (1 << 7)) ? ones : zeros};        \
         uint16x8_t _mask_vec = vld1q_u16(_mask);                              \
         uint16x8_t _a = vreinterpretq_u16_m128i(a);                           \
         uint16x8_t _b = vreinterpretq_u16_m128i(b);                           \
```
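
For anyone reading the `_mm_blend_epi16` hunk without the header at hand, here is a minimal scalar sketch of what the `_mask[8]` array encodes: lane i is all ones when bit i of `imm` is set, and zero otherwise, which is the same in both the old `(uint16_t) -1` spelling and the new `ones`/`zeros` one. The `Lanes` alias and the `blend_mask` and `blend_epi16_scalar` helpers are hypothetical names for illustration only. They use plain arrays instead of NEON vectors, and the select step follows the documented `_mm_blend_epi16` behavior (take `b` where the bit is set, `a` elsewhere) rather than anything quoted in the diff.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<std::uint16_t, 8>;

// Hypothetical helper mirroring the `_mask[8]` initializer in the hunk:
// lane i is 0xffff when bit i of imm is set, 0x0000 otherwise.
inline Lanes blend_mask(int imm)
{
    const std::uint16_t ones = 0xffff;
    const std::uint16_t zeros = 0x0000;
    Lanes mask{};
    for (std::size_t i = 0; i < mask.size(); ++i) {
        mask[i] = (imm & (1 << i)) ? ones : zeros;
    }
    return mask;
}

// Hypothetical scalar model of the blend: take the lane from b where the
// mask lane is all ones, and from a where it is zero.
inline Lanes blend_epi16_scalar(const Lanes &a, const Lanes &b, int imm)
{
    const Lanes mask = blend_mask(imm);
    Lanes result{};
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = static_cast<std::uint16_t>((b[i] & mask[i]) | (a[i] & ~mask[i]));
    }
    return result;
}
```

For example, calling `blend_epi16_scalar(a, b, 0x05)` takes lanes 0 and 2 from `b` and every other lane from `a`.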


@xavierh Okay, could you switch the diff file in the commit then, please?

Xavier Hallade added 1 commit 2024-06-04 10:47:16 +02:00
Author
Member
@Anthony-Roberts done!

Thanks! I have kicked off a deps build (full rebuild sadly, as I managed to break something) - I should know if this has resolved issues tomorrow, or maybe later this evening.

Campbell Barton approved these changes 2024-06-04 12:50:05 +02:00
Anthony Roberts approved these changes 2024-06-04 18:07:26 +02:00
Anthony Roberts left a comment
Member

With the diff change, this builds and tests pass on Windows ARM64

Xavier Hallade merged commit 8fa578dcc2 into main 2024-06-04 18:26:25 +02:00
Xavier Hallade deleted branch sycl_update 2024-06-04 18:26:28 +02:00
Author
Member

Gitea didn't link all the commits during rebase+merge; here is the list:

  1. d690b08c1f - DPC++ upgrade
  2. 8fa578dcc2 - Embree upgrade
  3. d8b3f852b9 - sse2neon Embree upgrade
Reference: blender/blender#122242