Cycles: Tweak scheduling of GPU kernel compilation #129945
Reference: blender/blender#129945
This change makes it so only kernels of the same vendor are compiled in
parallel. For example for the release builds it will be:
This potentially leads to a lower CPU utilization, but it makes it much
easier to manage memory usage and tweak per-vendor concurrency.
The goal of this change is to solve occasional out-of-memory during the
GPU kernels compilation step on the CI/CD farm.
This change also includes tweaks to the parallel jobs for HIP-RT and
oneAPI. The tweaks are based on measuring the apparent peak memory usage on
Linux when doing single-threaded compilation, and leaving a safe margin
relative to the memory available on the buildbot.
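To illustrate the scheduling idea described above (vendors compiled one after another, with parallelism only inside a vendor), here is a minimal Python sketch. It is not how the change is implemented, since the actual scheduling lives in the build system. The function name, the `compile_one` callback, and the job limits passed in are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def compile_in_stages(kernels_by_vendor, compile_one, vendor_jobs):
    """Compile vendors sequentially; run jobs in parallel only within a vendor.

    kernels_by_vendor: dict mapping a vendor name to a list of kernel names.
    compile_one: callable that compiles a single kernel (hypothetical stand-in).
    vendor_jobs: dict mapping a vendor name to its maximum number of parallel jobs.
    """
    results = {}
    for vendor, kernels in kernels_by_vendor.items():
        # Vendors without an explicit limit fall back to a single job.
        jobs = max(1, vendor_jobs.get(vendor, 1))
        # The pool is closed before the next vendor starts, so kernels of
        # different vendors never compile at the same time.
        with ThreadPoolExecutor(max_workers=jobs) as pool:
            results[vendor] = list(pool.map(compile_one, kernels))
    return results
```

Because only one vendor is active at a time, peak memory is bounded by that vendor's job count times its per-job peak. That is what makes per-vendor tuning tractable, at the cost of possibly idle CPU cores.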
@blender-bot package
Package build started. Download here when ready.
@Sergey, if we are now going to compile vendor kernels in stages, then it makes sense to adjust SYCL_OFFLINE_COMPILER_PARALLEL_JOBS in the same way as you have done for the number of HIP parallel jobs. In fact, I am a bit surprised that I do not see a change of the SYCL_OFFLINE_COMPILER_PARALLEL_JOBS value in buildbot/config - does it imply that all this time Blender was using 1 thread for GPU binary compilation of oneAPI kernels? And if so, why was that the case? I believe we had at least 2 threads enabled at some point, hadn't we?
@Sirgienko This is a bit of a confusing situation. The oneAPI kernels are still (historically) disabled for PR compilation. Enabling them is somewhere next in our TODO list w.r.t. CI/CD. That's why you wouldn't see certain things in the logs of this build.
We do set parallel jobs to 2, but we do it via the buildbot itself: https://projects.blender.org/infrastructure/blender-devops/src/branch/main/buildbot/worker/blender/compile.py#L229
It should be possible to set SYCL_OFFLINE_COMPILER_PARALLEL_JOBS from the code now that we have blender_{linux,windows,macos}. Initially I wanted to keep this part of the change for a separate PR.
The timing seems to be such that this PR compiles CUDA+OptiX+HIP in a similar time as the nightly builds compile CUDA+OptiX+HIP+oneAPI.
@Sirgienko Assuming we have 24-30 gig of RAM available for the build process, do you think we can raise SYCL_OFFLINE_COMPILER_PARALLEL_JOBS to something like 4? Or maybe even higher?
I'll also do some tests locally to see if the per-GPU-thread compilation still requires 6 gig: https://projects.blender.org/infrastructure/blender-devops/src/branch/main/buildbot/worker/blender/compile.py#L485 Maybe we can tweak that and run more compilation commands in parallel.
Well, it is up to you how to split this work into PRs then - I just noticed the increase in HIP parallelization but none for oneAPI, and I was wondering why.
Yes, I think this is a good starting point for that amount of memory. Then we could tweak it to 3 or 5, depending on the results of these first runs.
I've made some graphs of building GPU kernels on a single core to get a feeling for the time and memory usage.
CUDA:
OptiX:
HIP:
oneAPI:
We have about 26-28 GiB of available memory on the Windows workers (the rest of the 32 gig goes to the OS etc).
Currently the math works out so that compilation happens in 5 threads. We should be able to safely raise it to 8 (to have some margin?), or even 10.
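As a rough illustration of this kind of arithmetic, the sketch below divides a memory budget by a per-job peak and rounds down. The function and its parameters are hypothetical and not taken from the buildbot code. That a 32 gig total with the 6 gig per-job figure mentioned earlier in the thread is what yields the current 5 threads is an assumption about the inputs.

```python
def max_parallel_jobs(available_gib, per_job_gib, reserve_gib=0.0):
    """Return how many jobs fit into memory, never less than one.

    available_gib: memory budget for the build, in GiB.
    per_job_gib: measured peak memory of a single compilation job, in GiB.
    reserve_gib: safety margin kept free (hypothetical parameter).
    """
    if per_job_gib <= 0:
        raise ValueError("per_job_gib must be positive")
    usable = available_gib - reserve_gib
    # Floor division: a job that only partially fits does not count.
    return max(1, int(usable // per_job_gib))


# Assumed inputs: 32 total and 6 per job give 5 jobs;
# with only 26 actually available, the same per-job figure gives 4.
print(max_parallel_jobs(32, 6))
print(max_parallel_jobs(26, 6))
```

Raising the job count to 8 or 10 within the same budget would require the measured per-job peak to be correspondingly lower than 6 gig, which is what the single-core measurements are meant to establish.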
Interestingly enough, single-core compilation outside of a VM seems to take the same amount of time as it does in a VM on similar HW. Maybe this is something our Proxmox/IT team can investigate eventually.
As a step forward, I'd add the SYCL_OFFLINE_COMPILER_PARALLEL_JOBS setting in this PR and set it to 6 (just a safe-sounding number for now, we can tweak it later).