Cycles: Use default CUDA context instead of creating a new one #117230

Stefan Werner · 2024-01-17T14:33:08+01:00

Stefan Werner commented

2024-01-17 14:33:08 +01:00

This allows for Cycles and OIDN to share the same context.

Stefan Werner added 1 commit 2024-01-17 14:33:19 +01:00

8451ecd1a3 Cycles: Use default CUDA context instead of creating a new one

This allows for Cycles and OIDN to share the same context.

Stefan Werner commented

2024-01-17 14:34:30 +01:00

Note: `CU_CTX_MAP_HOST` was deprecated in CUDA 11. Trying to set it on the default context throws an error.

Stefan Werner changed title from ~~Cycles: Use default CUDA context instead of creating a new one~~ to WIP: Cycles: Use default CUDA context instead of creating a new one

2024-01-18 15:32:59 +01:00

Stefan Werner requested review from Brecht Van Lommel 2024-01-18 15:33:20 +01:00

Stefan Werner requested review from Patrick Mours 2024-01-18 15:33:21 +01:00

Stefan Werner self-assigned this 2024-01-18 15:33:28 +01:00

Stefan Werner changed title from ~~WIP: Cycles: Use default CUDA context instead of creating a new one~~ to Cycles: Use default CUDA context instead of creating a new one

2024-01-18 15:33:59 +01:00

Brecht Van Lommel requested changes 2024-01-18 17:02:43 +01:00

Brecht Van Lommel left a comment

Thanks for working on this.

This fails for me with the following steps:

* Open classroom.blend
* Enable GPU rendering
* Split 3D viewport in two
* Enable rendered shading mode in both
* Error: `Failed to configure CUDA context (Primary context active)` in the second viewport

I got it with the default cube too, but it's not as reliable to reproduce.

I guess we need some global lock around `cuDevicePrimaryCtxRetain` and `cuCtxPopCurrent` in the constructor, and in `CUDAContextScope`?

And then see how that affects performance for 3D viewport + material preview renders, and multiple 3D viewports.

Stefan Werner added 2 commits 2024-01-23 09:35:03 +01:00

d97d7405ff Merge branch 'main' into cuda_default_ctx

3e92fdf965 Cycles: Configuring primary CUDA context only once.

Stefan Werner commented

2024-01-23 09:37:37 +01:00

I went for an approach without locks:

1) It first checks if the context is already active, and will only configure an inactive context.
2) In the rare case that two threads still try to configure the same context (race condition), it will catch the respective error and not fail.

This is under the assumption that `cu*` calls operating on the primary context are thread safe.
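The two steps above can be sketched without a GPU. The block below is a minimal illustration, not the code from this patch: `MockPrimaryContext`, `Result`, `acquire` and `count_failed_acquires` are hypothetical names, and the mock only imitates the behavior described in this thread (setting flags on an active primary context is rejected, and each call is serialized per device).

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the CUDA driver result codes named in this thread.
enum class Result { Success, PrimaryContextActive };

// Mock of one device's primary context (an assumption, not the CUDA driver).
// Every call takes a per-device mutex, and flags can only be set while the
// context is inactive.
class MockPrimaryContext {
 public:
  // Plays the role of cuDevicePrimaryCtxGetState.
  bool is_active() {
    std::lock_guard<std::mutex> lock(mutex_);
    return retain_count_ > 0;
  }

  // Plays the role of cuDevicePrimaryCtxSetFlags.
  Result set_flags(unsigned int flags) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (retain_count_ > 0) {
      return Result::PrimaryContextActive;
    }
    flags_ = flags;
    return Result::Success;
  }

  // Plays the role of cuDevicePrimaryCtxRetain.
  Result retain() {
    std::lock_guard<std::mutex> lock(mutex_);
    ++retain_count_;
    return Result::Success;
  }

  int retain_count() {
    std::lock_guard<std::mutex> lock(mutex_);
    return retain_count_;
  }

 private:
  std::mutex mutex_;
  unsigned int flags_ = 0;
  int retain_count_ = 0;
};

// Step 1: configure only an inactive context.
// Step 2: if another thread won the race, accept "already active" as benign.
bool acquire(MockPrimaryContext &ctx, unsigned int flags) {
  if (!ctx.is_active()) {
    const Result result = ctx.set_flags(flags);
    if (result != Result::Success && result != Result::PrimaryContextActive) {
      return false;
    }
  }
  return ctx.retain() == Result::Success;
}

// Starts `thread_count` threads that all acquire the same context and
// returns how many of them failed.
int count_failed_acquires(MockPrimaryContext &ctx, int thread_count) {
  assert(thread_count >= 0);
  std::vector<char> ok(thread_count, 0);
  std::vector<std::thread> threads;
  for (int i = 0; i < thread_count; ++i) {
    threads.emplace_back([&ctx, &ok, i] { ok[i] = acquire(ctx, 1u) ? 1 : 0; });
  }
  for (std::thread &t : threads) {
    t.join();
  }
  int failed = 0;
  for (char value : ok) {
    failed += (value == 0);
  }
  return failed;
}
```

Under these assumptions every thread ends up holding a reference to the same context, regardless of which one configured it. The trade-off is that a thread that loses the race silently keeps whatever flags the winner set.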

Stefan Werner requested review from Brecht Van Lommel 2024-01-23 09:41:51 +01:00

Brecht Van Lommel requested changes 2024-01-23 14:15:35 +01:00

Brecht Van Lommel left a comment

Seems stable in testing now. Just one minor thing.

intern/cycles/device/cuda/device_impl.cpp

						
@@ -106,2 +117,3 @@
   /* Create context. */
-  result = cuCtxCreate(&cuContext, ctx_flags, cuDevice);
+  result = cuDevicePrimaryCtxRetain(&cuContext, cuDevice);

Brecht Van Lommel commented

2024-01-23 14:00:53 +01:00

Technically speaking there is still a race condition here, in the unlikely event two CUDA devices are created at the same time.

So I would still suggest adding this:

```
{
  static thread_mutex primary_ctx_init_mutex;
  thread_scoped_lock lock(primary_ctx_init_mutex);

  int active = 0;
  ...
  result = cuDevicePrimaryCtxRetain(&cuContext, cuDevice);
}
```

Patrick Mours commented

2024-01-23 15:05:37 +01:00

`cuDevicePrimaryCtxRetain` and `cuDevicePrimaryCtxGetState` are thread-safe per device (the driver has a mutex per device which is held during the call). So it should still be safe, without needing another lock, even if two different CUDA devices were created and each fetched its primary context here at the same time, since the primary context is also per device.

Brecht Van Lommel commented

2024-01-23 15:11:11 +01:00

I wasn't clear, I meant a race condition when two Cycles `CUDADevice` instances are created at the same time, for the same GPU.

Imagine:

```
THREAD A: cuDevicePrimaryCtxGetState (active = 0)
THREAD B: cuDevicePrimaryCtxGetState (active = 0)
THREAD A: cuDevicePrimaryCtxSetFlags
THREAD A: cuDevicePrimaryCtxRetain
THREAD B: cuDevicePrimaryCtxSetFlags -> Failed to configure CUDA context (Primary context active)
```

Patrick Mours commented

2024-01-23 15:14:34 +01:00

But shouldn't that be handled by the second condition in `if (result != CUDA_SUCCESS && result != CUDA_ERROR_PRIMARY_CONTEXT_ACTIVE)`?
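To make the point concrete, the interleaving from the previous comment can be replayed on a single thread against a small mock. This is only a sketch under stated assumptions: `MockPrimaryContext`, `Result`, `is_fatal` and `replay_interleaving` are hypothetical names, and the mock merely imitates the rule that flags cannot be set on an active primary context.

```cpp
// Hypothetical stand-ins for the driver result codes named in this thread.
enum class Result { Success, PrimaryContextActive, OtherError };

// Minimal single-threaded mock of one device's primary context (an assumption,
// not the CUDA driver): flags can only be set while the context is inactive.
struct MockPrimaryContext {
  bool active = false;
  unsigned int flags = 0;

  bool get_state() const { return active; }

  Result set_flags(unsigned int new_flags) {
    if (active) {
      return Result::PrimaryContextActive;
    }
    flags = new_flags;
    return Result::Success;
  }

  Result retain() {
    active = true;
    return Result::Success;
  }
};

// Same shape as the condition quoted above: only unexpected errors are fatal.
bool is_fatal(Result result) {
  return result != Result::Success && result != Result::PrimaryContextActive;
}

struct ReplayOutcome {
  Result b_set_flags;  // What thread B's late set_flags call returned.
  bool b_failed;       // Whether thread B would report an error.
};

// Replays the interleaving step by step on a single thread.
ReplayOutcome replay_interleaving() {
  MockPrimaryContext ctx;
  const bool a_saw_active = ctx.get_state();  // THREAD A: active = 0
  const bool b_saw_active = ctx.get_state();  // THREAD B: active = 0

  if (!a_saw_active) {
    ctx.set_flags(1u);  // THREAD A configures the context.
  }
  ctx.retain();  // THREAD A activates the context.

  Result b_result = Result::Success;
  if (!b_saw_active) {
    b_result = ctx.set_flags(1u);  // THREAD B arrives too late.
  }
  const bool b_failed = is_fatal(b_result) || ctx.retain() != Result::Success;
  return {b_result, b_failed};
}
```

In this replay thread B's late `set_flags` call is rejected, but the tolerant check does not treat that as fatal, so B still goes on to retain the context.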

Brecht Van Lommel approved these changes 2024-01-23 15:15:57 +01:00

Brecht Van Lommel left a comment

Ah yes, I missed that.

Stefan Werner merged commit 4f58cffb4e into main

2024-01-23 15:31:55 +01:00

Stefan Werner referenced this issue from a commit

2024-01-23 15:31:55 +01:00

Cycles: Use default CUDA context instead of creating a new one

Stefan Werner deleted branch cuda_default_ctx

2024-01-23 15:31:57 +01:00

Jonas Dichelle referenced this issue from a commit

2024-02-08 17:29:04 +01:00

Cycles: Use default CUDA context instead of creating a new one
