Commit Graph

20 Commits

Author SHA1 Message Date
0456223cde Fix T87793: Cycles OptiX crash hiding objects in viewport render 2021-05-19 18:30:43 +02:00
3df90de6c2 Cycles: Add NanoVDB support for rendering volumes
NanoVDB is a platform-independent sparse volume data structure that makes it possible to
use OpenVDB volumes on the GPU. This patch uses it for volume rendering in Cycles,
replacing the previous usage of dense 3D textures.

Since it has a big impact on memory usage and performance and changes the OpenVDB
branch used for the rest of Blender as well, this is not enabled by default yet, which will
happen only after 2.82 was branched off. To enable it, build both dependencies and Blender
itself with the "WITH_NANOVDB" CMake option.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D8794
2020-10-05 15:03:30 +02:00
9f7d84b656 Cycles: Add support for P2P memory distribution (e.g. via NVLink)
This change modifies the multi-device implementation to support memory distribution
across devices, to reduce the overall memory footprint of large scenes and allow scenes to
fit entirely into combined GPU memory that previously had to fall back to host memory.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D7426
2020-06-08 17:55:49 +02:00
2d1cce8331 Cleanup: make format after SortedIncludes change 2020-03-19 09:33:58 +01:00
472534d16e Fix memory leak in recent Cycles image texture refactor 2020-03-12 20:30:49 +01:00
26bea849cf Cleanup: add device_texture for images, distinct from other global memory
There was too much image texture specific stuff in device_memory, and too
much code duplication between devices.
2020-03-12 17:28:55 +01:00
f01bc597a8 Cleanup: stop encoding image data type in slot index
This is legacy code from when we had a fixed number of textures.
2020-03-11 17:07:17 +01:00
baeb11826b Cycles: Add OptiX acceleration structure compaction
This adds compaction support for OptiX acceleration structures, which reduces the device memory footprint in a post step after building. Depending on the scene this can reduce the amount of used device memory quite a bit and even improve performance (smaller acceleration structure improves cache usage). It's only enabled for background renders to make acceleration structure builds fast in viewport.

Also fixes a bug in the memory management for OptiX acceleration structures: These were held in a dynamic vector of 'device_memory' instances and used the mem_alloc/mem_free functions. However, those keep track of memory instances in the 'cuda_mem_map' via pointers to 'device_memory' (which works fine everywhere else since those are never copied/moved). But in the case of the vector, it may decide to reallocate at some point, which invalidates those pointers and would result in some nasty accesses to invalid memory. So it is not actually safe to move a 'device_memory' object and therefore this removes the move operator overloads again.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D6369
2019-12-09 14:32:12 +01:00
Ha Hyung-jin
9a9e93e804 Fix T71071: errors when using multiple CUDA/Optix GPUs and host mapped memory
The multi device code did not correctly handle cases where some GPUs store a
resource in device memory and others store it in host mapped memory.

Differential Revision: https://developer.blender.org/D6126
2019-11-05 16:40:55 +01:00
c32377da98 Cycles: support move semantics for device_memory
Ref D5363
2019-08-26 17:42:52 +02:00
e12c08e8d1 ClangFormat: apply to source, most of intern
Apply clang format as proposed in T53211.

For details on usage and instructions for migrating branches
without conflicts, see:

https://wiki.blender.org/wiki/Tools/ClangFormat
2019-04-17 06:21:24 +02:00
1daa20ad9f Cleanup: strip trailing space for cycles 2018-07-06 10:17:58 +02:00
f1525cf534 Cycles Denoising: Correctly handle target buffer in tile unmapping and move device swap logic to the device_memory 2018-07-04 14:37:55 +02:00
0fe41009f0 Fix T53830: Cycles OpenCL debug assert on macOS,
This was probably harmless besides some unnecessary memory usage due to
aligning allocations too much.
2018-01-19 11:35:07 +01:00
c621832d3d Cycles: CUDA support for rendering scenes that don't fit on GPU.
In that case it can now fall back to CPU memory, at the cost of reduced
performance. For scenes that fit in GPU memory, this commit should not
cause any noticeable slowdowns.

We don't use all physical system RAM, since that can cause OS instability.
We leave at least half of system RAM or 4GB to other software, whichever
is smaller.

For image textures in host memory, performance was maybe 20-30% slower
in our tests (although this is highly hardware and scene dependent). Once
other type of data doesn't fit on the GPU, performance can be e.g. 10x
slower, and at that point it's probably better to just render on the CPU.

Differential Revision: https://developer.blender.org/D2056
2018-01-02 23:50:18 +01:00
6699454fb6 Cycles: make CUDA code a bit more robust to host/device alloc failures.
Fixes a few corner cases found while stress testing host mapped memory.
2018-01-02 23:46:19 +01:00
5801ef71e4 Code refactor: device memory cleanups, preparing for mapped host memory. 2017-11-05 15:22:04 +01:00
34fe3f9c06 Code refactor: remove MEM_WRITE_ONLY, always use MEM_READ_WRITE.
It's unlikely the driver can do useful optimizations with this, and if
we sum multiple samples we are reading from the memory anyway.
2017-10-24 23:53:09 +02:00
070a668d04 Code refactor: move more memory allocation logic into device API.
* Remove tex_* and pixels_* functions, replace by mem_*.
* Add MEM_TEXTURE and MEM_PIXELS as memory types recognized by devices.
* No longer create device_memory and call mem_* directly, always go
  through device_only_memory, device_vector and device_pixels.
2017-10-24 01:25:19 +02:00
7ad9333fad Code refactor: store device/interp/extension/type in each device_memory. 2017-10-24 01:03:59 +02:00