284 Commits

Author SHA1 Message Date
63c0653170 Merge branch 'master' into blender2.8 2018-11-29 23:54:30 +01:00
a8b8da5567 Fix T58183: crash with CPU + GPU rendering after profiling changes.
Multi-device was not passing along profiler to the CPU.
2018-11-29 23:43:27 +01:00
78a6689aea Merge branch 'master' into blender2.8 2018-11-09 14:34:33 +01:00
203de0bbf0 Cycles: Cleanup, space after (void)
It was used in like 95% of places.
2018-11-09 12:08:51 +01:00
2330cadb0f Cycles: Cleanup, don't use strict C prototypes
Those are more like a legacy of language, which is not
needed in C++.
2018-11-09 12:04:41 +01:00
cb4b5e12ab Cycles: Cleanup, spacing after preprocessor
There are supposed to be two spaces before the comment stating which if
the else/endif statement corresponds to. This was mainly violated in the
header guards.
2018-11-09 11:34:54 +01:00
fc12a736bb Merge branch 'master' into blender2.8 2018-10-31 11:49:04 +01:00
e0cc3e9809 Cycles: Fix wrong BVH used when disabling AVX2 in debug settings
Mainly useful for debugging. Previously, when AVX2 was disabled
in the debug panel but BVH layout was kept on BVH8 nothing was
rendered.

This required querying the supported BVH layout mask for devices
dynamically, so that it is possible to use DebugFlags there.
2018-10-31 11:46:52 +01:00
733e6c0b1d Merge branch 'master' into blender2.8 2018-10-09 08:46:00 +11:00
15e9d80375 Cycles: Use existing shared temporary memory in reconstruction step of the denoiser
Previously the code allocated its own temporary memory, but it's possible to just use the existing shared one instead.
2018-10-08 22:13:40 +02:00
a53c81c60b Merge branch 'master' into blender2.8 2018-09-19 18:42:17 +02:00
a5101e4da8 Cycles: Cleanup, double semicolon 2018-09-19 18:41:43 +02:00
871b7ba892 Merge branch 'master' into blender2.8 2018-08-28 19:15:08 +02:00
94efc651d4 Cycles Denoiser: Allocate a single temporary buffer for the entire denoising process
With small tiles, the repeated allocations on GPUs can actually slow down the denoising quite a lot.
Allocating the buffer just once reduces rendertime for the default cube with 16x16 tiles and denoising on a mobile 1050 from 22.7sec to 14.0sec.
2018-08-25 12:23:52 -07:00
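
The allocate-once idea from 94efc651d4 can be illustrated with a minimal host-side sketch. This is not the Cycles denoiser code: `ScratchBuffer`, `denoise_tiles` and the `floats_per_pixel` parameter are hypothetical names, and a `std::vector` stands in for device memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a device scratch allocation.
// It counts how often the underlying storage has to grow.
class ScratchBuffer {
 public:
  // Grow the storage only when the request exceeds the current size.
  float *reserve(std::size_t num_floats)
  {
    if (num_floats > storage_.size()) {
      storage_.resize(num_floats);
      ++allocations_;
    }
    return storage_.data();
  }

  int allocations() const { return allocations_; }

 private:
  std::vector<float> storage_;
  int allocations_ = 0;
};

// Process every tile with one shared scratch buffer that is sized once
// for the largest tile. Returns the number of allocations performed.
int denoise_tiles(const std::vector<std::size_t> &tile_pixel_counts,
                  std::size_t floats_per_pixel)
{
  assert(floats_per_pixel > 0);

  std::size_t largest = 0;
  for (std::size_t pixels : tile_pixel_counts) {
    largest = std::max(largest, pixels);
  }

  ScratchBuffer scratch;
  scratch.reserve(largest * floats_per_pixel);

  for (std::size_t pixels : tile_pixel_counts) {
    // No new allocation here: the buffer is already large enough.
    float *tmp = scratch.reserve(pixels * floats_per_pixel);
    for (std::size_t i = 0; i < pixels * floats_per_pixel; ++i) {
      tmp[i] = 0.0f; /* Placeholder for the real per-tile work. */
    }
  }
  return scratch.allocations();
}
```

With many small tiles the per-tile allocation cost disappears, which is the effect the commit message reports for 16x16 tiles.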
8dff538989 Merge branch 'master' into blender2.8 2018-07-05 22:46:04 +02:00
Stefan Werner
4d00e95ee3 Cycles: Adding native support for UINT16 textures.
Textures in 16 bit integer format are sometimes used for displacement, bump and normal maps and can be exported by tools like Substance Painter. Without this patch, Cycles would promote those textures to single precision floating point, causing them to take up twice as much memory as needed.

Reviewers: #cycles, brecht, sergey

Reviewed By: #cycles, brecht, sergey

Subscribers: sergey, dingto, #cycles

Tags: #cycles

Differential Revision: https://developer.blender.org/D3523
2018-07-05 13:53:34 +02:00
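
The memory claim in 4d00e95ee3 is simple arithmetic, sketched below. `texture_bytes` is a hypothetical helper written for illustration, not a Cycles function, and it assumes tightly packed pixels with no padding or mipmaps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes needed for a tightly packed texture whose
// channels are stored as type T.
template<typename T>
std::size_t texture_bytes(std::size_t width, std::size_t height, std::size_t channels)
{
  assert(channels > 0);
  return width * height * channels * sizeof(T);
}

// Example: a 4096x4096 RGBA map stored natively versus promoted to float.
inline std::size_t native_uint16_bytes()
{
  return texture_bytes<std::uint16_t>(4096, 4096, 4); /* 2 bytes per channel. */
}

inline std::size_t promoted_float_bytes()
{
  return texture_bytes<float>(4096, 4096, 4); /* 4 bytes per channel. */
}
```

On platforms where `float` is 4 bytes, the promoted copy takes exactly twice the memory of the native 16 bit one.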
49b86bcfec Merge branch 'master' into blender2.8 2018-07-05 07:54:47 +02:00
c960804747 Cycles Denoising: Pass tile buffers to every OpenCL kernel to conform to standard and get rid of set_tile_info 2018-07-04 14:38:03 +02:00
9db8bdbc65 Cycles Denoising: Cleanup: Rename tiles to tile_info 2018-07-04 14:37:24 +02:00
97a0d6fcc7 Cycles Denoising: Refactor denoiser tile handling
This deduplicates the calls for tile (un)mapping and allows to have a target buffer that is different from the source buffer (needed for baking and animation denoising).
2018-07-04 14:36:01 +02:00
b10c64bd2f Cycles Denoising: Split main function into logical steps 2018-07-04 14:35:05 +02:00
c054a1a848 Merge branch 'master' into blender2.8 2018-06-21 15:02:38 +02:00
a283333cd8 Fix Cycles CUDA render errors with CUDA 9.2.
Work around what might be a compiler bug.
2018-06-21 12:32:32 +02:00
c98b2e74df Merge branch 'master' into blender2.8
Conflicts:
	source/blender/editors/object/object_add.c
	source/blender/editors/object/object_relations.c
2018-06-12 12:38:54 +02:00
7bf4023689 Fix T55448: Typo in Cycles CUDA debug output
Reviewers: sergey, lukasstockner97

Reviewed By: lukasstockner97

Tags: #cycles, #bf_blender

Differential Revision: https://developer.blender.org/D3472
2018-06-12 10:45:32 +02:00
d09920687c Merge branch 'master' into blender2.8 2018-05-02 12:46:14 +02:00
16c05161e7 Cycles: Cleanup: Remove double semicolons 2018-04-29 09:28:41 +02:00
9ff8195535 OpenGL: Remove remaining instances of GL_RGBA16F_ARB.
There is no need for it now that we use opengl 3.3. Use GL_RGBA16F instead.
2018-04-24 12:48:43 +02:00
2bc952fdb6 Merge branch 'master' into blender2.8 2018-02-18 22:33:05 +11:00
fee4b646c4 Cycles: tweak CUDA messages and avoid build errors with existing sm_2x configs. 2018-02-18 00:53:25 +01:00
1dcd7db73d Code cleanup: remove some more unused code after recent CUDA changes. 2018-02-18 00:53:03 +01:00
9e717c0495 Cycles: Remove Fermi texture code.
This should be the last Fermi removal commit, unless I missed something.
It's been a pleasure Fermi!
2018-02-17 22:56:58 +01:00
2eaf90b305 Cycles: Remove Fermi support from CMake and update runtime checks in device_cuda.cpp.
Fermi code in Cycles kernel and texture system are coming next.
2018-02-17 16:15:07 +01:00
ade2aaba09 Merge branch 'master' into blender2.8 2018-02-07 17:17:24 +01:00
1dafe759ed Update CUEW to latest version
This brings separate initialization for libcuda and libnvrtc, which
fixes Cycles nvrtc compilation not working on build machines without
CUDA hardware available.

Differential Revision: https://developer.blender.org/D3045
2018-02-07 11:53:01 +01:00
eeb621566a Merge branch 'master' into blender2.8 2018-02-04 10:46:34 +11:00
a5052770b8 cycles: Add an nvrtc based cubin cli compiler.
nvcc is very picky regarding compiler versions, severely limiting the compilers we can use. This commit adds an nvrtc based compiler that'll allow us to build the cubins even if the host compiler is unsupported. For details see D2913.

Differential Revision: http://developer.blender.org/D2913
2018-02-03 10:59:09 -07:00
fc1fd2704a Merge branch 'master' into blender2.8 2018-01-23 11:45:39 +11:00
2f79d1c058 Cycles: Replace use_qbvh boolean flag with an enum-based property
This way we can introduce other types of BVH, for example, wider ones, without
causing too much mess around boolean flags.

Thoughts:

- Ideally device info should probably return bitflag of what BVH types it
  supports.

  It is possible to implement based on simple logic in device/ and mesh.cpp,
  rest of the changes will stay the same.

- Not happy with workarounds in util_debug and duplicated enum in kernel.
  Maybe the enum should be stored in kernel, but then it's kind of weird to include
  kernel types from utils. Sounds like a cyclic dependency.

Reviewers: brecht, maxim_d33

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D3011
2018-01-22 17:19:20 +01:00
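
The enum-plus-bitflag idea raised in 2f79d1c058 could look roughly like the sketch below. The names `BVHLayout`, `BVHLayoutMask` and `best_supported_layout` are assumptions made for illustration and are not claimed to match the identifiers in the Cycles source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout enum: one bit per BVH width, so a device can report
// the set of layouts it supports as a bitmask.
enum BVHLayout : std::uint32_t {
  BVH_LAYOUT_NONE = 0,
  BVH_LAYOUT_BVH2 = 1u << 0,
  BVH_LAYOUT_BVH4 = 1u << 1,
  BVH_LAYOUT_BVH8 = 1u << 2,
};

using BVHLayoutMask = std::uint32_t;

// Return the requested layout if the device supports it, otherwise the
// widest supported layout that is narrower than the request.
BVHLayout best_supported_layout(BVHLayout requested, BVHLayoutMask supported)
{
  /* The request must be a single layout, not a combination of bits. */
  assert((requested & (requested - 1u)) == 0u);

  for (BVHLayoutMask bit = requested; bit != 0u; bit >>= 1) {
    if (supported & bit) {
      return static_cast<BVHLayout>(bit);
    }
  }
  return BVH_LAYOUT_NONE;
}
```

Adding a wider layout then means adding one enumerator and one bit in the device mask, instead of another boolean flag.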
9c91c75ea6 Merge branch 'master' into blender2.8 2018-01-11 13:24:41 +11:00
d0892a6648 Fix issue with moving CUDA memory to host and multiple devices.
This is not expected to fix all issues. Also adds some more details
to error reporting to investigate failures.
2018-01-11 00:00:48 +01:00
be40389165 Merge branch 'master' into blender2.8 2018-01-03 23:44:47 +11:00
c621832d3d Cycles: CUDA support for rendering scenes that don't fit on GPU.
In that case it can now fall back to CPU memory, at the cost of reduced
performance. For scenes that fit in GPU memory, this commit should not
cause any noticeable slowdowns.

We don't use all physical system RAM, since that can cause OS instability.
We leave at least half of system RAM or 4GB to other software, whichever
is smaller.

For image textures in host memory, performance was maybe 20-30% slower
in our tests (although this is highly hardware and scene dependent). Once
other types of data don't fit on the GPU, performance can be e.g. 10x
slower, and at that point it's probably better to just render on the CPU.

Differential Revision: https://developer.blender.org/D2056
2018-01-02 23:50:18 +01:00
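
The reservation rule stated in c621832d3d (leave half of system RAM or 4GB to other software, whichever is smaller) can be written down as a short sketch. `host_mapped_limit` is a hypothetical function name, and how the system RAM size is obtained is left out on purpose.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: upper bound, in bytes, on the system RAM that may be
// used as a fallback for data that does not fit in GPU memory.
std::uint64_t host_mapped_limit(std::uint64_t system_ram_bytes)
{
  const std::uint64_t four_gb = 4ULL * 1024 * 1024 * 1024;

  /* Leave half of system RAM or 4GB to other software, whichever is smaller. */
  const std::uint64_t reserved = std::min(system_ram_bytes / 2, four_gb);
  assert(reserved <= system_ram_bytes);

  return system_ram_bytes - reserved;
}
```

Under this rule a machine with 16GB of RAM could map up to 12GB, while a machine with 4GB could map up to 2GB.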
6699454fb6 Cycles: make CUDA code a bit more robust to host/device alloc failures.
Fixes a few corner cases found while stress testing host mapped memory.
2018-01-02 23:46:19 +01:00
9f0d067c2e Merge branch 'master' into blender2.8 2017-12-21 11:17:34 +01:00
5650fe77e4 Cycles: Cleanup, indentation 2017-12-20 17:42:50 +01:00
03a5eccc94 Merge branch 'master' into blender2.8 2017-11-30 18:30:41 +11:00
fa3d50af95 Cycles: Improve denoising speed on GPUs with small tile sizes
Previously, the NLM kernels would be launched once per offset with one thread per pixel.
However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown.

Therefore, the kernels are now launched in a single call that handles all offsets at once.
This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory.
On the other hand, of course, the smaller tiles significantly reduce the size of the memory.

The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum.
I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere.

To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now.
Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.
2017-11-30 07:37:08 +01:00
d992240bfa Fix unneeded legacy OpenGL call in Cycles viewport drawing. 2017-11-24 00:12:48 +01:00
Julian Eisel
7f96323cd0 Merge branch 'master' into blender2.8 2017-11-19 13:16:14 +01:00