Commit Graph

7126 Commits

Author SHA1 Message Date
2069102c56 Cycles: Fix constness for load_kernels in device_cpu.cpp 2017-12-06 00:00:18 +01:00
d64d8b5be5 Fix Cycles standalone crash when saving output, after recent refactoring. 2017-12-02 05:45:09 +01:00
28d2148b09 Haiku OS Support
D2860 by @miqlas

Even though Haiku is a niche OS, only minor changes are needed.
2017-11-30 18:05:21 +11:00
fa3d50af95 Cycles: Improve denoising speed on GPUs with small tile sizes
Previously, the NLM kernels would be launched once per offset with one thread per pixel.
However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs, which resulted in a significant slowdown.

Therefore, the kernels are now launched in a single call that handles all offsets at once.
This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory.
On the other hand, of course, the smaller tiles significantly reduce the size of the memory.

The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum.
I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere.

To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, so it should be easier to understand what's going on now.
Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.
2017-11-30 07:37:08 +01:00
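The memory trade-off described in the entry above can be made concrete with a little arithmetic. This is only a sketch under loud assumptions: the function name, the square shift window of (2 * radius + 1)^2 offsets, and the single float of scratch space per pixel per shift are all hypothetical and are not taken from the Cycles source.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: shifts cover a square window of (2 * radius + 1)^2 offsets and
// every shift needs one float of scratch space per tile pixel.
// Returns the scratch size in bytes for one tile.
std::size_t nlm_scratch_bytes(int tile_w, int tile_h, int radius, bool all_shifts_at_once)
{
    assert(tile_w > 0 && tile_h > 0 && radius >= 0);
    const std::size_t pixels = static_cast<std::size_t>(tile_w) * static_cast<std::size_t>(tile_h);
    const std::size_t side = 2 * static_cast<std::size_t>(radius) + 1;
    // One launch per shift reuses a single scratch buffer; a single launch
    // covering every shift needs one buffer per shift.
    const std::size_t shifts = all_shifts_at_once ? side * side : 1;
    return pixels * shifts * sizeof(float);
}
```

With a hypothetical radius of 8, a 256x256 tile would need about 72 MiB when all shifts are allocated at once, while a 64x64 tile would need about 4.5 MiB, which is the sense in which smaller tiles offset the extra allocation.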
e4b54f44c1 Cycles: add object level holdout property.
This works the same as the holdout shader and Z mask layer. Combined with
overrides in 2.8 this is intended to replace the Z mask layer bits.
2017-11-29 18:11:40 +01:00
Maxym Dmytrychenko
7e349f2745 Cycles: improve triangle intersection performance.
Reduces render time by about 1-2% in benchmark scenes.

Differential Revision: https://developer.blender.org/D2911
2017-11-29 18:11:40 +01:00
Mathieu Menuet
83e80db56e Fix T53349: AO bounces not working correctly with OpenCL. 2017-11-26 15:53:00 +01:00
cf6e8edda5 atomic_ops: add atomic_cas_float helper. 2017-11-23 21:17:16 +01:00
ff9eab7926 atomic_ops: Copy/adapt static assert macro from BLI_utildefines, and use it.
Checking for type sizes is much nicer with a static assert!
2017-11-23 20:25:55 +01:00
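The two atomic_ops entries above (a float compare-and-swap helper and a static assert on type sizes) can be illustrated with standard C++. This is a minimal sketch, not the atomic_ops implementation: the names cas_float_sketch and add_float_sketch are hypothetical, and std::atomic stands in for whatever intrinsics the real code uses.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Compile-time size check: the bit-level trick below only works for 32-bit floats.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be exactly 32 bits");

// Reinterpret the bits of a float as an integer and back (no numeric conversion).
inline std::uint32_t float_as_bits(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return u;
}

inline float bits_as_float(std::uint32_t u)
{
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}

// Hypothetical helper: if the slot holds the bit pattern of `expected`, store
// `desired`. Returns the value the slot held before the call either way.
// Note that the comparison is on bit patterns, so -0.0f and 0.0f differ.
inline float cas_float_sketch(std::atomic<std::uint32_t>& slot, float expected, float desired)
{
    std::uint32_t bits = float_as_bits(expected);
    slot.compare_exchange_strong(bits, float_as_bits(desired));
    return bits_as_float(bits);
}

// Atomic float addition built from a compare-and-swap retry loop.
// Returns the new value.
inline float add_float_sketch(std::atomic<std::uint32_t>& slot, float delta)
{
    assert(delta == delta);  // reject NaN input
    std::uint32_t bits = slot.load();
    float sum;
    do {
        sum = bits_as_float(bits) + delta;
        // On failure, `bits` is refreshed with the current value and we retry.
    } while (!slot.compare_exchange_weak(bits, float_as_bits(sum)));
    return sum;
}
```

A retry loop of this kind is one common way to get atomic float accumulation where the platform only offers integer compare-and-swap.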
6be95f8778 Fix T53357: harmless assert after recent addition of render time pass. 2017-11-23 17:14:35 +01:00
e50ed90e4d Fix T53348: Cycles difference between gradient texture on CPU and GPU. 2017-11-23 17:14:04 +01:00
e704d8a616 Moar attempt to fix bloody MSVC intrinsic mess... 2017-11-23 16:58:20 +01:00
df06f1c816 Attempt to fix bloody MSVC atomic intrinsic mess... 2017-11-23 16:53:03 +01:00
580b34e52b atomic_ops: add char versions of uint8_t atomic primitives. 2017-11-23 16:24:34 +01:00
105b95835f atomic_ops: add signed versions of primitives.
Reason is mostly that dealing with type conversion in calling code is
not great, makes it less readable, and can generate hidden bugs in case
original type changes and atomic primitive calls are not updated
accordingly...
2017-11-23 16:24:33 +01:00
d77f1d6538 Fix T53313: bevel shader with transmission render artifacts. 2017-11-22 01:59:21 +01:00
Stefan Werner
58a15b2bfe Cycles: Fixed compilation of CUDA kernels. Follow-up fix for my last commit. 2017-11-21 10:43:40 +01:00
d8f80fbe72 Cycles: Fix OSL brick node after recent fix 2017-11-21 04:30:12 -05:00
Stefan Werner
1febc85855 Cycles: Workaround for performance loss with the CUDA 9.0 SDK.
CUDA 9.0.176 apparently caused some slowdown on high-end Pascal cards that can be mitigated by increasing the number of registers. See https://developer.blender.org/F1142667 for a detailed comparison.
2017-11-21 10:29:11 +01:00
9325b9bf15 Fix T53365: OpenCL has wrong shading of brick texture
Looks like some weird compiler difference with signed vs unsigned ints.
2017-11-21 00:42:55 -05:00
d089875c4c Fix build with OSL 1.9.x, automatically aligns to 16 bytes now. 2017-11-20 23:24:24 +01:00
51e2844387 Cycles: Fix wrong behavior of sharpness in Cubic SSS
It was giving different results for sharpness values of 1.0 and 0.999, even though
the results were expected to be very close to each other.

This SSS profile will probably be removed in the future in favor of the more
physically based Burley, but for the time being I don't see anything wrong with
fixing existing code.
2017-11-20 11:40:55 +01:00
119846a6bb Mikktspace: Speed up the merging of identical vertices
Previously, Mikktspace just bucketed the vertices based on one spatial coordinate and then ran full pairwise comparisons inside each bucket.
However, since models are three-dimensional, the bucketing has a massive false-positive rate, and since pairwise comparison is O(n^2), the merging process is very slow.

But, since we only care about exactly identical vertices, there is a much more efficient approach - we can just hash all values belonging to each vertex and form buckets based on the hash.
Since the hash has 32 bits and considers all values, false-positives are very unlikely - and since both hashing and the radixsort that's used for bucketing are O(n), both asymptotic and real-world performance (as well as code complexity) are significantly improved.
2017-11-17 18:34:53 +01:00
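The hash-based merging described in the entry above might look roughly like the following. This is a sketch under stated assumptions rather than the Mikktspace code: the Vertex layout, the FNV-1a hash, and the function names are hypothetical, and a std::unordered_map replaces the radix sort that the commit message says is used for bucketing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <vector>

// Hypothetical vertex layout: eight floats, no padding.
struct Vertex {
    float position[3];
    float normal[3];
    float uv[2];
};
static_assert(sizeof(Vertex) == 8 * sizeof(float), "Vertex must not contain padding");

// 32-bit FNV-1a over the raw bytes of every value in the vertex.
// Assumption: the real hash function may well be different.
inline std::uint32_t hash_vertex(const Vertex& v)
{
    unsigned char bytes[sizeof(Vertex)];
    std::memcpy(bytes, &v, sizeof(Vertex));
    std::uint32_t h = 2166136261u;
    for (unsigned char b : bytes) {
        h ^= b;
        h *= 16777619u;
    }
    return h;
}

// For each vertex, returns the index of the first bitwise-identical vertex.
// Full comparisons only happen inside a hash bucket, so they are rare.
std::vector<std::size_t> merge_identical_sketch(const std::vector<Vertex>& verts)
{
    std::vector<std::size_t> remap(verts.size());
    std::unordered_map<std::uint32_t, std::vector<std::size_t>> buckets;
    for (std::size_t i = 0; i < verts.size(); i++) {
        std::vector<std::size_t>& bucket = buckets[hash_vertex(verts[i])];
        remap[i] = i;
        for (std::size_t j : bucket) {
            assert(j < i);  // buckets only ever hold earlier, unique vertices
            if (std::memcmp(&verts[i], &verts[j], sizeof(Vertex)) == 0) {
                remap[i] = j;
                break;
            }
        }
        if (remap[i] == i) {
            bucket.push_back(i);
        }
    }
    return remap;
}
```

Because the comparison is bitwise, values such as 0.0f and -0.0f count as different here, which matches the "exactly identical" wording only under that interpretation.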
40f528a7da Cycles: Add per-tile render time debug pass
Reviewers: sergey, brecht

Differential Revision: https://developer.blender.org/D2920
2017-11-17 16:40:24 +01:00
a0c02e4d1b Cycles: Add Volume Direct and Volume Indirect passes for volume-scattered light
No color pass because it's hard to define what to use as color in a volume.

Reviewers: sergey, brecht

Differential Revision: https://developer.blender.org/D2903
2017-11-17 16:39:45 +01:00
f78e963858 Cycles: Refactor PassType from bitflag to index in order to allow for more passes 2017-11-17 16:34:19 +01:00
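To show why moving from bitflags to indices allows more passes, here is a small sketch. The enum values and the PassSet class are hypothetical and only illustrate the general idea; they are not the Cycles definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitflag style (sketch): one bit per pass, so a 32-bit mask caps out at 32 passes.
enum PassFlagSketch : std::uint32_t {
    PASS_FLAG_COMBINED = 1u << 0,
    PASS_FLAG_DEPTH = 1u << 1,
    PASS_FLAG_NORMAL = 1u << 2
};

// Index style (sketch): passes are plain consecutive integers, so the count is
// no longer tied to the width of one integer.
enum PassIndexSketch { PASS_COMBINED = 0, PASS_DEPTH, PASS_NORMAL, PASS_RENDER_TIME };

// A set of enabled passes, stored as as many 32-bit words as needed.
class PassSet {
 public:
    explicit PassSet(std::size_t num_passes) : words_((num_passes + 31) / 32, 0u) {}

    void enable(std::size_t pass)
    {
        assert(pass / 32 < words_.size());
        words_[pass / 32] |= 1u << (pass % 32);
    }

    bool enabled(std::size_t pass) const
    {
        assert(pass / 32 < words_.size());
        return ((words_[pass / 32] >> (pass % 32)) & 1u) != 0;
    }

 private:
    std::vector<std::uint32_t> words_;
};
```

With indices, adding a pass is just another enumerator, and any bookkeeping that still wants a mask can derive the word and bit from the index.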
470b4cb62f Cycles: Fix crash with split branched path tracing
ShaderData memory was getting clobbered in the branched path code paths.

Was caused by 087331c495
2017-11-16 04:59:31 -05:00
67ddc28055 Smoke: Pass non-trivial arguments by const reference 2017-11-14 17:11:48 +01:00
2868dcbe2b Fix compilation error with clang-5 2017-11-14 17:11:48 +01:00
212a8d9e5a Cycles: Make per-object random value output also work for Lamps 2017-11-14 04:17:54 +01:00
d8066fb0f1 Cycles: Refactor closure roughness detection to fix a potential bug with Denoising of specular shaders 2017-11-14 04:17:54 +01:00
d1a761c4d4 Cycles: Fix compilation error of standalone application 2017-11-13 10:49:05 +01:00
42dff6cc2e Cycles: Fix compilation error with OIIO compiled against system PugiXML 2017-11-13 10:42:29 +01:00
e568c1a975 Fix T53289: CUDA missing textures not showing pink, after recent changes. 2017-11-12 20:45:47 +01:00
e389ae9dca Cycles: Set error if a split kernel fails to load
To help catch cases where adding a new kernel is missed for one of the
device implementations.
2017-11-11 01:01:14 -05:00
db7a78a2be Cycles: Fix compilation error with latest OIIO
There were some changes to namespaces, which cause ambiguities.

Replaces using namespace with the explicit symbols we need. It is a good idea to NOT
pull in the whole namespace anyway!
2017-11-10 10:04:33 +01:00
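The kind of ambiguity mentioned in the entry above, and the fix of importing only the needed symbols, can be reproduced in a few lines. The namespaces lib_a and lib_b and every function here are hypothetical stand-ins, not OIIO symbols.

```cpp
#include <cassert>
#include <string>

// Two hypothetical libraries that happen to export the same names.
namespace lib_a {
inline std::string format(int value) { return "a:" + std::to_string(value); }
inline int version() { return 1; }
}  // namespace lib_a

namespace lib_b {
inline std::string format(int value) { return "b:" + std::to_string(value); }
inline int version() { return 2; }
}  // namespace lib_b

// With `using namespace lib_a; using namespace lib_b;` an unqualified call to
// format(...) or version() would be ambiguous and fail to compile.
// Using-declarations pull in only the symbols that are actually wanted.
using lib_a::format;
using lib_b::version;

inline std::string frame_label(int frame)
{
    assert(frame >= 0);
    return format(frame) + " v" + std::to_string(version());
}
```

Here frame_label(3) resolves format to lib_a and version to lib_b without any qualification at the call site.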
a466d7ae24 Cycles: better distance sampling for chromatic volume extinction.
Previously we picked one of the RGB channels with equal probability, but this
works poorly in a dense volume after many bounces. Now we take into account
the throughput and single scattering albedo.

This makes it a little more practical to do brute force SSS with volumes, but
is still very inefficient because we do direct light sampling at every volume
bounce even when inside an opaque mesh. In theory there could be a light inside
the mesh so we can't automatically disable direct lighting.
2017-11-10 01:37:10 +01:00
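One way to picture the change described above is to weight the choice of RGB channel by throughput times single scattering albedo instead of picking uniformly. The sketch below is an assumption-laden illustration: the function names, the exact weighting, and the uniform fallback are hypothetical, and the actual Cycles kernel may differ in its details.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Rgb = std::array<float, 3>;

// Channel selection probabilities proportional to throughput * albedo.
// Falls back to uniform probabilities when every weight is zero.
Rgb channel_pdf(const Rgb& throughput, const Rgb& albedo)
{
    Rgb w;
    float sum = 0.0f;
    for (int c = 0; c < 3; c++) {
        w[c] = throughput[c] * albedo[c];
        sum += w[c];
    }
    if (sum <= 0.0f) {
        return Rgb{1.0f / 3.0f, 1.0f / 3.0f, 1.0f / 3.0f};
    }
    for (int c = 0; c < 3; c++) {
        w[c] /= sum;
    }
    return w;
}

// Pick a channel from a uniform random number xi in [0, 1).
int pick_channel(const Rgb& pdf, float xi)
{
    float cdf = 0.0f;
    int last_valid = 0;
    for (int c = 0; c < 3; c++) {
        if (pdf[c] <= 0.0f) {
            continue;
        }
        cdf += pdf[c];
        last_valid = c;
        if (xi < cdf) {
            return c;
        }
    }
    return last_valid;  // guards against rounding when xi is close to 1
}

// Exponential distance sample for the extinction coefficient of one channel.
float sample_distance(float sigma_t, float xi)
{
    assert(sigma_t > 0.0f && xi >= 0.0f && xi < 1.0f);
    return -std::log(1.0f - xi) / sigma_t;
}

// Probability density of distance t, combined over all three channels.
float distance_pdf(const Rgb& pdf, const Rgb& sigma_t, float t)
{
    float p = 0.0f;
    for (int c = 0; c < 3; c++) {
        p += pdf[c] * sigma_t[c] * std::exp(-sigma_t[c] * t);
    }
    return p;
}
```

A channel whose throughput has already dropped to zero is then never chosen, which is the intuition behind the improvement in dense volumes after many bounces.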
21a535840d Fix T53270: crash with multiscatter GGX after recent refactoring.
In fact this was an existing issue when exceeding the number of available
closures, but it's more common now that we set the number to 0 for shadows
and emission.
2017-11-09 20:28:00 +01:00
1ffa01b6f8 Fix (harmless) valgrind warning. 2017-11-09 20:28:00 +01:00
bd4bea3e98 Cycles: avoid reallocating tile denoising memory many times during render. 2017-11-09 20:28:00 +01:00
Dalai Felinto
08a023d7ca Cycles: Silence warning when building without OSL 2017-11-09 08:39:30 -02:00
087331c495 Cycles: Replace __MAX_CLOSURE__ build option with runtime integrator variable
Goal is to reduce OpenCL kernel recompilations.

Currently viewport renders are still set to use 64 closures as this seems to
be faster and we don't want to cause a performance regression there. Needs
to be investigated.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2775
2017-11-09 01:04:06 -05:00
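The difference between a build-time closure limit and a runtime one can be sketched as below. Everything here is hypothetical (the struct, its members, and the allocation function); it only illustrates a limit that is a variable rather than a compile-time constant, together with the bounds check that such a limit calls for.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ClosureSketch {
    float weight;
};

// Hypothetical shader data whose closure capacity is chosen at runtime
// instead of being fixed by a build option.
struct ShaderDataSketch {
    std::vector<ClosureSketch> closures;  // storage sized once, up front
    std::size_t num_closure = 0;          // closures handed out so far
    std::size_t max_closure;              // runtime limit

    explicit ShaderDataSketch(std::size_t limit) : closures(limit), max_closure(limit) {}

    // Returns nullptr when out of slots instead of writing past the storage.
    // A limit of 0 (for example for shadow or emission evaluation) is valid.
    ClosureSketch* alloc(float weight)
    {
        assert(num_closure <= max_closure);
        if (num_closure >= max_closure) {
            return nullptr;
        }
        ClosureSketch* closure = &closures[num_closure++];
        closure->weight = weight;
        return closure;
    }
};
```

Callers have to handle the nullptr case, since a runtime limit can legitimately be small or zero.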
26f39e6359 Cycles: add bevel shader, for raytrace based rounded edges.
The algorithm averages normals from nearby surfaces. It uses the same
sampling strategy as BSSRDFs, casting rays along the normal and two
orthogonal axes, and combining the samples with MIS.

The main concern here is that we are introducing raytracing inside
shader evaluation, which could be quite bad for GPU performance and
stack memory usage. In practice it doesn't seem so bad though.

Note that using this feature can easily slow down renders 20%, and
that if you care about performance then it's better to use a bevel
modifier. Mainly this is useful for baking, and for cases where the
mesh topology makes it difficult for the bevel modifier to work well.

Differential Revision: https://developer.blender.org/D2803
2017-11-07 22:35:12 +01:00
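The bevel entry above mentions combining samples from three axes with MIS. The commit message does not say which heuristic is used, so the following only shows the balance heuristic, one common choice, with a hypothetical function name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Balance heuristic: the weight of strategy `i` is its pdf divided by the sum
// of the pdfs that all strategies assign to the same sample.
float balance_heuristic_sketch(const std::vector<float>& pdfs, std::size_t i)
{
    assert(i < pdfs.size());
    float sum = 0.0f;
    for (float p : pdfs) {
        sum += p;
    }
    return sum > 0.0f ? pdfs[i] / sum : 0.0f;
}
```

For a sample generated along one axis, pdfs would hold the density that each of the three axis strategies assigns to that same sample, and the weights over the strategies sum to one.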
f79f386731 Code refactor: rename subsurface to local traversal, for reuse. 2017-11-07 22:35:12 +01:00
d0af56fe3b Cycles: antialias normal baking if the mesh has a bump map. 2017-11-07 22:35:12 +01:00
ff34e48911 Cycles: add an extra CUDA synchronize before rendering.
It should not be needed as far as I know, but just in case it fixes any
of the recent issues like T52572.
2017-11-07 22:35:12 +01:00
e74b229342 Fix incorrect MIS weights in Cycles with multiple lights.
This causes some difference in the classroom scene, where ray visibility
tricks are used and break the MIS balance. Otherwise there doesn't seem
to be much effect, but better to use the right formulas. Problem originally
identified by Lukas.
2017-11-07 22:35:12 +01:00
1a1fb5a47c Cycles: Cleanup, style 2017-11-07 13:55:58 +01:00
8a72be7697 Cycles: reduce closure memory usage for emission/shadow shader data.
With a Titan Xp, reduces path trace local memory from 1092MB to 840MB.
Benchmark performance was within 1% with both RX 480 and Titan Xp.

Original patch was implemented by Sergey.

Differential Revision: https://developer.blender.org/D2249
2017-11-05 20:48:33 +01:00
c571be4e05 Code refactor: sum transparent and absorption weights outside closures. 2017-11-05 18:13:44 +01:00