Commit Graph

1490 Commits

Author SHA1 Message Date
cd809b95d8 Cycles: Add AttributeDescriptor
Adds a descriptor for attributes that can easily be passed around and extended
to contain more data. Will be used for attributes on subdivision meshes.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2110
2016-08-05 23:49:21 -04:00
c5eb400b7c Cycles: Fix embarrassing typo
Spotted by Mai Lavelle, thanks!
2016-08-05 14:45:54 +02:00
61d7289023 Cycles: Correction to previous commit
The change didn't fix the difference in render results on CUDA as I'd hoped,
so reverting the change for GPU rendering for now.

Sorry for the noise.
2016-08-05 12:16:24 +02:00
470cc98945 Cycles: Fix/workaround for wrong/noise render results with GCC6 2016-08-05 11:56:20 +02:00
99b1c1018a Cycles: Recent SSS inline changes broke CPU tests
Very weird, but let's just fall back a bit for now.
2016-08-03 15:27:48 +02:00
960db4c961 Cycles: Revert recent inline changes for CUDA 8 and sm_50+
These changes actually lead to a 2x slowdown. It's getting a bit annoying
because those are the changes that make pre-Maxwell cards render at the
same speed.
2016-08-03 11:41:58 +02:00
41a4967b30 Fix T49003: Cycles volumes have wrong results after recent microdisp commits
The problem was that sd->prim can be -1 for volumes, which caused a check in
the subd code to access out of bounds.
2016-08-02 15:28:07 -04:00
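A minimal sketch of the kind of guard this fix describes, assuming a per-primitive lookup table. The names `prim_is_subd_patch` and `is_patch` are hypothetical and are not taken from the Cycles sources; the point is only that a primitive index of -1 must be rejected before it is used as an array index.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-primitive table; not the Cycles data layout.
// A negative index (such as -1, used here for "no primitive") or an index past
// the end of the table is reported as "not a patch" instead of being read.
bool prim_is_subd_patch(const std::vector<bool> &is_patch, int prim)
{
    if (prim < 0 || static_cast<std::size_t>(prim) >= is_patch.size()) {
        return false;
    }
    return is_patch[static_cast<std::size_t>(prim)];
}
```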
08ebd72851 Buildbot: Use annoying hybrid setup of two CUDA toolkits
This is until we solve the issues with toolkit 8.0.
2016-08-02 15:32:03 +02:00
500e0e9a3d Cycles: Some more inline policy tweaks for CUDA 8
Makes the toolkit take exactly the same decisions about what to inline,
but unfortunately it makes a barely visible difference on GTX-980.
2016-08-02 15:13:34 +02:00
6353ecb996 Cycles: Tweaks to support CUDA 8 toolkit
All the changes mainly give explicit hints on inlining functions,
so they match how inlining worked with the previous toolkit.

This makes kernels compiled by CUDA 8 render on average at the same speed
as previous kernels. Some scenes are somewhat faster, some of them are
somewhat slower. But the slowdown is within 1% so far.

On the positive side it allows us to enable newer generation cards on
buildbots (so GTX 10x0 will be officially supported soon).
2016-08-01 15:54:29 +02:00
710ab5be36 Cleanup: spelling, style 2016-07-31 17:41:05 +10:00
9b6ed3a42b Cycles: refactor kernel closure storage to use structs per closure type.
Reviewed By: dingto, sergey

Differential Revision: https://developer.blender.org/D2127
2016-07-31 02:34:43 +02:00
ea2ebf7a00 Cycles: constant folding for RGB/Vector Curves and Color Ramp.
These are complex nodes, and it's conceivable they may end up constant
in some circumstances within node groups, so folding support is useful.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2084
2016-07-31 02:18:23 +02:00
34a639bd0f Fix CUDA warning, due to extra ; at the line ending. 2016-07-30 21:37:20 +02:00
6dc72b3ce6 Cycles OpenCL: detect incorrect usage of SOA members in the split kernel. 2016-07-30 18:25:52 +02:00
c937a42c61 Fix Cycles OpenCL address space compile error with amdgpu-pro drivers on Linux. 2016-07-30 18:25:17 +02:00
c96ae81160 Cycles microdisplacement: ngons and attributes for subdivision meshes
This adds support for ngons and attributes on subdivision meshes. Ngons are
needed for proper attribute interpolation as well as correct Catmull-Clark
subdivision. Several changes are made to achieve this:

- new primitive `SubdFace` added to `Mesh`
- 3 more textures are used to store info on patches from subd meshes
- Blender export uses loop interface instead of tessface for subd meshes
- `Attribute` class is updated with a simplified way to pass primitive counts
  around and to support ngons.
- extra points for ngons are generated for O(1) attribute interpolation
- curves are temporarily disabled on subd meshes to avoid various bugs with
  the implementation
- old unneeded code is removed from `subd/`
- various fixes and improvements

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2108
2016-07-29 03:36:30 -04:00
e0c7aaf5ad Fix Cycles OSL hair BSDF inconsistencies with SVM. 2016-07-29 03:29:05 +02:00
d834759423 Cycles: Fix difference in Ashikhmin Shirley shader between CPU and GPU
The issue was caused by some NaN appearing in calculations.

Visible with scifi_armor_concept.blend from the cloud.
2016-07-28 18:46:29 +02:00
f31f740bd0 Cycles: Proper fix for buffer overflow in volume intersect all 2016-07-26 17:16:23 +02:00
7030794171 Cycles: Revert previous fixes to intersect_all functions
While they prevent a legitimate write past the array boundary,
those fixes introduced a regression in behavior when there are exactly
max_hits transparent intersections and nothing else.

The previous code would have considered such a case totally opaque,
but that's not correct.

Fixes T48941: Some materials don't get transparent shadows anymore
2016-07-26 17:16:23 +02:00
d9cc3ea2c6 Cycles: Fix rays parallel to the surface in the triangle refine and MultiGGX code
In the triangle intersection refinement code, rays that are parallel to the triangle caused a divide by zero.
These rays might initially hit the triangle due to the watertight intersection test, but are very rare - therefore, just skipping the refinement for them works fine.

Also, a few remaining issues in the MultiGGX code are fixed that were caused by rays parallel to the surface (which happened more often there due to smooth shading).
2016-07-25 16:14:25 +02:00
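The divide-by-zero described above can be illustrated with a simplified ray/plane refinement. This is a sketch under stated assumptions, not the Cycles refinement code: `refine_hit_distance` and `dot3` are hypothetical names, and the triangle is reduced to the plane through `v0` with normal `N`. When the ray direction is parallel to that plane, the denominator is zero, so the refinement is skipped and the original distance is kept.

```cpp
#include <array>

using Vec3 = std::array<float, 3>;

static float dot3(const Vec3 &a, const Vec3 &b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Recomputes the hit distance of the ray P + t * D against the plane through
// v0 with normal N. For a ray parallel to the plane, dot(N, D) is zero, so the
// refinement is skipped and the incoming t is returned unchanged.
float refine_hit_distance(const Vec3 &P, const Vec3 &D, const Vec3 &v0, const Vec3 &N, float t)
{
    const float denom = dot3(N, D);
    if (denom == 0.0f) {
        return t;
    }
    const Vec3 to_plane = {v0[0] - P[0], v0[1] - P[1], v0[2] - P[2]};
    return dot3(N, to_plane) / denom;
}
```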
83ae0a0e06 Cycles: Calculate differentials in the Multiscattering GGX closures
The Multiscattering GGX closures didn't set the omega_i differentials, which could cause undefined behaviour.
2016-07-25 16:14:25 +02:00
e7721f5ec8 Cycles: Fix SSS with spatial splits and motion blur 2016-07-25 13:55:03 +02:00
f23fecf306 Fix use of uninitialized variable in recent SSS fix. 2016-07-24 16:40:30 +02:00
20ec6bc166 Fix Cycles kernel build without render passes support. 2016-07-18 22:40:08 +02:00
9946cca146 Fix T48860: Cycles SSS artifacts with spatially split BVH
The issue was caused by SSS intersection code gathering all
intersections without checking for duplicates. This caused
situations where the same intersection would be recorded twice
when a triangle is shared by several BVH nodes.

Usually this is handled by checking intersection distance
after sorting intersections (in shadow_blocked for example),
but for SSS we don't do such sorting and use the number of
intersections to calculate various things.

Didn't find anything smarter than to check intersection
distance in triangle_intersect_subsurface().

This solves render artifacts at the cost of a 1.5% slowdown
in an extreme case (SSS object filling the whole
Full HD screen).

Reviewers: brecht

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2105
2016-07-18 10:04:20 +02:00
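The distance check described in this commit could look roughly like the following sketch. The `Hit` struct and `record_unique_hit` are hypothetical; the actual check lives in triangle_intersect_subsurface() and its details are not shown in the log. The idea is that a triangle referenced from several BVH nodes produces hits with an identical distance, so a hit whose distance is already stored is dropped.

```cpp
#include <vector>

// Hypothetical hit record; the real Cycles intersection struct differs.
struct Hit {
    int prim;  // primitive index
    float t;   // distance along the ray
};

// Stores the hit unless a hit with the same distance was already recorded.
// Returns true when the hit was stored, false when it was a duplicate.
bool record_unique_hit(std::vector<Hit> &hits, const Hit &hit)
{
    for (const Hit &existing : hits) {
        if (existing.t == hit.t) {
            return false;
        }
    }
    hits.push_back(hit);
    return true;
}
```

The linear scan is acceptable here only because the number of recorded hits is assumed to be small.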
a2c82f5e5d Cycles: Fix OpenCL compilation after the recent numerical fixes 2016-07-17 19:24:53 +02:00
d9281a6332 Cycles: Fix three numerical issues in the fresnel, normal map and Beckmann code
- In fresnel_dielectric, the differentials calculation sometimes divided by zero.
- When the normal map was (0.5, 0.5, 0.5), the code would try to normalize a zero vector. Now, it just uses the regular normal as a fallback.
- The approximate error function used in Beckmann sampling sometimes overflowed to inf while calculating r^16. The final value is 1 - 1/r^16, however,
  so now it just returns 1 if the computation would overflow otherwise.
2016-07-16 20:54:14 +02:00
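The normal map case from the list above can be sketched as follows. This assumes the common encoding where a color channel c maps to 2c - 1, which is an assumption not stated in the commit; `decode_normal_map` is a hypothetical name. A color of (0.5, 0.5, 0.5) then decodes to the zero vector, which cannot be normalized, so the regular normal is returned instead.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<float, 3>;

// Decodes a normal map color (assumed encoding: channel c maps to 2c - 1) and
// normalizes it. A zero-length result falls back to the regular normal rather
// than dividing by zero.
Vec3 decode_normal_map(const Vec3 &color, const Vec3 &fallback_normal)
{
    const Vec3 n = {2.0f * color[0] - 1.0f,
                    2.0f * color[1] - 1.0f,
                    2.0f * color[2] - 1.0f};
    const float len = std::sqrt(n[0] * n[0] + n[1] * n[1] + n[2] * n[2]);
    if (len == 0.0f) {
        return fallback_normal;
    }
    return Vec3{n[0] / len, n[1] / len, n[2] / len};
}
```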
5ba78d76d4 Cycles: Deduplicate geometric factor calculation in the Beckmann distribution
Also, this fixes a numerical issue where A would be inf.
Since later G is set to 1 if A is larger than 1.6, the code now checks the reciprocal of A for being smaller than 1/1.6 - same effect, but no inf involved.
2016-07-16 20:54:14 +02:00
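A sketch of the reciprocal check described above. The function name `beckmann_G1` and the parameterization a = 1 / (alpha * tan(theta)) are assumptions for illustration, and the rational fit used for the a < 1.6 branch is the widely used approximation from Walter et al., not something quoted in the commit. Comparing 1/a against 1/1.6 gives the same branch as comparing a against 1.6, but a itself is only computed once it is known to be finite.

```cpp
// Geometric (shadowing) term for one direction, as a sketch.
// inv_a = alpha * tan_theta is the reciprocal of a and is zero at normal
// incidence, where a itself would be infinite.
float beckmann_G1(float alpha, float tan_theta)
{
    const float inv_a = alpha * tan_theta;
    if (inv_a < 1.0f / 1.6f) {
        return 1.0f;  // equivalent to a > 1.6, without ever forming a = inf
    }
    const float a = 1.0f / inv_a;  // finite here, since inv_a >= 0.625
    return (3.535f * a + 2.181f * a * a) / (1.0f + 2.276f * a + 2.577f * a * a);
}
```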
3637cbbcf8 Cycles: Fix wrong termination criteria in intersect_all functions
It was possible to miss the bounce termination criteria in these functions,
mainly when max_hits was set to 0.

Made the check more robust in traversal functions (which should not
affect performance, it's an operation of the same complexity AFAIK).

Also avoid doing ray-scene intersection from shadow_blocked when
the limit of transparent bounces was already reached.
2016-07-14 11:26:20 +02:00
c06d3b6c36 Cycles: Fix compilation error on Windows with OSL enabled
Seems there's some conflict around the `near` identifier in that configuration.
2016-07-11 18:15:51 +02:00
7602b6bf62 Cycles: Fix typo 2016-07-11 18:01:40 +02:00
ea32a03801 Fix T48824: Crash when having too many ray-to-volume intersections
Code might have been writing past the array boundaries.
2016-07-11 17:59:46 +02:00
b99f7a9b2a Cycles: Fix Extend image extension mode on OpenCL 2016-07-11 14:46:42 +02:00
cb3b19730c Cycles: Use utility define for restrict pointers
This way restrict can be used for CUDA and OpenCL as well.

From quick tests in the areas I've been testing, this might give some
barely measurable speedup, but it increases register pressure.

So use of this qualifier is still really limited.
2016-07-11 13:58:47 +02:00
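A utility define of this kind might be sketched as below. `MY_RESTRICT` and `scale_add` are hypothetical names, not the identifiers used in Cycles. The sketch only covers host compilers and CUDA, where `__restrict__` (GCC, Clang, nvcc) and `__restrict` (MSVC) are available; OpenCL C, being based on C99, would use the plain `restrict` keyword.

```cpp
// Hypothetical macro name; expands to nothing on compilers without a known
// restrict extension, so the code still builds everywhere.
#if defined(__CUDACC__) || defined(__GNUC__) || defined(__clang__)
#  define MY_RESTRICT __restrict__
#elif defined(_MSC_VER)
#  define MY_RESTRICT __restrict
#else
#  define MY_RESTRICT
#endif

// The qualifier promises the compiler that out and in never alias, which can
// allow it to keep values in registers across the loop.
void scale_add(float *MY_RESTRICT out, const float *MY_RESTRICT in, float scale, int n)
{
    for (int i = 0; i < n; i++) {
        out[i] += scale * in[i];
    }
}
```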
cf82b49a0f Cycles: Cleanup, variables name
Using camel case for variables is something that didn't come from our original
code, but rather from third party libraries. Let's avoid it as much as possible.
2016-07-11 13:58:47 +02:00
2ecbc3b777 Cycles: Add _all suffix to shadow traversal file
Better matches the naming of volume traversal files, where we've got
optimized versions of a single step of volume intersection and
a traversal which gathers all volume intersections.
2016-07-11 13:58:47 +02:00
4355603790 Cycles: Move BVH kernel files to own directory
BVH traversal is not really geometry code, and we've got
quite a few traversals now. Makes sense to keep them separate in
the name of source structure clarity.
2016-07-11 13:58:47 +02:00
a62967787c Fix T48808: Regression: Cycles OpenCL broken after Hair BVH commit 2016-07-08 09:41:36 +02:00
a08e2179f1 Cycles: Implement unaligned nodes BVH traversal
This commit implements traversal of unaligned BVH nodes.

QBVH traversal is fully SIMD optimized and calculates orientation
for all 4 children at a time; the regular BVH could probably be optimized
a bit more.
2016-07-07 17:25:48 +02:00
b03e66e75f Cycles: Implement unaligned nodes BVH builder
This is a special builder type which is allowed to orient nodes along the
strand direction, hence minimizing their surface area in comparison
with axis-aligned nodes. Such nodes are much more efficient for hair
rendering.

The implementation of the BVH builder is based on Embree, and the general idea
there is to calculate the axis-aligned SAH and the oriented SAH, and if the SAH
of the oriented node is smaller than the axis-aligned SAH we create an
unaligned node.

We store both aligned and unaligned nodes in the same tree (which
seems to be different from what Embree is doing) so we don't need
any extra calculations to set up a hair ray for BVH
traversal, hence avoiding any possible negative effect of this new
BVH node type.

This new builder is currently not in use; the BVH traversal code still
needs to be made aware of unaligned nodes.
2016-07-07 17:25:48 +02:00
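The aligned versus oriented decision can be illustrated with a heavily simplified sketch. A real SAH evaluation also weighs primitive counts and split candidates; here only the surface areas of two candidate boxes are compared. `Extent`, `half_surface_area` and `prefer_unaligned` are hypothetical names, not taken from Cycles or Embree.

```cpp
// Side lengths of a box, measured in its own (possibly rotated) frame.
struct Extent {
    float x, y, z;
};

// Half of the box surface area; the factor 2 cancels when comparing boxes.
float half_surface_area(const Extent &e)
{
    return e.x * e.y + e.y * e.z + e.z * e.x;
}

// Picks the oriented box only when it is strictly smaller than the
// axis-aligned one, mirroring the "oriented SAH smaller than aligned SAH" rule.
bool prefer_unaligned(const Extent &aligned, const Extent &oriented)
{
    return half_surface_area(oriented) < half_surface_area(aligned);
}
```

For a thin strand running diagonally, the axis-aligned box is wide in two dimensions while the oriented box is long in one and thin in the other two, so the oriented box wins by a large margin.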
1a2012145d Cycles: Switch node address to absolute values in BVH tree
This seems to be a straightforward way to support heterogeneous nodes
in the same tree.

There is some penalty related to the 4 GB limit of the address space now,
but here's the thing:

Traversal code was already using ints to store the final offset, so
there can't really be regressions.

This is a required commit to make it possible to encode both aligned
and unaligned nodes in the same array. Also, in the future we can use
this to get rid of the __leaf_nodes array (which is a bit tricky to do
because of trickery in pack_instances()).
2016-07-07 17:25:48 +02:00
17e7454263 Cycles: Reduce memory usage by de-duplicating triangle storage
There are several internal changes for this:

The first idea is to make __tri_verts behave similarly to __tri_storage,
meaning the __tri_verts array now contains all vertices of all triangles
instead of just mesh vertices. This saves some lookups when reading
triangle coordinates in functions like triangle_normal().

To make this efficient, the global triangle offset needed to be stored
somewhere. So now __tri_vindex.w contains a global triangle index which
can be used to read triangle vertices.

Additionally, the order of vertices in that array is aligned with
primitives from the BVH. This is needed to keep the cache as coherent as
possible for BVH traversal. This requires some extra tricks to
fill in the array and deal with True Displacement, but that trickery
is required to prevent a noticeable slowdown.

The next idea was to use this __tri_verts instead of __tri_storage in
intersection code. Unfortunately, this is quite tricky to do without
noticeable speed loss. This loss is mainly caused by the extra lookup
needed to access vertex coordinates.

Fortunately, tricks here and there (i.e. some type changes to avoid
casts which do not really come for free) reduce those losses to
an acceptable level. So now they are within a couple of percent only.

On the positive side we've achieved:

- A few percent of memory savings with triangle-only scenes. The actual saving
  in this case is close to the size of all vertices.

  On more finely subdivided scenes this benefit might become more
  obvious.

- Huge memory savings in hairy scenes. For example, on koro.blend
  there is about a 20% memory saving. Similar figure for bunny.blend.

This memory saving was the main goal of this commit, to move forward
with Hair BVH, which requires more memory per BVH node. So while
this sounds exciting, this memory optimization will become invisible
once the upcoming Hair BVH work lands.

But again on the positive side, we can add an option to NOT use Hair
BVH and then we'll have same-ish render times as we've got currently
but will have this 20% memory benefit on hairy scenes.
2016-07-07 17:25:48 +02:00
1eacbf47e3 Cycles: Support visibility check for inner nodes of QBVH
It was initially unsupported because the initial idea of checking visibility
of all children was slowing scenes down a lot. Now the idea has changed
and we only perform a visibility check of the current node. This avoids a huge
slowdown (from tests here it seems to be within 1-2%, but more tests
would never hurt) and gives a nice speedup of ray traversal for complex
scenes which use ray visibility.

Here's timing of koro.blend:

                  Without visibility check         With visibility check
Original file           4min 20sec                      4min 23sec
Camera rays only        1min 43sec                        55sec

Unfortunately, this doesn't come for free and requires extra data in
BVH node, which increases memory usage of BVH nodes by 15%. This we
can solve with some future trickery of avoiding __tri_storage created
for curve segments.
2016-07-07 17:25:48 +02:00
39ae324918 Cycles: remove extended precision hacks, no longer needed with SSE2 requirement.
Differential Revision: https://developer.blender.org/D2079
2016-07-04 18:22:11 +02:00
8cc123a387 Fix T48783: OSL render errors after recent refactoring. 2016-07-03 13:08:21 +02:00
9f5621bb4a Cleanup: comment blocks 2016-07-02 10:08:33 +10:00
5c249fac9a Fix Cycles OpenCL not taking Extend and Clip extension types into account.
(See T48720).
2016-07-01 23:48:31 +02:00
23cc453975 Fix T48732: New GGX breaks OpenCL kernel
Make sure we don't perform any implicit address space conversion.

A bit annoying, but less intrusive approaches (like using a temporary private
variable in the .cl kernel) do not work correctly here.

Using the generic address space would help from the code side here, but would
be somewhat slower due to extra things happening, as far as I know.
2016-06-28 17:15:35 +05:00