Commit Graph

157 Commits

Author SHA1 Message Date
ec54a08d30 Revert "Cycles: Tweak empty boundbox children"
This reverts commit ecbfa31caa.

Original commit broke logic in nodes re-fitting. That area can
access non-existing children momentarely. Not sure what would
be best solution here, for now simply reverting the change/
2016-09-15 09:39:33 +02:00
ecbfa31caa Cycles: Tweak empty boundbox children
The idea here is to make assert failure to fail sooner on an incorrect
node address rather than later with stack overflow.
2016-09-13 11:05:11 +02:00
52038fd8c7 Fix T49290: Specific .blend with hair crashes in MacOS 2.78 RC1 on render
The issue was caused by some false-positive empty non-AABB intersection.
Tried to tweak it a bit so it does not record intersection anymore.

Hopefully will work for all platforms. Tested here on iMac and Debian.
2016-09-13 10:59:48 +02:00
70e7c0829e Cycles: Deduplicate QBVH node packing across BVH build and refit 2016-09-09 11:32:05 +02:00
6de08f6cd1 Cycles: Fix regular BVH nodes refit
For proper indexing to work we need to use unaligned node with
identity transform instead of aligned nodes when doing refit.

To be backported to 2.78 release.
2016-09-08 15:08:35 +02:00
27e2317513 Cycles: Add asserts to BVH node packing 2016-09-08 15:03:55 +02:00
3598a3d1d5 Cycles: Cleanup: line wrapping 2016-09-08 14:26:10 +02:00
ca5271e637 Cycles: Fix wrong allocator used for spatial builder 2016-08-18 20:47:48 +02:00
ac061de20d Cycles: Fix refitting of regular BVH
Was causing CUDA issues on viewport edits.
2016-07-15 18:12:34 +02:00
b03e66e75f Cycles: Implement unaligned nodes BVH builder
This is a special builder type which is allowed to orient nodes to
strands direction, hence minimizing their surface area in comparison
with axis-aligned nodes. Such nodes are much more efficient for hair
rendering.

Implementation of BVH builder is based on Embree, and generally idea
there is to calculate axis-aligned SAH and oriented SAH and if SAH
of oriented node is smaller than axis-aligned SAH we create unaligned
node.

We store both aligned and unaligned nodes in the same tree (which
seems to be different from what Embree is doing) so we don't have
any any extra calculations needed to set up hair ray for BVH
traversal, hence avoiding any possible negative effect of this new
BVH nodes type.

This new builder is currently not in use, still need to make BVH
traversal code aware of unaligned nodes.
2016-07-07 17:25:48 +02:00
1a2012145d Cycles: Switch node address to absolute values in BVH tree
This seems to be straightforward way to support heterogeneous nodes
in the same tree.

There is some penalty related on 4gig limit of the address space now,
but here's are the thing:

Traversal code was already using ints to store final offset, so
there can't be regressions really.

This is a required commit to make it possible to encode both aligned
and unaligned nodes in the same array. Also, in the future we can use
this to get rid of __leaf_nodes array (which is a bit tricky to do since
trickery in pack_instances().
2016-07-07 17:25:48 +02:00
17e7454263 Cycles: Reduce memory usage by de-duplicating triangle storage
There are several internal changes for this:

First idea is to make __tri_verts to behave similar to __tri_storage,
meaning, __tri_verts array now contains all vertices of all triangles
instead of just mesh vertices. This saves some lookup when reading
triangle coordinates in functions like triangle_normal().

In order to make it efficient needed to store global triangle offset
somewhere. So no __tri_vindex.w contains a global triangle index which
can be used to read triangle vertices.

Additionally, the order of vertices in that array is aligned with
primitives from BVH. This is needed to keep cache as much coherent as
possible for BVH traversal. This causes some extra tricks needed to
fill the array in and deal with True Displacement but those trickery
is fully required to prevent noticeable slowdown.

Next idea was to use this __tri_verts instead of __tri_storage in
intersection code. Unfortunately, this is quite tricky to do without
noticeable speed loss. Mainly this loss is caused by extra lookup
happening to access vertex coordinate.

Fortunately, tricks here and there (i,e, some types changes to avoid
casts which are not really coming for free) reduces those losses to
an acceptable level. So now they are within couple of percent only,

On a positive site we've achieved:

- Few percent of memory save with triangle-only scenes. Actual save
  in this case is close to size of all vertices.

  On a more fine-subdivided scenes this benefit might become more
  obvious.

- Huge memory save of hairy scenes. For example, on koro.blend
  there is about 20% memory save. Similar figure for bunny.blend.

This memory save was the main goal of this commit to move forward
with Hair BVH which required more memory per BVH node. So while
this sounds exciting, this memory optimization will become invisible
by upcoming Hair BVH work.

But again on a positive side, we can add an option to NOT use Hair
BVH and then we'll have same-ish render times as we've got currently
but will have this 20% memory benefit on hairy scenes.
2016-07-07 17:25:48 +02:00
1eacbf47e3 Cycles: Support visibility check for inner nodes of QBVH
It was initially unsupported because initial idea of checking visibility
of all children was slowing scenes down a lot. Now the idea has changed
and we only perform visibility check of current node. This avoids huge
slowdown (from tests here it seems to be withing 1-2%, but more tests
would never hurt) and gives nice speedup of ray traversal for complex
scenes which utilized ray visibility.

Here's timing of koro.blend:

                  Without visibility check         With visibility check
Original file           4min 20sec                      4min 23sec
Camera rays only        1min 43 sec                       55sec

Unfortunately, this doesn't come for free and requires extra data in
BVH node, which increases memory usage of BVH nodes by 15%. This we
can solve with some future trickery of avoiding __tri_storage created
for curve segments.
2016-07-07 17:25:48 +02:00
39ae324918 Cycles: remove extended precision hacks, no longer needed with SSE2 requirement.
Differential Revision: https://developer.blender.org/D2079
2016-07-04 18:22:11 +02:00
6307e823ca Cycles: Fix crash after recent zero scale instance optimization 2016-06-08 12:25:35 +02:00
b6954c8da1 Cycles: Fix regression introduced in c96a4c8
A few places still needed to be updated to use the new Mesh::num_triangles()
method; wrong number from triangles.size() was causing crashes.
2016-06-07 07:38:09 -04:00
6046c03f5c Cycles: Ignore zero size instances in BVH
In certain types of animation it's possible to have some objects
scaling to zero. In this case we can save render times by avoid
traversing such instances.

Better to do ti ahead of a time, so traversal stays simple.

Reviewers: lukasstockner97, dingto, brecht

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2048
2016-06-06 09:23:53 +02:00
4388b29e98 Cycles: Add human readable sizes to debug output
Some of these values can get quite large and are hard to read, adding this
makes it easy to read them at a glance.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D2039
2016-05-31 06:13:54 -04:00
c96a4c8a2a Code refactor: modify mesh storage to use arrays rather than vectors, separate some arrays.
Differential Revision: https://developer.blender.org/D2016
2016-05-28 18:31:00 +02:00
ec51175f1f Code refactor: add generic Cycles node infrastructure.
Differential Revision: https://developer.blender.org/D2016
2016-05-22 17:29:24 +02:00
9dc5367c89 Cleanup code style inconsistency in last commits. 2016-05-17 23:41:45 +02:00
2b73402547 Fix C++11 build issues on OS X, remove references to outdated libs. 2016-05-17 21:39:16 +02:00
92774ff792 Cycles: Use explicit qualifier for single-argument constructors
Almost in all cases we want such constructors to be explicit, there are
exceptions but only in few places.
2016-05-11 16:51:14 +02:00
875df1e2b9 Cycles: Fix hair minimal size doesn't work on GPU and SSE2 only CPUs 2016-05-04 17:14:43 +02:00
6a7378f50f Cycles: Proper pack of leaves which are bigger than single float4 2016-04-25 18:57:37 +02:00
34f4c31692 Cycles: Move vector re-allocation out of loops 2016-04-25 12:25:30 +02:00
d7e4f920fd Cycles: Use threads to sort reference arrays when searching for split
This commits implements threaded sorting of references when looking for
object spatial split. It mainly useful when doing initial binning, which
happens from main thread.

Gives nice speedup of BVH build for Bunny.blend: 36sec vs. 55sec for
the Rabbit mesh BVH build.

On more complex scenes the speedup is probably minimal, but still nice
to have more instant rendering for simplier scenes.

Some further tests with production scenes would be interesting.

Reviewers: juicyfruit, dingto, lukasstockner97, brecht

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D1928
2016-04-20 15:21:44 +02:00
96dc96f0a5 Cycles: Avoid reference copy 2016-04-20 15:03:39 +02:00
3130f4167d Cycles: Optimization to spatial BVH build
Simple fix: release lock earlier.

Reduces spatial split build time from 96 to 53sec on the Bunny.blend
(using studio Intel for benchmark).

NOTE: Timing difference is not that spectacular when comparing numbers
with builds before memory optimization, but even then it's about 20%
faster build.
2016-04-15 14:35:42 +02:00
bbcb9c68c9 Cycles: Resolve ridiculous amount of memory used by spatial split builder
This was only visible on systems with lots of threads and root of the issue
was that we've been pre-allocating too much memory for all the threads.

Now we only pre-allocate data for the main thread and rest of the threads
does allocation on-demand.

This brings down memory usage from 36Gig to 6.9Gig when building spatial
split for the Bunny.blend file on our Intel beast.

Originally regression was happened by the threaded spacial split builder
commit.
2016-04-13 14:25:21 +02:00
e4cdda548a Cycles: Remove unused SAH from BVH pack 2016-04-11 17:18:14 +02:00
6cd13a221f Cycles: Rename tri_woop to tri_storage
It's no longer a pre-computed data and just a storage of triangle
coordinates which are faster to access to.
2016-04-11 17:18:14 +02:00
e10ec6ee9a Cycles: Avoid possibly uninitialized variable 2016-04-06 10:51:04 +02:00
9c952bbe85 Cleanup: Typo fixes after BVH commits. 2016-04-05 01:20:45 +02:00
a3d6552514 Cycles: Fix regular BVH not having proper visibility flags
This was caused by recent threading commit. Now because of all children
are set when they're ready need to explicitly update all parent's visibility.
2016-04-04 18:11:34 +02:00
bf55afbf26 Cycles: Make spatial split BVH multi-threaded
The title actually covers it all, This commit exploits all the work
being done in previous changes to make it possible to build spatial
splits in threads.

Works quite nicely, but has a downside of some extra memory usage.
In practice it doesn't seem to be a huge problem and that we can
always look into later if it becomes a real showstopper.

In practice it shows some nice speedup:

- BMW27 scene takes 3 now (used to be 4)
- Agent shot takes 5 sec (used to be 80)

Such non-linear speedup is most likely coming from much less amount
of heap re-allocations. A a downside, there's a bit of extra memory
used by BVH arrays. From the tests amount of extra memory is below
0.001% so far, so it's not that bad at all.

Reviewers: brecht, juicyfruit, dingto, lukasstockner97

Differential Revision: https://developer.blender.org/D1820
2016-04-04 14:43:21 +02:00
be2186ad62 Cycles: Solve possible issues with running out of stack memory allocator
Policy here is a bit more complicated, if tree becomes too deep we're
forced to create a leaf node and size of that leaf wouldn't be so well
predicted, which means it's quite tricky to use single stack array for
that.

Made it more official feature that StackAllocator will fall-back to
heap when running out of stack memory.

It's still much better than always using heap allocator.
2016-04-04 14:13:19 +02:00
ba7c2b7b73 Cycles: Log allocation slop factor for BVH arrays
Currently they're staying at 1 (actual size over capacity), but we
will be changing it quite soon in order to avoid having too much
memory re-allocation happening at a BVH build time and will be
playing with different policies for that.
2016-04-04 12:56:56 +02:00
61a8d12ccd Cycles: Tweak to stack allocator used by BVH builder
In some files stack memory was overruning the pre-allocated stack.

Perhaps we should fall-back to a hep-allocated stack so release builds
don't crash in works case but just becoming slower.
2016-04-04 12:23:23 +02:00
0f6f921898 Cycles: Temporarily revert index sort commit for spatial split
There are in fact some missing parts to it (Split BVH builder should
be creating bins from result of Object Split constructor).

Doable, but need to quickly fix issue for the studio here, easier to
revert for now.
2016-04-01 17:45:59 +02:00
6cc04b408c Cycles: Fix compilation on Win32 after bitscan commit
Need to revisit utility headers a bit more carefully and perhaps
move such utilities outside of simd-related headers.
2016-03-31 16:47:57 +02:00
791a0852e8 Cycles: Name cleanup and some comments in BVH code 2016-03-31 13:52:38 +02:00
63d017be90 Cycles: Avoid per-split memory allocation for the new references list 2016-03-31 10:06:21 +02:00
e69a0ab5fc Cycles: Pass BVH builder by const reference to spatial splitters 2016-03-31 10:06:21 +02:00
d9b729e342 Cycles: Only sort indices when finding a best dimension to split
This reduces amount of data being moved back and forth, which should
have positive effect on the performance.
2016-03-31 10:06:21 +02:00
bbbbe68473 Cycles: Wrap spatial split storage into own structure
This has following advantages:

- Localizes all the run-time storage into a single structure,
  which could easily be extended further.

- Storage could be created per-thread, so once builder is
  threaded we wouldn't have any conflicts between threads.

- Global nature of the storage avoids memory re-allocation
  on the runtime, keeping builder as fast as possible.

Currently it's just API changes, which don't affect user at all.
2016-03-31 10:06:21 +02:00
9c420e5e48 Cycles: Use stack storage for temporary data on leaf creation
Uses new StackAllocator from util_stack_allocator. Some tweaks to the stack
storage size are possible, read notes in the code about this.

At this point we might want to rename allocator files to util_allocator_foo.c,
so the stay nicely grouped in the folder.
2016-03-31 10:06:21 +02:00
65b375e798 Cycles: Move non-vectorized bitscan() to util
This way we can use bitscan() from both vectorized and non-vectorized
code, which applies to both kernel and host code.
2016-03-31 10:06:21 +02:00
0e47e0cc9e Cycles: Use dedicated BVH for subsurface ray casting
This commit makes it so casting subsurface rays will totally ignore all
the BVH nodes and primitives which do not belong to a current object,
making it much simpler traversal code and reduces number of intersection
tests.

Reviewers: brecht, juicyfruit, dingto, lukasstockner97

Differential Revision: https://developer.blender.org/D1823
2016-03-25 13:42:13 +01:00
833eb863eb Fix (harmless) assert in BVH spatial splits with Visual Studio debug builds. 2016-02-17 00:07:35 +01:00