blender-archive

Archived

Author	SHA1	Message	Date
Sergey Sharybin	3637cbbcf8	Cycles: Fix wrong termination criteria in intersect_all functions It was possible to miss bounces termination criteria in this functions, mainly when max_hits was set to 0. Made the check more robust in traversal functions (which should not affect performance, it's an operation of same complexity AFAIK). Also avoid doing ray-scene intersection from shadow_blocked when limit of transparent bounces was already reached.	2016-07-14 11:26:20 +02:00
Sergey Sharybin	c06d3b6c36	Cycles: Fix compilation error on Windows with OSL enabled Seems there's some conflict around `near` identifier in that configuration.	2016-07-11 18:15:51 +02:00
Sergey Sharybin	7602b6bf62	Cycles: Fix typo	2016-07-11 18:01:40 +02:00
Sergey Sharybin	ea32a03801	Fix T48824: Crash when having too many ray-to-volume intersections Code might have writing past the array boundaries.	2016-07-11 17:59:46 +02:00
Sergey Sharybin	b99f7a9b2a	Cycles: Fix Extend image extension mode on OpenCL	2016-07-11 14:46:42 +02:00
Sergey Sharybin	cb3b19730c	Cycles: Use utility define for restrict pointers This way restrict can be used for CUDA and OpenCL as well. From quick tests in areas i've been testing this it might give some barely measurable %% of speedup, but it increases registers pressure. So use of this qualifier is still really limited.	2016-07-11 13:58:47 +02:00
Sergey Sharybin	cf82b49a0f	Cycles: Cleanup, variables name Using camel case for variables is something what didn't came from our original code, but rather from third party libraries. Let's avoid those as much as possible.	2016-07-11 13:58:47 +02:00
Sergey Sharybin	2ecbc3b777	Cycles: Add _all suffix to shadow traversal file Matches better naming of volume traversal files, where we've got optimized versions of a single step of volume intersection and traversal which will gather all volume intersections.	2016-07-11 13:58:47 +02:00
Sergey Sharybin	4355603790	Cycles: Move BVK kernel files to own directory BVH traversal is not really that much a geometry and we've got quite some traversals now. Makes sense to keep them separate in the name of source structure clarity.	2016-07-11 13:58:47 +02:00
Sergey Sharybin	a62967787c	Fix T48808: Regression: Cycles OpenCL broken after Hair BVH commit	2016-07-08 09:41:36 +02:00
Sergey Sharybin	a08e2179f1	Cycles: Implement unaligned nodes BVH traversal This commit implements traversal of unaligned BVH nodes. QBVH traversal is fully SIMD optimized and calculates orientation for all 4 children at a time, regular BVH might probably be optimized a bit more.	2016-07-07 17:25:48 +02:00
Sergey Sharybin	b03e66e75f	Cycles: Implement unaligned nodes BVH builder This is a special builder type which is allowed to orient nodes to strands direction, hence minimizing their surface area in comparison with axis-aligned nodes. Such nodes are much more efficient for hair rendering. Implementation of BVH builder is based on Embree, and generally idea there is to calculate axis-aligned SAH and oriented SAH and if SAH of oriented node is smaller than axis-aligned SAH we create unaligned node. We store both aligned and unaligned nodes in the same tree (which seems to be different from what Embree is doing) so we don't have any any extra calculations needed to set up hair ray for BVH traversal, hence avoiding any possible negative effect of this new BVH nodes type. This new builder is currently not in use, still need to make BVH traversal code aware of unaligned nodes.	2016-07-07 17:25:48 +02:00
Sergey Sharybin	1a2012145d	Cycles: Switch node address to absolute values in BVH tree This seems to be straightforward way to support heterogeneous nodes in the same tree. There is some penalty related on 4gig limit of the address space now, but here's are the thing: Traversal code was already using ints to store final offset, so there can't be regressions really. This is a required commit to make it possible to encode both aligned and unaligned nodes in the same array. Also, in the future we can use this to get rid of __leaf_nodes array (which is a bit tricky to do since trickery in pack_instances().	2016-07-07 17:25:48 +02:00
Sergey Sharybin	17e7454263	Cycles: Reduce memory usage by de-duplicating triangle storage There are several internal changes for this: First idea is to make __tri_verts to behave similar to __tri_storage, meaning, __tri_verts array now contains all vertices of all triangles instead of just mesh vertices. This saves some lookup when reading triangle coordinates in functions like triangle_normal(). In order to make it efficient needed to store global triangle offset somewhere. So no __tri_vindex.w contains a global triangle index which can be used to read triangle vertices. Additionally, the order of vertices in that array is aligned with primitives from BVH. This is needed to keep cache as much coherent as possible for BVH traversal. This causes some extra tricks needed to fill the array in and deal with True Displacement but those trickery is fully required to prevent noticeable slowdown. Next idea was to use this __tri_verts instead of __tri_storage in intersection code. Unfortunately, this is quite tricky to do without noticeable speed loss. Mainly this loss is caused by extra lookup happening to access vertex coordinate. Fortunately, tricks here and there (i,e, some types changes to avoid casts which are not really coming for free) reduces those losses to an acceptable level. So now they are within couple of percent only, On a positive site we've achieved: - Few percent of memory save with triangle-only scenes. Actual save in this case is close to size of all vertices. On a more fine-subdivided scenes this benefit might become more obvious. - Huge memory save of hairy scenes. For example, on koro.blend there is about 20% memory save. Similar figure for bunny.blend. This memory save was the main goal of this commit to move forward with Hair BVH which required more memory per BVH node. So while this sounds exciting, this memory optimization will become invisible by upcoming Hair BVH work. But again on a positive side, we can add an option to NOT use Hair BVH and then we'll have same-ish render times as we've got currently but will have this 20% memory benefit on hairy scenes.	2016-07-07 17:25:48 +02:00
Sergey Sharybin	1eacbf47e3	Cycles: Support visibility check for inner nodes of QBVH It was initially unsupported because initial idea of checking visibility of all children was slowing scenes down a lot. Now the idea has changed and we only perform visibility check of current node. This avoids huge slowdown (from tests here it seems to be withing 1-2%, but more tests would never hurt) and gives nice speedup of ray traversal for complex scenes which utilized ray visibility. Here's timing of koro.blend: Without visibility check With visibility check Original file 4min 20sec 4min 23sec Camera rays only 1min 43 sec 55sec Unfortunately, this doesn't come for free and requires extra data in BVH node, which increases memory usage of BVH nodes by 15%. This we can solve with some future trickery of avoiding __tri_storage created for curve segments.	2016-07-07 17:25:48 +02:00
Brecht Van Lommel	39ae324918	Cycles: remove extended precision hacks, no longer needed with SSE2 requirement. Differential Revision: https://developer.blender.org/D2079	2016-07-04 18:22:11 +02:00
Brecht Van Lommel	8cc123a387	Fix T48783: OSL render errors after recent refactoring.	2016-07-03 13:08:21 +02:00
Campbell Barton	9f5621bb4a	Cleanup: comment blocks	2016-07-02 10:08:33 +10:00
Thomas Dinges	5c249fac9a	Fix Cycles OpenCL not taking Extend and Clip extension types into account. (See T48720).	2016-07-01 23:48:31 +02:00
Sergey Sharybin	23cc453975	Fix T48732: New GGX breaks OpenCL kernel Make sure we don't perform any implicit address space conversion. A bit annoying, but less intrusive approaches (like using temp private variable in .cl kernel) do not work correct here. Using generic address space will help from code side here, but will be somewhat slower due to extra things happening as far as i know.	2016-06-28 17:15:35 +05:00
Lukas Stockner	2a69b09b62	Fix T48732 v2: New GGX breaks OpenCL kernel As far as I can see, the second issue there was that the functions receive a pointer to a member variable of the ShaderData, which is stored in global memory. However, this means that the pointer points to global memory as well, therefore OpenCL requires the ccl_addr_space "keyword" in front of the pointer. With this commit, the OpenCL kernels build on Linux with the Intel CPU OpenCL runtime - however, they already did without the change and I don't have an AMD card, so I can't really test whether the AMD runtime is happy as well now.	2016-06-26 00:51:16 +02:00
Thomas Dinges	99088f8b55	Fix T48732, OpenCL compile failure after Multiscatter GGX commit. Use OpenCL "all" builtin type for conversion, according to OpenCL 1.1 spec 6.3e.	2016-06-25 11:14:06 +02:00
Lukas Stockner	23c276832b	Cycles: Add multi-scattering, energy-conserving GGX as an option to the Glossy, Anisotropic and Glass BSDFs This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model". Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until the ray leaves it again, which ensures perfect energy conservation. In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing roughness - is solved in a physically correct and efficient way. The downside of this model is that it has no (known) analytic expression for evalation. However, it can be evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the balance heuristic guarantee an unbiased result at the cost of slightly higher noise. Reviewers: dingto, #cycles, brecht Reviewed By: dingto, #cycles, brecht Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel Differential Revision: https://developer.blender.org/D2002	2016-06-23 22:57:26 +02:00
Lukas Stockner	028ba31903	Fix T48698: Rays from SSS act as diffuse for normal objects but have an undefined type for lamp objects The problem here was that there are five path types internally (diffuse, glossy, transmission, subsurface and volume scatter), but subsurface isn't exposed to the user. This caused some weird behaviour - if all four types are disabled on the lamp, Cycles doesn't even try sampling it, but if any type was active, the lamp would illuminate the cube since none of the options set subsurface to zero. In the future, it might be reasonable to add subsurface visibility as an option - but for now the weird and inconsistent behaviour can be fixed simply by setting both diffuse and subsurface to zero if the user disables diffuse visibility.	2016-06-21 20:09:48 +02:00
Lukas Stockner	29ce3dfeb6	Fix T48691: Cycles - OpenCL - HDR Image mapping does not match CUDA rendering The OpenCL texture code didn't offset the coordinates by half a pixel like the CPU code does.	2016-06-21 00:49:25 +02:00
Lukas Stockner	dfa7ddd4a8	Cycles: Add svm_util_color.h file to CMake The file wasn't included in CMake and therefore not installed into the addon folder.	2016-06-20 19:07:22 +02:00
Alexander Gavrilov	f7bada00a7	Cycles: add constant folding for more color operation nodes. Invert, brightness & constrast, separate/combine and Mix RGB blend modes and clamping.	2016-06-19 20:17:28 +02:00
Alexander Gavrilov	98547e8817	Fix Cycles RGB and Vector Curves node Fac handling.	2016-06-19 20:17:27 +02:00
Brecht Van Lommel	e26eb9c93b	Cycles: reduce CUDA stack memory access for Maxwell and up, increasing max registers. For non-branched path tracing with a GTX 960 and CUDA 7.5, this gives a small reduction in stack usage but mainly: 8% faster render on BMW, 5% on pabellon, 13% on classroom.	2016-06-19 20:17:26 +02:00
Thomas Dinges	6311a9ff23	Cycles: Support half and half4 textures. This is an initial commit for half texture support in Cycles. It adds the basic infrastructure inside of the ImageManager and support for these textures on CPU. Supported: * Half Float OpenEXR images (can be used for e.g HDRs or Normalmaps) now use 1/2 the memory, when loaded via disk (OIIO). ToDo: Various things like support for inbuilt half textures, GPU... will come later, step by step. Part of my GSoC 2016.	2016-06-19 17:31:16 +02:00
Sergey Sharybin	7ac126e728	Fix T46492: GGX distribution produces black pixels The issue was caused by some numerical instability.	2016-06-17 16:30:29 +02:00
Thomas Dinges	a3a7e46318	Cleanup: Remove outdated comment, visibility layers in kernel have been removed.	2016-06-14 16:46:44 +02:00
Brecht Van Lommel	ebdd2e0b6d	Cycles: make shader node enums consistently lower case, update OSL shaders accordingly.	2016-06-11 23:50:11 +02:00
Lukas Stockner	654019fa01	Cycles: Fix two numerical issues in the volume code This hopefully fixes T48383 by avoiding two numerical problems that I found in the volume code. Reviewers: sergey, dingto, brecht Reviewed By: sergey, dingto, brecht Maniphest Tasks: T48383 Differential Revision: https://developer.blender.org/D2051	2016-06-08 03:17:19 +02:00
Sergey Sharybin	b595a692c8	Cycles: Limit degenerated triangle check got CUDA only OpenCL seems to work fine here, and for some reason that comparison was giving compilation error on OpenCL here. Better to compile OpenCL kernel than to be fully robust to weird corner cases.	2016-06-07 15:48:56 +02:00
Lukas Stockner	7a5a02509b	Cycles: Use faster ray-quad-intersection test The original quad intersection test works by just testing against the two triangles that define the quad. However, in this case it's actually faster to use the same test that's also used for portals: Determining the distance to the plane in which the quad lies, calculating the hitpoint and checking whether it's in the quad by projecting onto the sides. Reviewers: brecht, sergey, dingto Reviewed By: dingto Differential Revision: https://developer.blender.org/D2045	2016-06-06 23:38:50 +02:00
Sergey Sharybin	14f9a5aa1d	Fix T48571: Cycles/GPU - A lot of fireflies on SSS+Volume Was some accumulated precision error happening.	2016-06-06 15:56:22 +02:00
Martijn Berger	50f432b1e0	CMake, minor changes to make Visual studio 2015 use a compatible numpy and the standard cmake CUDA/NVCC arguments flag allowing 2015 build to use msvc 2013 for cuda	2016-06-04 11:42:48 +02:00
Sergey Sharybin	f54a98a1c5	Cycles: Simplify check for degenerated faces on GPU Still not sure how to properly solve the issue, needs some trickery to get actual optimized values from intersection function (using printf() avoids some optimization and makes stuff render correct). For the time being let's just simplify check.	2016-06-03 10:36:04 +02:00
Sergey Sharybin	d2bb0e660b	Fix T46207: Slow OpenCL GPU bake and blown out baking Cycles render	2016-05-31 17:48:42 +02:00
Sergey Sharybin	a2aa44370b	Fix T48556: Missing transparent shadows on AMD OpenCL We had transparent shadows disabled for some time because they were causing drivers to crash. Can't reproduce that issue anymore with current drivers, so will enable them and see how it goes.	2016-05-31 11:48:07 +02:00
Brecht Van Lommel	9bd2820aaf	Code refactor: add separate RGB to BW node and rename some sockets.	2016-05-29 20:30:16 +02:00
Thomas Dinges	2ee063868d	Cleanup: Shorten texture variables, tex and image was kinda redundant. Also make prefix consistent, so it starts with either TEX_NUM or TEX_START, followed by texture type and architecture.	2016-05-27 22:58:33 +02:00
Sergey Sharybin	7dae62cde0	Cycles: Simplify code around debug stats in BVH traversing	2016-05-27 10:55:48 +02:00
Brecht Van Lommel	f2ba13964d	Fix T48514: Cycles toon glossy BSDF not respecting reflective caustics option.	2016-05-25 21:13:24 +02:00
Brecht Van Lommel	b49185df99	Cycles CUDA: reduce branched path stack memory by sharing indirect ShaderData. Saves about 15% for the branched path kernel.	2016-05-25 21:13:24 +02:00
Sergey Sharybin	42b26206c6	Fix T48508: Cycles Regression / Crash	2016-05-24 14:53:34 +02:00
Brecht Van Lommel	999d5a6785	Cycles CUDA: reduce stack memory by reusing ShaderData. 57% less for path and 48% less for branched path.	2016-05-23 22:29:24 +02:00
Sergey Sharybin	2aa4b6045a	Cycles: Fix wrong closure counter in feature adaptive kernel Some closures were missing from calculation, leading to an array under-allocation, presumable causing memory corruption issues with emission shaders on OpenCL and was causing issues with Volume 3D textures with CUDA. The issue was identified by Thomas Dinges, the patch is different from the original D2006. See the brief discussion there. Current approach is similar (or the same) as Brecht suggested.	2016-05-23 14:09:27 +02:00
Brecht Van Lommel	ca03eddfcc	Cleanup: remove Cycles layer bits checking in the kernel. At some point the idea was that we could have an optimization where we could render multiple render layers without re-exporting the scene, by just updating the layer bits. We are not doing this now and in practice with the available render layer control like exclude layers it's not always possible anyway. This makes it easier to support an arbitrary number of layers in the future (hopefully this summer), and frees up some useful bits in the kernel. Reviewed By: sergey, dingto Differential Revision: https://developer.blender.org/D2020	2016-05-22 17:36:38 +02:00

... 9 10 11 12 13 ...

Download

What's New

Blender Studio

Manual

Developers Blog

Documentation

Benchmark

Blender Conference

Development Fund

One-time Donations

1960 Commits