blender-archive

Archived

Author	SHA1	Message	Date
Brecht Van Lommel	d1ef5146d7	Cycles: remove SIMD BVH optimizations, to be replaced by Embree Ref T73778 Depends on D8011 Maniphest Tasks: T73778 Differential Revision: https://developer.blender.org/D8012	2020-06-22 13:28:01 +02:00
Patrick Mours	bd2b1a67a7	Fix T74939: Random Walk subsurface appearance in OptiX does not match other engines Random Walk subsurface scattering did look different with OptiX because transmittance is calculated based on the hit distance, but the OptiX implementation of `scene_intersect_local` would return the distance in world space, while the Cycles BVH version returns it in object space. This fixes the problem by simply skipping the object->world transforms in all the places using the result of `scene_intersect_local` with OptiX. Reviewed By: brecht Differential Revision: https://developer.blender.org/D7232	2020-03-26 13:00:09 +01:00
Sergey Sharybin	9ffb87c629	Fix T66296: Black artefacts on materials with refraction on CPU The issue was in the optimization code path for opaque shadow rays which was wrongly considering all primitives in the node to have same visibility flags.	2019-07-05 15:48:50 +02:00
Campbell Barton	c47d669f24	Cleanup: comments (long lines) in cycles	2019-05-01 21:41:07 +10:00
Campbell Barton	e12c08e8d1	ClangFormat: apply to source, most of intern Apply clang format as proposed in T53211. For details on usage and instructions for migrating branches without conflicts, see: https://wiki.blender.org/wiki/Tools/ClangFormat	2019-04-17 06:21:24 +02:00
Campbell Barton	e742e0934d	Cleanup: trailing space	2018-11-25 08:01:14 +11:00
Sergey Sharybin	968bf0df14	Fix T57811: Render crashes in certain scenes when AO Bounces are used	2018-11-21 14:17:26 +01:00
Sergey Sharybin	6f48bfc7a8	Cycles: Cleanup, use utility function Replaces inlined platform-specific code.	2018-11-21 13:51:18 +01:00
Sergey Sharybin	65143542af	Cycles: Cleanup, reduce indentation level	2018-11-21 12:41:24 +01:00
Sergey Sharybin	700330afe8	Cycles: Cleanup, comments and dead code	2018-11-21 11:33:11 +01:00
Sergey Sharybin	65d01def80	Cycles: Cleanup, CUDA code path is not possible inside AVX2	2018-11-21 11:28:49 +01:00
Sergey Sharybin	cb4b5e12ab	Cycles: Cleanup, spacing after preprocessor It is supposed to be two spaces before comment stating which if else/endif statements corresponds to. Was mainly violated in the header guards.	2018-11-09 11:34:54 +01:00
Brecht Van Lommel	ddf8c49736	Fix Cycles CUDA build after recent changes.	2018-08-29 16:35:21 +02:00
Sergey Sharybin	73f2056052	Cycles: Add BVH8 and packeted triangle intersection This is an initial implementation of BVH8 optimization structure and packeted triangle intersection. The aim is to get faster ray to scene intersection checks. Timings per scene (BVH4 / BVH8): barbershop_interior: 10:24.94 / 10:10.74; bmw27: 02:41.25 / 02:38.83; classroom: 08:16.49 / 07:56.15; fishy_cat: 04:24.56 / 04:17.29; koro: 06:03.06 / 06:01.45; pavillon_barcelona: 09:21.26 / 09:02.98; victor: 23:39.65 / 22:53.71. As memory goes, peak usage rises by about 4.7% in complex scenes. Note that BVH8 is disabled when using OSL, this is because OSL kernel does not get per-microarchitecture optimizations and hence always considers BVH3 is used. Original BVH8 patch from Anton Gavrikov. Batched triangles intersection from Victoria Zhislina. Extra work and tests and fixes from Maxym Dmytrychenko.	2018-08-29 15:03:09 +02:00
Lukas Stockner	799779d432	Cycles: change Ambient Occlusion shader to output colors. This means the shader can now be used for procedural texturing. New settings on the node are Samples, Inside, Local Only and Distance. Original patch by Lukas with further changes by Brecht. Differential Revision: https://developer.blender.org/D3479	2018-06-15 22:16:06 +02:00
Brecht Van Lommel	0df9b2c715	Cycles: random walk subsurface scattering. It is basically brute force volume scattering within the mesh, but part of the SSS code for faster performance. The main difference with actual volume scattering is that we assume the boundaries are diffuse and that all lighting is coming through this boundary from outside the volume. This gives much more accurate results for thin features and low density. Some challenges remain however: * Significantly more noisy than BSSRDF. Adding Dwivedi sampling may help here, but it's unclear still how much it helps in real world cases. * Due to this being a volumetric method, geometry like eyes or mouth can darken the skin on the outside. We may be able to reduce this effect, or users can compensate for it by reducing the scattering radius in such areas. * Sharp corners are quite bright. This matches actual volume rendering and results in some other renderers, but maybe not so much real world objects. Differential Revision: https://developer.blender.org/D3054	2018-02-09 19:58:33 +01:00
Brecht Van Lommel	f79f386731	Code refactor: rename subsurface to local traversal, for reuse.	2017-11-07 22:35:12 +01:00
Sergey Sharybin	6ea54fe9ff	Cycles: Switch to reformulated Pluecker ray/triangle intersection The intention of this commit is to address issues mentioned in the reports T43865, T50164 and T50452. The code is based on Embree code with some extra vectorization to speed up single ray to single triangle intersection. Unfortunately, such a fix is not coming for free. There is some slowdown for AVX2 processors, mainly due to different vectorization code, which caused different number of instructions to be executed and different instructions-per-cycle counters. On the other hand, this commit makes pre-AVX2 platforms such as AVX and SSE4.1 a bit faster. The performance goes as follows (2.78c AVX2 / 2.78c AVX / Patch AVX2 / Patch AVX): BMW: 05:21.09 / 06:05.34 / 05:32.97 (+3.5%) / 05:34.97 (-8.5%); Classroom: 16:55.36 / 18:24.51 / 17:10.41 (+1.4%) / 17:15.87 (-6.3%); Fishy Cat: 08:08.49 / 08:36.26 / 08:09.19 (+0.2%) / 08:12.25 (-4.7%); Koro: 11:22.54 / 11:45.24 / 11:13.25 (-1.5%) / 11:43.81 (-0.3%); Barcelona: 14:18.32 / 16:09.46 / 14:15.20 (-0.4%) / 14:25.15 (-10.8%). On GPU the performance is about 1.5-2% slower in my tests on GTX1080, but I'm afraid we can't do much as part of this change here and consider it a price to pay for more proper intersection check. Made in collaboration with Maxym Dmytrychenko, big thanks to him! Reviewers: brecht, juicyfruit, lukasstockner97, dingto Differential Revision: https://developer.blender.org/D1574	2017-03-28 17:26:47 +02:00
Sergey Sharybin	27248c8636	Cycles: Remove unused macro	2017-03-23 17:59:02 +01:00
Sergey Sharybin	a1348dde2e	Cycles: Fix speed regression on GPU Avoid construction of temporary array and make utility function force-inlined. Additionally avoid calling float4_to_float3 twice. This brings render times to the same values as before current patch series.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	2a5d7b5b1e	Cycles: Use utility function for SSS triangle intersection This effectively de-duplicates triangle intersection logic implemented for both regular triangle and SSS triangle.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	a5b6742ed2	Cycles: Move watertight triangle intersection to an utility file This way the code can be reused more easily.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	f8a999c965	Cycles: Move triangle intersection precalc to an util file This is a preparation work for the followup commit which will move remaining parts of Woop intersection logic to an utility file. Doing it as a separate commit to keep changes more atomic and easier to bisect when/if needed.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	b797a5ff78	Cycles: Cleanup, move utility function to utility file Was an old TODO, this function is handy for some math utilities as well.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	e8ff06186e	Cycles: Cleanup, inline AVX register construction from kernel global data Currently should be no functional changes, preparing for some upcoming refactor.	2017-03-23 17:45:19 +01:00
Mai Lavelle	352ee7c3ef	Cycles: Remove ccl_fetch and SOA	2017-03-08 00:52:41 -05:00
Sergey Sharybin	968e01d407	Cycles: Cleanup, variable names Use underscore again and also solve confusing part where in BVH the same thing is called prim_addr but in intersection functions it was called triAddr.	2016-12-12 12:10:37 +01:00
Sergey Sharybin	b21938f3d4	Cycles: Cleanup, variables names Use underscore instead of camel case.	2016-12-12 10:19:49 +01:00
Sergey Sharybin	de22e55291	Cycles: Fix compilation error of AVX2 kernel without SSE math	2016-10-26 20:49:33 +02:00
Sergey Sharybin	81c9e0d295	Cycles: Avoid branching in SSE version of intersection pre-calculation Similar to the previous commit, avoid negative effect of bad branch prediction. Gives measurable performance up to ~2% in tests here. Once again, thanks to Maxym Dmytrychenko!	2016-10-25 14:18:32 +02:00
Sergey Sharybin	10a25b655a	Cycles: Add AVX2 path to subsurface triangle intersection Similar to regular triangle intersection case. Gives about 3% speedup rendering SSS object on my desktop. Question: how to avoid such a code duplication in a nice way without speed loss?	2016-10-24 16:56:41 +02:00
Sergey Sharybin	42aeb608e7	Cycles: Implement AVX2 version of triangle_intersect This commit basically vectorizes existing code using AVX2 instructions (without modifying algorithm itself). This gives quite nice speedups: BMW: -8%, Classroom: -5%, Cat: -5%, Koro: +1%, Barcelona: -8%. That's on Linux machine, reported performance improvement on Windows goes up to 20%. Not currently sure why Koro is somewhat slower because it mainly uses curve intersection tests, could be a time noise? Or something with the cache utilization perhaps? In any case speedup in other scenes makes me think that current state is acceptable for initial implementation. This is again inspired by Maxym Dmytrychenko.	2016-10-12 14:11:55 +02:00
Campbell Barton	710ab5be36	Cleanup: spelling, style	2016-07-31 17:41:05 +10:00
Lukas Stockner	d9cc3ea2c6	Cycles: Fix rays parallel to the surface in the triangle refine and MultiGGX code In the triangle intersection refinement code, rays that are parallel to the triangle caused a divide by zero. These rays might initially hit the triangle due to the watertight intersection test, but are very rare - therefore, just skipping the refinement for them works fine. Also, a few remaining issues in the MultiGGX code are fixed that were caused by rays parallel to the surface (which happened more often there due to smooth shading).	2016-07-25 16:14:25 +02:00
Brecht Van Lommel	f23fecf306	Fix use of uninitialized variable in recent SSS fix.	2016-07-24 16:40:30 +02:00
Sergey Sharybin	9946cca146	Fix T48860: Cycles SSS artifacts with spatially split BVH The issue was caused by SSS intersection code gathering all intersections without checking for duplicated ones. This caused situations where the same intersection is recorded twice if a triangle is shared by several BVH nodes. Usually this is handled by checking intersection distance after sorting intersections (in shadow_blocked for example) but for SSS we don't do such sorting and use the number of intersections to calculate various things. Didn't find anything smarter than to check intersection distance in triangle_intersect_subsurface(). This solves render artifacts at the cost of 1.5% slowdown of extreme case rendering (SSS object filling in whole FullHD screen). Reviewers: brecht Reviewed By: brecht Differential Revision: https://developer.blender.org/D2105	2016-07-18 10:04:20 +02:00
Sergey Sharybin	17e7454263	Cycles: Reduce memory usage by de-duplicating triangle storage There are several internal changes for this: First idea is to make __tri_verts behave similar to __tri_storage, meaning, __tri_verts array now contains all vertices of all triangles instead of just mesh vertices. This saves some lookup when reading triangle coordinates in functions like triangle_normal(). In order to make it efficient needed to store global triangle offset somewhere. So now __tri_vindex.w contains a global triangle index which can be used to read triangle vertices. Additionally, the order of vertices in that array is aligned with primitives from BVH. This is needed to keep cache as coherent as possible for BVH traversal. This causes some extra tricks needed to fill the array in and deal with True Displacement but that trickery is fully required to prevent noticeable slowdown. Next idea was to use this __tri_verts instead of __tri_storage in intersection code. Unfortunately, this is quite tricky to do without noticeable speed loss. Mainly this loss is caused by extra lookup happening to access vertex coordinate. Fortunately, tricks here and there (i.e. some type changes to avoid casts which are not really coming for free) reduce those losses to an acceptable level. So now they are within a couple of percent only. On the positive side we've achieved: - Few percent of memory save with triangle-only scenes. Actual save in this case is close to size of all vertices. On more finely subdivided scenes this benefit might become more obvious. - Huge memory save of hairy scenes. For example, on koro.blend there is about 20% memory save. Similar figure for bunny.blend. This memory save was the main goal of this commit to move forward with Hair BVH which required more memory per BVH node. So while this sounds exciting, this memory optimization will become invisible by upcoming Hair BVH work. But again on a positive side, we can add an option to NOT use Hair BVH and then we'll have same-ish render times as we've got currently but will have this 20% memory benefit on hairy scenes.	2016-07-07 17:25:48 +02:00
Sergey Sharybin	b595a692c8	Cycles: Limit degenerated triangle check to CUDA only OpenCL seems to work fine here, and for some reason that comparison was giving compilation error on OpenCL here. Better to compile OpenCL kernel than to be fully robust to weird corner cases.	2016-06-07 15:48:56 +02:00
Sergey Sharybin	f54a98a1c5	Cycles: Simplify check for degenerated faces on GPU Still not sure how to properly solve the issue, needs some trickery to get actual optimized values from intersection function (using printf() avoids some optimization and makes stuff render correct). For the time being let's just simplify check.	2016-06-03 10:36:04 +02:00
Sergey Sharybin	6cd13a221f	Cycles: Rename tri_woop to tri_storage It's no longer a pre-computed data and just a storage of triangle coordinates which are faster to access to.	2016-04-11 17:18:14 +02:00
Sergey Sharybin	700722f686	Cycles: Cleanup, indent nested preprocessor directives Quite straightforward, main trick is happening in path_source_replace_includes(). Reviewers: brecht, dingto, lukasstockner97, juicyfruit Differential Revision: https://developer.blender.org/D1794	2016-03-25 13:55:42 +01:00
Sergey Sharybin	c93069083e	Cycles: Tweaks for 32bit CUDA binaries Tweak some inline policies. Not totally crazy yet, and in fact we now have one less ifdef statement now.	2016-02-15 19:11:02 +01:00
Sergey Sharybin	72e31d6a72	Cycles: Always inline triangle precalc for CUDA devices Since the SSS changes compiling Experimental sm_52 kernel seems to work just fine.	2016-01-11 21:41:00 +05:00
Sergey Sharybin	a6bbf05ba6	Cycles: Fix wrong SSS intersection refinement when this option is disabled The code is disabled by default, but we'd better keep it all correct.	2015-12-02 03:14:54 +05:00
Sergey Sharybin	8bca34fe32	Cycles: Avoid having ShaderData on the stack This commit introduces a SSS-oriented intersection structure which is replacing old logic of having separate arrays for just intersections and shader data and encapsulates all the data needed for SSS evaluation. This gives a huge stack memory saving on GPU. In own experiments it gave 25% memory usage reduction on GTX560Ti (722MB vs. 946MB). Unfortunately, this gave some performance loss of 20% which only happens on GPU. This is perhaps due to different memory access pattern. Will be solved in the future, hopefully. Famous saying: won in memory - lost in time (which is also valid in other way around).	2015-11-25 13:01:22 +05:00
Sergey Sharybin	47b1279762	Cycles: Watertight fix for SSS intersection Same as previous commit, just was missing in there.	2015-10-22 22:10:40 +05:00
Sergey Sharybin	f84cbae43e	Cycles: Fix for watertight intersection It was possible to miss some intersection caused by wrong barycentric coordinates sign. Cases when one of the coordinates is zero and the others are negative were not handled correctly.	2015-10-22 22:07:28 +05:00
Sergey Sharybin	3cee28ebf3	Fix T46143: Faces missing with GPU render Epsilon was quite arbitrary for GPU, replaced with checking for zero-sized faces. It should solve both original report and the new one. After the release we can check why GPU doesn't produce accurate math here and go to the root of the issue.	2015-09-17 17:21:17 +05:00
Sergey Sharybin	1a04179802	Cycles: Cleanup, typo Spotted by Campbell, thanks!	2015-09-09 14:25:43 +05:00
Sergey Sharybin	d13a0e8f4a	Cycles: Limit triangle magnitude check for only GPU Found a way to make AVX2 CPUs happy by reshuffling instructions a bit, so now there's no weird precision errors happening in there. This solves some render speed regressions on CPU, but unfortunately this doesn't help for GPU rendering.	2015-09-09 13:39:36 +05:00
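
Commit 9946cca146 above describes skipping an SSS hit when a hit at the same distance has already been recorded, because a spatially split BVH can reference one triangle from several nodes. A minimal Python sketch of that idea follows. It is not the Cycles code: `record_local_hit`, the `(t, prim)` tuple layout and `max_hits` are hypothetical names chosen for illustration.

```python
def record_local_hit(hits, t, prim, max_hits):
    """Append (t, prim) unless a hit at the same distance is already stored.

    Hypothetical helper, not the Cycles implementation.
    """
    # The same ray tested against the same triangle produces an identical
    # distance, so an exact comparison is enough to spot the duplicate.
    for existing_t, _ in hits:
        if existing_t == t:
            return False
    # Assumed fixed-size hit buffer: drop hits once it is full.
    if len(hits) >= max_hits:
        return False
    hits.append((t, prim))
    return True


hits = []
record_local_hit(hits, 0.75, 12, max_hits=4)  # first visit of triangle 12
record_local_hit(hits, 0.75, 12, max_hits=4)  # same triangle via another node
record_local_hit(hits, 1.5, 40, max_hits=4)   # a different triangle
print(len(hits))  # prints 2
```

The linear scan costs a little per candidate hit, which is in line with the small slowdown the commit message reports for SSS-heavy frames.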


84 Commits