blender-archive

Archived

Author	SHA1	Message	Date
Brecht Van Lommel	68dd7617d7	Cycles: add utility functions for zero float2/float3/float4/transform Ref D8237, T78710	2021-02-17 16:26:24 +01:00
Patrick Mours	1d149f6746	Fix T72470: OptiX render fails with scene with many translucent planes on Linux. OptiX always uses record-all behavior for transparent shadow rays, but did not check whether the maximum number of hits exceeded the shadow hit stack. This fixes that.	2020-01-10 15:47:51 +01:00
Lukas Stockner	e760972221	Cycles: support for custom shader AOVs Custom render passes are added in the Shader AOVs panel in the view layer settings, with a name and data type. In shader nodes, an AOV Output node is then used to output either a value or color to the pass. Arbitrary names can be used for these passes, as long as they don't conflict with built-in passes that are enabled. The AOV Output node can be used in both material and world shader nodes. Implemented by Lukas, with tweaks by Brecht. Differential Revision: https://developer.blender.org/D4837	2019-12-10 20:44:46 +01:00
Patrick Mours	53932f1f06	Cycles: add Optix support in the kernel This adds all the kernel side changes for the Optix backend. Ref D5363	2019-09-13 11:46:22 +02:00
Patrick Mours	db257e679a	Cycles: remove workaround to pass ray by value CUDA is working correctly without it now, and it's more efficient not to do this. Ref D5363	2019-08-26 10:26:53 +02:00
Campbell Barton	cd6b49f995	Cleanup: spelling	2019-07-07 15:38:41 +10:00
Brecht Van Lommel	7a92b8820b	Cycles: remove hair minimum width support. This never really worked as it was supposed to. The main goal of this is to turn noise from sampling tiny hairs into multiple layers of transparency that do not need to be sampled stochastically. However the implementation of this worked by randomly discarding hair intersections in BVH traversal, which defeats the purpose. If it ever comes back, it's best implemented outside the kernel as a preprocess that changes hair radius before BVH building. This would also make it work with Embree, where it's not supported now. But it's not so clear anymore that with many AA samples and GPU rendering this feature is as helpful as it once was for CPU raytracers with few AA samples. The benefit of removing this feature is improved hair ray tracing performance, tested on NVIDIA Titan Xp: bmw27: +0.37% classroom: +0.26% fishy_cat: -7.36% koro: -12.98% pabellon: -0.12% Differential Revision: https://developer.blender.org/D4532	2019-04-24 14:39:47 +02:00
Campbell Barton	e12c08e8d1	ClangFormat: apply to source, most of intern Apply clang format as proposed in T53211. For details on usage and instructions for migrating branches without conflicts, see: https://wiki.blender.org/wiki/Tools/ClangFormat	2019-04-17 06:21:24 +02:00
Sergey Sharybin	cb4b5e12ab	Cycles: Cleanup, spacing after preprocessor It is supposed to be two spaces before comment stating which if else/endif statements corresponds to. Was mainly violated in the header guards.	2018-11-09 11:34:54 +01:00
Brecht Van Lommel	2d81758aa6	Cycles: better path termination for transparency. We now continue transparent paths after diffuse/glossy/transmission/volume bounces are exceeded. This avoids unexpected boundaries in volumes with transparent boundaries. It is also required for MIS to work correctly with transparent surfaces, as we also continue through these in shadow rays. The main visible change is that volumes will now be lit by the background even at volume bounces 0, same as surfaces. Fixes T53914 and T54103.	2018-02-22 00:55:32 +01:00
Brecht Van Lommel	8a72be7697	Cycles: reduce closure memory usage for emission/shadow shader data. With a Titan Xp, reduces path trace local memory from 1092MB to 840MB. Benchmark performance was within 1% with both RX 480 and Titan Xp. Original patch was implemented by Sergey. Differential Revision: https://developer.blender.org/D2249	2017-11-05 20:48:33 +01:00
Sergey Sharybin	c0480bc972	Cycles: Fix compilation error of OpenCL megakernel on Apple	2017-09-23 17:07:19 +05:00
Brecht Van Lommel	095a01a73a	Cycles: slightly improve BSDF sample stratification for path tracing. Similar to what we did for area lights previously, this should help preserve stratification when using multiple BSDFs in theory. Improvements are not easily noticeable in practice though, because the number of BSDFs is usually low. Still nice to eliminate one sampling dimension.	2017-09-20 19:38:08 +02:00
Hristo Gueorguiev	6798a061b7	Cycles: Fix compilation error with OpenCL split kernel	2017-09-16 12:33:03 +02:00
Sergey Sharybin	467d92b8f1	Cycles: Tweaks to avoid compilation error of megakernel Also moved code out of deep-inside ifdef block, otherwise it was quite confusing.	2017-09-12 13:33:46 +05:00
Sergey Sharybin	750e38a526	Cycles: Fix compilation error with CUDA after recent changes	2017-09-05 16:52:45 +02:00
Sergey Sharybin	f01e43fac3	Fix T52433: Volume Absorption color tint Need to exit the volume stack when shadow ray leaves the medium. Thanks Brecht for review and help in troubleshooting!	2017-09-05 15:48:34 +02:00
Sergey Sharybin	b0bbb5f34f	Cycles: Cleanup, style	2017-09-05 12:43:02 +02:00
Brecht Van Lommel	b85d36d811	Code cleanup: remove shader context. This was needed when we accessed OSL closure memory after shader evaluation, which could get overwritten by another shader evaluation. But all closures are immediately converted to ShaderClosure now, so no longer needed.	2017-08-24 03:43:02 +02:00
Brecht Van Lommel	cfa8b762e2	Code cleanup: move rng into path state. Also pass by value and don't write back now that it is just a hash for seeding and no longer an LCG state. Together this makes CUDA a tiny bit faster in my tests, but mainly simplifies code.	2017-08-19 18:14:16 +02:00
Brecht Van Lommel	fc38276d74	Fix Cycles shadow catcher objects influencing each other. Since all the shadow catchers are already assumed to be in the footage, the shadows they cast on each other are already in the footage too. So don't just let shadow catchers skip self, but all shadow catchers. Another justification is that it should not matter if the shadow catcher is modeled as one object or multiple separate objects, the resulting render should be the same. Differential Revision: https://developer.blender.org/D2763	2017-08-07 17:54:26 +02:00
Sergey Sharybin	f970e859cf	Cycles: Cleanup, style	2017-04-18 11:39:21 +02:00
Mai Lavelle	8f85ee2fc9	Cycles: Fix indentation	2017-04-07 06:06:08 -04:00
Sergey Sharybin	d14e39622a	Cycles: First implementation of shadow catcher It uses an idea of accumulating all possible light reachable across the light path (without taking shadow blocked into account) and accumulating total shaded light across the path. Dividing the second figure by the first one seems to give a good estimate of the shadow. In fact, to my knowledge, it's something really similar to what is happening in the denoising branch, so we are aligned here which is good. The workflow is as follows: - Create an object which matches the real-life object on which the shadow is to be caught. - Create an approximately similar material on that object. This is needed to make indirect light properly affect CG objects in the scene. - Mark the object as Shadow Catcher in the Object properties. Ideally, after doing that it will be possible to render the image and simply alpha-over it on top of real footage.	2017-03-27 10:46:03 +02:00
Hristo Gueorguiev	e8b5a5bf5b	Cycles: Speedup transparent shadows in split kernel This commit enables record-all transparent shadows rays. Performance results: R9 290 render time (without synchronization), in seconds, listed as before / after / change: BMW 261.5 / 262.5 / +0.4 %; Classroom 869.6 / 867.3 / -0.3 %; Fishy Cat 657.4 / 639.8 / -2.7 %; Koro 1909.8 / 692.8 / -63.7 %; Pabellon Barcelona 1633.3 / 1238.0 / -24.2 %; Pabellon Barcelona(*) 1158.1 / 903.8 / -22.0 %. (*) without glossy connected to volume	2017-03-09 17:09:37 +01:00
Hristo Gueorguiev	57e26627c4	Cycles: SSS and Volume rendering in split kernel Decoupled ray marching is not supported yet. Transparent shadows are always enabled for volume rendering. Changes in kernel/bvh and kernel/geom are from Sergey. This simplifies code significantly, and prepares it for record-all transparent shadow function in split kernel.	2017-03-09 17:09:37 +01:00
Mai Lavelle	352ee7c3ef	Cycles: Remove ccl_fetch and SOA	2017-03-08 00:52:41 -05:00
Mai Lavelle	230c00d872	Cycles: OpenCL split kernel refactor This does a few things at once: - Refactors host side split kernel logic into a new device agnostic class `DeviceSplitKernel`. - Removes tile splitting, a new work pool implementation takes its place and allows as many threads as will fit in memory regardless of tile size, which can give performance gains. - Refactors split state buffers into one buffer, as well as reduces the number of arguments passed to kernels. Means there's less code to deal with overall. - Moves kernel logic out of OpenCL kernel files so they can later be used by other device types. - Replaced OpenCL specific APIs with new generic versions - Tiles can now be seen updating during rendering	2017-03-08 00:52:41 -05:00
Sergey Sharybin	b16fd22018	Cycles: Fix regression with transparent shadows in volume	2017-02-08 14:00:48 +01:00
Sergey Sharybin	da31a82832	Cycles: Solve speed regression by casting opaque ray first	2017-02-08 14:00:48 +01:00
Sergey Sharybin	04cf1538b5	Cycles: Fix compilation error on OpenCL	2017-02-08 14:00:48 +01:00
Sergey Sharybin	31a025f51e	Cycles: Split shadow functions to avoid some duplicated calculations	2017-02-08 14:00:48 +01:00
Sergey Sharybin	dde40989f3	Cycles: Store shadow intersections in the kernel globals Seems CUDA failed to de-duplicate the array across multiple inlined versions of the shadow_blocked(). Helped it a bit with that now. Gives about 100MB memory improvement on a scenes after previous commit and brings up memory "regression" to only 100MB comparing to the master branch now.	2017-02-08 14:00:48 +01:00
Sergey Sharybin	9830eeb44b	Cycles: Implement record-all transparent shadow function for GPU The idea is to record all possible transparent intersections when shooting transparent ray on GPU (similar to what we were doing on CPU already). This avoids need of doing whole ray-to-scene intersections queries for each intersection and speeds up a lot cases like transparent hair in the cost of extra memory. This commit is a base ground for now and this feature is kept disabled for until some further tweaks.	2017-02-08 14:00:48 +01:00
Sergey Sharybin	9c3d202e56	Cycles: Use an utility function to sort intersections array	2017-02-08 14:00:48 +01:00
Sergey Sharybin	58a10122d0	Cycles: Make GPU version of shadow_blocked() closer to CPU Now we break the traversal cycle and then perform volume attenuation and check with zero throughput. Not sure it makes any measurable sense at this moment, but in the future it might help de-duplicating some extra logic here.	2017-02-08 14:00:48 +01:00
Sergey Sharybin	98a1855803	Cycles: De-duplicate transparent shadows attenuation Fair amount of code was duplicated for CPU and GPU, now we are using inlined function to avoid such duplication.	2017-02-08 14:00:48 +01:00
Brecht Van Lommel	a3abb020e3	Fix Cycles CUDA performance on CUDA 8.0. Mostly this is making inlining match CUDA 7.5 in a few performance critical places. The end result is that performance is now better than before, possibly due to less register spilling or other CUDA 8.0 compiler improvements. On benchmarks scenes, there are 3% to 35% render time reductions. Stack memory usage is reduced a little too. Reviewed By: sergey Differential Revision: https://developer.blender.org/D2269	2016-10-03 22:15:25 +02:00
Sergey Sharybin	166286e6de	Cycles: Make code more uniform across two versions of shadow_blocked() Just to make it easier to research ways of possible code de-duplication.	2016-09-21 11:50:11 +02:00
Sergey Sharybin	e4f7bf6ccb	Cycles: Remove out of date comment	2016-09-21 11:48:36 +02:00
Sergey Sharybin	7030794171	Cycles: Revert previous fixes to intersect_all functions While they prevent legit write past the array boundary error those fixes introduced regression in behavior when having exact max_hits transparent intersections and nothing else. Previous code would have considered such case a totally opaque, but it's not correct. Fixes T48941: Some materials don't get transparent shadows anymore	2016-07-26 17:16:23 +02:00
Sergey Sharybin	3637cbbcf8	Cycles: Fix wrong termination criteria in intersect_all functions It was possible to miss bounces termination criteria in this functions, mainly when max_hits was set to 0. Made the check more robust in traversal functions (which should not affect performance, it's an operation of same complexity AFAIK). Also avoid doing ray-scene intersection from shadow_blocked when limit of transparent bounces was already reached.	2016-07-14 11:26:20 +02:00
Lukas Stockner	23c276832b	Cycles: Add multi-scattering, energy-conserving GGX as an option to the Glossy, Anisotropic and Glass BSDFs This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model". Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until the ray leaves it again, which ensures perfect energy conservation. In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing roughness - is solved in a physically correct and efficient way. The downside of this model is that it has no (known) analytic expression for evaluation. However, it can be evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the balance heuristic guarantee an unbiased result at the cost of slightly higher noise. Reviewers: dingto, #cycles, brecht Reviewed By: dingto, #cycles, brecht Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel Differential Revision: https://developer.blender.org/D2002	2016-06-23 22:57:26 +02:00
Sergey Sharybin	42b26206c6	Fix T48508: Cycles Regression / Crash	2016-05-24 14:53:34 +02:00
Brecht Van Lommel	999d5a6785	Cycles CUDA: reduce stack memory by reusing ShaderData. 57% less for path and 48% less for branched path.	2016-05-23 22:29:24 +02:00
Sergey Sharybin	7b356a8565	Cycles: Reduce amount of malloc() calls from the kernel This commit makes it so malloc() is only happening once per volume and once per transparent shadow query (per thread), improving scalability of the code to multiple CPU cores. Hard to measure this with a low-bottom i7 here currently, but from quick tests seems volume sampling gave about 3-5% speedup. The idea is to store allocated memory in kernel globals, which are per thread on CPU already. Reviewers: dingto, juicyfruit, lukasstockner97, maiself, brecht Reviewed By: brecht Subscribers: Blendify, nutel Differential Revision: https://developer.blender.org/D1996	2016-05-18 10:14:24 +02:00
Sergey Sharybin	9815f8a623	Cycles: Cleanup of OpenCL split kernel routines The idea is to switch from allocating separate buffers for shader data's structure of arrays to allocating one huge memory block and do some index trickery to make it accessed as SOA. This saves quite reasonable amount of lines of code in device_opencl and also makes it possible to get rid of special declaration of ShaderData structure. As a side effect it also makes it easier to experiment with SOA vs. AOS for split kernel. Works fine here on NVidia GTX580, Intel CPU and AMD Fiji cards. Reviewers: #cycles, brecht, juicyfruit, dingto Differential Revision: https://developer.blender.org/D1593	2016-01-30 00:23:06 +01:00
Sergey Sharybin	e2161ca854	Cycles: Remove few function arguments needed only for the split kernel Use KernelGlobals to access all the global arrays for the intermediate storage instead of passing all this storage things explicitly. Tested here with Intel OpenCL, NVIDIA GTX580 and AMD Fiji, didn't see any artifacts, so guess it's all good. Reviewers: juicyfruit, dingto, lukasstockner97 Differential Revision: https://developer.blender.org/D1736	2016-01-28 18:59:27 +01:00
Sergey Sharybin	1f273cec00	Cycles: Tweak inline policy for some functions The goal is to make Experimental kernel closer in performance to the official kernel, avoiding spills and such. There should not be big impact on official kernel, own tests showed few percent performance drop on laptop's GPU. CPU was always the same speed on AVX, AVX2 and SSE4.1 CPUs i've been testing here. This seems to be the last essential step before we can get rid of Experimental kernel and enable SSS officially on GPU without causing some major performance issues. Surely some more tweaks are possibly required, but that we can do for until cows go home anyway.	2016-01-14 14:53:05 +05:00
Thomas Dinges	83e73a2100	Cycles: Refactor how we pass bounce info to light path node. This commit changes the way how we pass bounce information to the Light Path node. Instead of manually copying the bounces into ShaderData, we now directly pass PathState. This reduces the arguments that we need to pass around and also makes it easier to extend the feature. This commit also exposes the Transmission Bounce Depth to the Light Path node. It works similar to the Transparent Depth Output: Replace a Transmission lightpath after X bounces with another shader, e.g a Diffuse one. This can be used to avoid black surfaces, due to low amount of max bounces. Reviewed by Sergey and Brecht, thanks for some help with this. I tested compilation and usage on CPU (SVM and OSL), CUDA, OpenCL Split and Mega kernel. Hopefully this covers all devices. :)	2016-01-06 23:43:29 +01:00

71 Commits