archive/blender-archive - blender-archive - Blender Projects

archive/blender-archive

Archived

Author	SHA1	Message	Date
Mai Lavelle	470b4cb62f	Cycles: Fix crash with split branched path tracing ShaderData memory was getting clobbered in the branched path code paths. Was caused by `087331c495`	2017-11-16 04:59:31 -05:00
Mai Lavelle	087331c495	Cycles: Replace __MAX_CLOSURE__ build option with runtime integrator variable Goal is to reduce OpenCL kernel recompilations. Currently viewport renders are still set to use 64 closures as this seems to be faster and we don't want to cause a performance regression there. Needs to be investigated. Reviewed By: brecht Differential Revision: https://developer.blender.org/D2775	2017-11-09 01:04:06 -05:00
Brecht Van Lommel	f79f386731	Code refactor: rename subsurface to local traversal, for reuse.	2017-11-07 22:35:12 +01:00
Brecht Van Lommel	8a72be7697	Cycles: reduce closure memory usage for emission/shadow shader data. With a Titan Xp, reduces path trace local memory from 1092MB to 840MB. Benchmark performance was within 1% with both RX 480 and Titan Xp. Original patch was implemented by Sergey. Differential Revision: https://developer.blender.org/D2249	2017-11-05 20:48:33 +01:00
Brecht Van Lommel	095a01a73a	Cycles: slightly improve BSDF sample stratification for path tracing. Similar to what we did for area lights previously, this should help preserve stratification when using multiple BSDFs in theory. Improvements are not easily noticeable in practice though, because the number of BSDFs is usually low. Still nice to eliminate one sampling dimension.	2017-09-20 19:38:08 +02:00
Brecht Van Lommel	b3afc8917c	Code cleanup: refactor BSSRDF closure sampling, for next commit.	2017-09-20 19:38:08 +02:00
Brecht Van Lommel	1d1ddd48db	Fix T52470: cycles OpenCL hair rendering not working after recent changes.	2017-08-20 23:32:20 +02:00
Brecht Van Lommel	cfa8b762e2	Code cleanup: move rng into path state. Also pass by value and don't write back now that it is just a hash for seeding and no longer an LCG state. Together this makes CUDA a tiny bit faster in my tests, but mainly simplifies code.	2017-08-19 18:14:16 +02:00
Sergey Sharybin	40c04dd649	Cycles: Cleanup, indentation	2017-06-13 10:28:38 +02:00
Mai Lavelle	6238214159	Cycles: Faster split branched path tracing by sharing samples with inactive threads Unlike regular path tracing, branched path tracing is usually used with lower sample counts, at least for primary rays. This means that are less samples for the GPU to work on in parallel and rendering is slower. As there is less work overall there is also more inactive threads during rendering with BPT. This patch makes use of those inactive rays to render branched samples in parallel with other samples. Each thread that is preparing for a branched sample will attempt to find an inactive thread and if one is found the state for the sample is copied to that thread. Potentially, if there are enough inactive threads, 100s of branched samples could be generated from the same originating thread and ran in parallel giving large speed ups. Gives 70% faster render for pavillion midday scene. 20-60% faster on BMW with car paint replaced with SSS/volumes.	2017-06-10 04:08:49 -04:00
Sergey Sharybin	2eb906e1b4	Cycles: Fix access array index of -1 in SSS and volume split kernels	2017-05-05 17:54:03 +02:00
Sergey Sharybin	850bb7a50b	Cycles: Cleanup, indentation	2017-05-05 16:54:37 +02:00
Mai Lavelle	915766f42d	Cycles: Branched path tracing for the split kernel This implements branched path tracing for the split kernel. General approach is to store the ray state at a branch point, trace the branched ray as normal, then restore the state as necessary before iterating to the next part of the path. A state machine is used to advance the indirect loop state, which avoids the need to add any new kernels. Each iteration the state machine recreates as much state as possible from the stored ray to keep overall storage down. Its kind of hard to keep all the different integration loops in sync, so this needs lots of testing to make sure everything is working correctly. We should probably start trying to deduplicate the integration loops more now. Nonbranched BMW is ~2% slower, while classroom is ~2% faster, other scenes could use more testing still. Reviewers: sergey, nirved Reviewed By: nirved Subscribers: Blendify, bliblubli Differential Revision: https://developer.blender.org/D2611	2017-05-02 14:26:46 -04:00
Hristo Gueorguiev	8ada7f7397	Cycles: Remove ccl_addr_space from RNG passed to functions Simplifies code quite a bit, making it shorter and easier to extend. Currently no functional changes for users, but is required for the upcoming work of shadow catcher support with OpenCL.	2017-03-27 10:46:28 +02:00
Mai Lavelle	60a344b43d	Cycles: Fix handling of barriers	2017-03-17 01:54:04 -04:00
Sergey Sharybin	1cad64900e	Cycles: Define ccl_local variables in kernel functions Declaring ccl_local in a device function is not supported by certain compilers.	2017-03-16 11:27:17 +01:00
Sergey Sharybin	aa36c73c33	Cycles: Add missing header in the file	2017-03-13 16:59:09 +01:00
Hristo Gueorguiev	57e26627c4	Cycles: SSS and Volume rendering in split kernel Decoupled ray marching is not supported yet. Transparent shadows are always enabled for volume rendering. Changes in kernel/bvh and kernel/geom are from Sergey. This simiplifies code significantly, and prepares it for record-all transparent shadow function in split kernel.	2017-03-09 17:09:37 +01:00