blender-archive

Archived

Author	SHA1	Message	Date
Sergey Sharybin	a1348dde2e	Cycles: Fix speed regression on GPU Avoid construction of temporary array and make utility function force-inlined. Additionally avoid calling float4_to_float3 twice. This brings render times to the same values as before current patch series.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	2a5d7b5b1e	Cycles: Use utility function for SSS triangle intersection This effectively de-duplicates triangle intersection logic implemented for both regular triangle and SSS triangle.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	a5b6742ed2	Cycles: Move watertight triangle intersection to a utility file This way the code can be reused more easily.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	f8a999c965	Cycles: Move triangle intersection precalc to a util file This is preparation work for the follow-up commit, which will move the remaining parts of the Woop intersection logic to a utility file. Doing it as a separate commit to keep changes more atomic and easier to bisect when/if needed.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	b797a5ff78	Cycles: Cleanup, move utility function to utility file Was an old TODO, this function is handy for some math utilities as well.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	1c5cceb7af	Cycles: Move intersection math to own header file This has the following benefits: - Modifying the intersection algorithm will not cause so much re-compilation. - It works around header dependency hell and allows us to use vectorization types much more easily in there.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	e8ff06186e	Cycles: Cleanup, inline AVX register construction from kernel global data Currently should be no functional changes, preparing for some upcoming refactor.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	2b44db4cfc	Fix/workaround T50533: Transparency shader doesn't cast shadows with curve segments There seems to be a compiler bug in MSVC2013. The issue does not happen on Linux and does not happen on Windows when building with MSVC2015. Since it's really a pain to debug release builds with MSVC2013, the AVX2 optimization is disabled for curve segments for this compiler.	2017-03-22 11:37:23 +01:00
Mai Lavelle	8fff6cc2f5	Cycles: Fix building of OpenCL kernels There's no overloading of functions in OpenCL so we can't make use of `safe_normalize` with `float2`.	2017-03-20 22:55:52 -04:00
Sergey Sharybin	a201b99c5a	Fix T50975: Cycles: Light sampling threshold inadvertently clamps negative lamps	2017-03-20 14:48:55 +01:00
Sergey Sharybin	18bf900b31	Fix T50990: Random black pixels in Cycles when rendering material with Multiscatter GGX	2017-03-20 12:07:41 +01:00
Sergey Sharybin	d6b4fb6429	Cycles: Fix mistake in previous split kernel commits Own stupid mistake. Reported by nirved in IRC, thanks!	2017-03-17 11:55:59 +01:00
Sergey Sharybin	a58350b07f	Cycles: Cleanup, indentation	2017-03-17 10:25:37 +01:00
Sergey Sharybin	e361adbca2	Cycles: Fix compilation error of LCG RNG	2017-03-17 09:58:08 +01:00
Mai Lavelle	60a344b43d	Cycles: Fix handling of barriers	2017-03-17 01:54:04 -04:00
Sergey Sharybin	1cad64900e	Cycles: Define ccl_local variables in kernel functions Declaring ccl_local in a device function is not supported by certain compilers.	2017-03-16 11:27:17 +01:00
Sergey Sharybin	1ff753baa4	Cycles: Workaround for compilation error caused by passing KernelGlobals Pass globals as a bare pointer, same as it used to be prior to the split kernel rework. AMD CPU platform and Intel OpenCL were complaining about this. Perhaps we shouldn't pass globals as a pointer at all; this isn't something that is really portable, and it can perhaps cause issues on 32 bit.	2017-03-16 11:27:17 +01:00
Sergey Sharybin	26620f3f87	Cycles: Avoid some ccl_local in various kernels	2017-03-16 11:27:17 +01:00
Mai Lavelle	8dd0355c21	Cycles: Try to avoid infinite loops by catching invalid ray states	2017-03-14 06:22:57 -04:00
Sergey Sharybin	76acaefdd7	Cycles: Cleanup, wipe obviously outdated parts of split kernel comments	2017-03-13 17:16:16 +01:00
Ray molenkamp	0c72008592	fix msvc warnings about unknown opencl pragmas	2017-03-13 10:08:14 -06:00
Sergey Sharybin	aa36c73c33	Cycles: Add missing header in the file	2017-03-13 16:59:09 +01:00
Hristo Gueorguiev	f169ff8b88	Fix T50925: Add AO approximation to split kernel	2017-03-13 11:15:58 +01:00
Sergey Sharybin	8794a43b68	Cycles: Make MESA compiler more happy While this compiler is not officially supported yet, getting it to work is a nice thing because more and more AMD cards will fall under MESA driver. It's also nice to use explicit comparison with NULL, which makes it more clear whether variable is a boolean or pointer. Even Rust enforces this! Patch by Ian Bruce with own modifications.	2017-03-13 09:57:25 +01:00
Mai Lavelle	96868a3941	Fix T50888: Numeric overflow in split kernel state buffer size calculation Overflow led to the state buffer being too small and the split kernel to get stuck doing nothing forever.	2017-03-11 05:39:28 -05:00
Sergey Sharybin	59fd21296a	Cycles: Cleanup, extra semicolon and space	2017-03-10 15:38:30 +01:00
Mai Lavelle	4a2cde3f0e	Cycles: Enable SSS and volumes for CUDA and Nvidia OpenCL split kernel	2017-03-10 02:09:41 -05:00
Hristo Gueorguiev	9de9f25b24	Cycles: add single program debug option for split kernel Single program generally compiles kernels faster (2-3 times), loads faster, takes less drive space (2-3 times), and reduces the number of cached kernels.	2017-03-09 17:09:37 +01:00
Hristo Gueorguiev	06c051363b	Cycles: split kernel_shadow_blocked to AO & DL parts Reduces memory allocation for split kernel. This allows for faster rendering due to bigger global size, especially when GPU memory is limited. Performance results, R9 290 total render time (Before / After / Change): BMW 4:37 / 4:34 / -1.1 %; Classroom 14:43 / 14:30 / -1.5 %; Fishy Cat 11:20 / 11:04 / -2.4 %; Koro 12:11 / 12:04 / -1.0 %; Pabellon Barcelona 22:01 / 20:44 / -5.8 %; Pabellon Barcelona() 15:32 / 15:09 / -2.5 %. () without glossy connected to volume	2017-03-09 17:09:37 +01:00
Hristo Gueorguiev	e8b5a5bf5b	Cycles: Speedup transparent shadows in split kernel This commit enables record-all transparent shadow rays. Performance results, R9 290 render time in seconds, without synchronization (Before / After / Change): BMW 261.5 / 262.5 / +0.4 %; Classroom 869.6 / 867.3 / -0.3 %; Fishy Cat 657.4 / 639.8 / -2.7 %; Koro 1909.8 / 692.8 / -63.7 %; Pabellon Barcelona 1633.3 / 1238.0 / -24.2 %; Pabellon Barcelona() 1158.1 / 903.8 / -22.0 %. () without glossy connected to volume	2017-03-09 17:09:37 +01:00
Hristo Gueorguiev	57e26627c4	Cycles: SSS and Volume rendering in split kernel Decoupled ray marching is not supported yet. Transparent shadows are always enabled for volume rendering. Changes in kernel/bvh and kernel/geom are from Sergey. This simplifies code significantly, and prepares it for record-all transparent shadow function in split kernel.	2017-03-09 17:09:37 +01:00
Mai Lavelle	c837bd5ea5	Cycles: Fix CUDA build error for some compilers Needed to include `util_types.h` before using `uint`.	2017-03-08 16:44:43 -05:00
Sergey Sharybin	712f7c3640	Cycles: Make it possible to access KernelGlobals from split data initialization function	2017-03-08 11:02:54 +01:00
Sergey Sharybin	ef7c36f5ed	Cycles: Cleanup, remove residue of previous split kernel data This is all in split data state array.	2017-03-08 10:26:29 +01:00
Mai Lavelle	64751552f7	Cycles: Fix indentation	2017-03-08 01:31:32 -05:00
Mai Lavelle	fe7cc94dfa	Cycles: Fix strict warning about unused variable	2017-03-08 01:31:32 -05:00
Mai Lavelle	306034790f	Cycles: Calculate size of split state buffer kernel side By calculating the size of the state buffer in the kernel rather than the host less code is needed and the size actually reflects the requested features. Will also be a little faster in some cases because of larger global work size.	2017-03-08 01:31:30 -05:00
Mai Lavelle	223f45818e	Cycles: Initialize rng_state for split kernel Because the split kernel can render multiple samples in parallel it is necessary to have everything initialized before rendering of any samples begins. The code that normally handles initialization of `rng_state` (`kernel_path_trace_setup()`) only does so for the first sample, which was causing artifacts in the split kernel due to uninitialized `rng_state` for some samples. Note that because the split kernel can render samples in parallel this means that the split kernel is incompatible with the LCG.	2017-03-08 01:31:09 -05:00
Mai Lavelle	cd7d5669d1	Cycles: Remove sum_all_radiance kernel This was only needed for the previous implementation of parallel samples. As we don't have that any more it can be removed. Real reason for removal tho is this: `per_sample_output_buffers` was being calculated too small and artifacts resulted. The tile buffer is already the correct size and calculating the size for `per_sample_output_buffers` is a bit difficult with the current layout of the code. As `per_sample_output_buffers` was only needed for `sum_all_radiance`, removing that kernel and writing output to the tile buffer directly fixes the artifacts.	2017-03-08 01:31:07 -05:00
Mai Lavelle	4cf501b835	Cycles: Split path initialization into own kernel This makes it easier to initialize things correctly in the data_init kernel before they are needed by path tracing.	2017-03-08 01:30:43 -05:00
Mai Lavelle	817873cc83	Cycles: CUDA implementation of split kernel	2017-03-08 01:24:53 -05:00
Mai Lavelle	0892352bfe	Cycles: CPU implementation of split kernel	2017-03-08 00:52:41 -05:00
Mai Lavelle	352ee7c3ef	Cycles: Remove ccl_fetch and SOA	2017-03-08 00:52:41 -05:00
Mai Lavelle	230c00d872	Cycles: OpenCL split kernel refactor This does a few things at once: - Refactors host side split kernel logic into a new device agnostic class `DeviceSplitKernel`. - Removes tile splitting, a new work pool implementation takes its place and allows as many threads as will fit in memory regardless of tile size, which can give performance gains. - Refactors split state buffers into one buffer, as well as reduces the number of arguments passed to kernels. Means there's less code to deal with overall. - Moves kernel logic out of OpenCL kernel files so they can later be used by other device types. - Replaced OpenCL specific APIs with new generic versions - Tiles can now be seen updating during rendering	2017-03-08 00:52:41 -05:00
Mai Lavelle	520b53364c	Cycles: Add OpenCL kernel for zeroing memory buffers Transferring memory to the device was very slow and there's really no need when only zeroing a buffer.	2017-03-08 00:52:41 -05:00
Sergey Sharybin	87f236cd10	Cycles: Fix division by zero in volume code which was producing -nan	2017-02-28 17:33:06 +01:00
Brecht Van Lommel	8c5826f59a	Fix T50698: Cycles baking artifacts with transparent surfaces.	2017-02-25 03:12:53 +01:00
Mai Lavelle	4e9b17da4c	Cycles: Speedup by avoiding extra calculations in noise texture when unneeded Noise texture is now faster when the color socket is unused. Potential for speedup spotted by @nutel. Some performance results, render time (Before / After / Difference): Gooseberry benchmark 47:51.34 / 45:55.57 / -4%; Koro 12:24.92 / 12:18.46 / -0.8%; Simple cube (Color socket) 48.53 / 48.72 / +0.3%; Simple cube (Fac socket) 48.74 / 32.78 / -32.7%; Goethe displacement 1:21.18 / 1:08.47 / -15.6%; Cycles brick displacement 3:02.38 / 2:16.76 / -25.0%; Large displacement scene 23:54.12 / 20:09.62 / -15.6%. Reviewed By: sergey Differential Revision: https://developer.blender.org/D2513	2017-02-21 07:24:33 -05:00
Sergey Sharybin	fe47163a1e	Cycles: Fix CUDA compilation error after recent changes	2017-02-15 15:01:08 +01:00
Sergey Sharybin	8b8c0d0049	Cycles: Don't calculate primitive time if BVH motion steps are not used Solves memory regression by the default configuration.	2017-02-15 12:59:31 +01:00

1973 Commits