blender-archive

Archived

Author	SHA1	Message	Date
Lukas Stockner	fa3d50af95	Cycles: Improve denoising speed on GPUs with small tile sizes Previously, the NLM kernels would be launched once per offset with one thread per pixel. However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown. Therefore, the kernels are now launched in a single call that handles all offsets at once. This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory. On the other hand, of course, the smaller tiles significantly reduce the size of the memory. The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum. I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere. To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now. Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.	2017-11-30 07:37:08 +01:00
Lukas Stockner	d8066fb0f1	Cycles: Refactor closure roughness detection to fix a potential bug with Denoising of specular shaders	2017-11-14 04:17:54 +01:00
Sergey Sharybin	885c0a5f90	Cycles: Fix compilation warning	2017-09-04 13:28:15 +02:00
Sergey Sharybin	436d1b4e90	Cycles: Fix issue with -0 being considered a non-finite value	2017-08-24 14:32:56 +02:00
Brecht Van Lommel	4d428d14af	Fix T52443: Cycles OpenCL build error after recent mesh lights changes.	2017-08-19 01:02:55 +02:00
Brecht Van Lommel	a24fbf3323	Code refactor: add, remove, optimize various SSE functions. * Remove some unnecessary SSE emulation defines. * Use full precision float division so we can enable it. * Add sqrt(), sqr(), fabs(), shuffle variations, mask(). * Optimize reduce_add(), select(). Differential Revision: https://developer.blender.org/D2764	2017-08-07 14:01:24 +02:00
Mai Lavelle	95b345b2fe	Revert "Cycles: use std::min and max for extra overloads" We already have this in util_algorithm.h This reverts commit `cff172c762`.	2017-07-06 04:21:29 -04:00
Mai Lavelle	cff172c762	Cycles: use std::min and max for extra overloads	2017-07-05 19:43:34 -04:00
Ray molenkamp	c9451f1cff	[Cycles] Fix math problems in safe_logf: log(0) is undefined and should not have been included; log(1) == 0, and dividing by zero is not recommended	2017-05-07 09:16:14 -06:00
Lukas Stockner	43b374e8c5	Cycles: Implement denoising option for reducing noise in the rendered image This commit contains the first part of the new Cycles denoising option, which filters the resulting image using information gathered during rendering to get rid of noise while preserving visual features as well as possible. To use the option, enable it in the render layer options. The default settings fit a wide range of scenes, but the user can tweak individual settings to control the tradeoff between a noise-free image, image details, and calculation time. Note that the denoiser may still change in the future and that some features are not implemented yet. The most important missing feature is animation denoising, which uses information from multiple frames at once to produce a flicker-free and smoother result. These features will be added in the future. Finally, thanks to all the people who supported this project: - Google (through the GSoC) and Theory Studios for sponsoring the development - The authors of the papers I used for implementing the denoiser (more details on them will be included in the technical docs) - The other Cycles devs for feedback on the code, especially Sergey for mentoring the GSoC project and Brecht for the code review! - And of course the users who helped with testing, reported bugs and things that could and/or should work better!	2017-05-07 14:40:58 +02:00
Sergey Sharybin	0a07cdbe80	Cycles: Split vectorized math utilities into dedicated files This file was an even bigger mess than the vectorized types header, cleaning it up to make these files easier to maintain and extend further.	2017-04-25 10:33:26 +02:00
Sergey Sharybin	360cf8393a	Cycles: Make vectorized types constructor from register explicit This is not a cheap operation which we don't want to happen silently.	2017-04-13 15:08:00 +02:00
Sergey Sharybin	e6392458d3	Cycles: Remove unused function It was quite wrong actually by doing some __m128 to float4 round trips.	2017-04-13 15:08:00 +02:00
Sergey Sharybin	0579eaae1f	Cycles: Make all #include statements relative to cycles source directory The idea is to make include statements more explicit and obvious where the file is coming from, additionally reducing chance of wrong header being picked up. For example, it was not obvious whether bvh.h was referring to builder or traversal, whether node.h is a generic graph node or a shader node and cases like that. Surely this might look obvious for the active developers, but after some time of not touching the code it becomes less obvious where file is coming from. This was briefly mentioned in T50824 and seems @brecht is fine with such explicitness, but need to agree with all active developers before committing this. Please note that this patch is lacking changes related to GPU/OpenCL support. This will be solved if/when we all agree this is a good idea to move forward. Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner Reviewed By: lukasstockner97, maiself, nirved, dingto Subscribers: brecht Differential Revision: https://developer.blender.org/D2586	2017-03-29 13:41:11 +02:00
Sergey Sharybin	bd053ac7ba	Cycles: Correct ifdef around float3 intrinsics	2017-03-27 16:13:07 +02:00
Sergey Sharybin	5b45715f8a	Cycles: Correct isfinite check used in integrator Use fast-math friendly version of this function. We should probably avoid unsafe fast math, but this is to be done with real care with all the benchmarks properly done. For now committing much safer fix.	2017-03-24 15:39:33 +01:00
Sergey Sharybin	b797a5ff78	Cycles: Cleanup, move utility function to utility file Was an old TODO, this function is handy for some math utilities as well.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	1c5cceb7af	Cycles: Move intersection math to own header file There are following benefits: - Modifying intersection algorithm will not cause so much re-compilation. - It works around header dependency hell and allows us to use vectorization types much easier in there.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	5c06ff8bb9	Cycles: Cleanup, remove unused function	2017-03-23 17:45:19 +01:00
Brecht Van Lommel	2d3c44389a	Fix OpenCL warnings about doubles on some platforms.	2017-03-11 00:55:23 +01:00
Aaron Carlisle	6d1ac79514	Cleanup: Grey --> Gray	2017-02-27 19:33:57 -05:00
Sergey Sharybin	254fbcdd7b	Cycles: Fix compilation error with older GCC Hopefully it works on all platforms now.	2017-01-20 11:55:48 +01:00
Sergey Sharybin	78b94902f8	Cycles: Add fast-math safe isnan and isfinite Currently unused, but might become really handy in the future.	2017-01-19 14:51:11 +01:00
Sergey Sharybin	0ac2be7030	Cycles: Disable AVX2 crash workarounds I can no longer reproduce the crash with either of the files where the crash was originally visible. This is something where other changes (light threshold, sampling) had an effect and made code to work as it is supposed to. Could have been an optimizer issue or something like that. Let's see if we hit same issue again.	2016-12-02 10:17:05 +01:00
Lukas Stockner	26bf230920	Cycles: Add optional probabilistic termination of light samples based on their expected contribution In scenes with many lights, some of them might have a very small contribution to some pixels, but the shadow rays are traced anyways. To avoid that, this patch adds probabilistic termination to light samples - if the contribution before checking for shadowing is below a user-defined threshold, the sample will be discarded with probability (1 - (contribution / threshold)) and otherwise kept, but weighted more to remain unbiased. This is the same approach that's also used in path termination based on length. Note that the rendering remains unbiased with this option, it just adds a bit of noise - but if the setting is used moderately, the speedup gained easily outweighs the additional noise. Reviewers: #cycles Subscribers: sergey, brecht Differential Revision: https://developer.blender.org/D2217	2016-10-30 11:31:28 +01:00
Lukas Stockner	1272ee455e	Cycles: Implement texture coordinates for Point, Spot and Area Lamps When using the Normal output of the Texture Coordinate node on Point and Spot lamps, the coordinates now depend on the rotation of the lamp. On Area lamps, the Parametric output of the Geometry node now returns UV coordinates on the area lamp. Credit for the Area lamp part goes to Stefan Werner (from D1995).	2016-10-29 19:24:08 +02:00
Sergey Sharybin	f11298692b	Cycles: More workarounds for weird crashes on AVX2 Oh man, is it a compiler bug? Is it something we do stupid? For now more crap to prevent crashes. During the conference will talk to Maxym about how can we troubleshoot such weird issues.	2016-10-27 12:51:03 +02:00
Sergey Sharybin	7e380ad4c0	Cycles: Another attempt to fix crashes on AVX2 processors Basically don't use rcp() in areas which seems to be critical after second look. Also disabled some multiplication operators, not sure yet why they might be a problem. Tomorrow will be setting up a full test with all cases which were buggy in our farm to see if this fix is complete.	2016-10-26 22:14:41 +02:00
Sergey Sharybin	af411d918e	Cycles: Implement SSE-optimized path of util_max_axis() The idea here is to avoid if statements which could cause wrong branch prediction. Gives a bit of measurable speedup up to ~1%. Still nice :) Inspired by Maxym Dmytrychenko, thanks!	2016-10-25 13:54:17 +02:00
Sergey Sharybin	0ddb8d9b13	Cycles: Disable optimization of operator / for float3 This was giving some speedup but made intersection tests to fail from watertight point of view. Needs deeper investigation, but need to quickly get it fixed for the studio.	2016-10-14 13:53:26 +02:00
Sergey Sharybin	22cdf44101	Cycles: Use const reference for register variables in non-OpenCL code This is something tested by @LazyDodo and suggested by Maxym to make MSVC happier.	2016-10-12 14:48:59 +02:00
Sergey Sharybin	e588106d45	Cycles: Use more SSE intrinsics for float3 type This gives about 5% speedup on AVX2 kernels (other kernels still have SSE disabled for math operations) and this solves the slowdown of koro scene mentioned in the previous commit. The title says it all actually. This commit also contains changes to pass float3 as const reference in affected functions. This should make MSVC happier without breaking OpenCL because it's only done in areas which are ifdef-ed for non-OpenCL. Another patch based on inspiration from Maxym Dmytrychenko, thanks!	2016-10-12 14:43:00 +02:00
Alexander Gavrilov	a7f6f900f3	Cycles: avoid making NaNs in Vector Math node by normalizing zero vectors. Since inputs are user controlled, the node can't assume they aren't zero.	2016-08-09 13:20:22 +03:00
Sergey Sharybin	6353ecb996	Cycles: Tweaks to support CUDA 8 toolkit All the changes are mainly giving explicit tips on inlining functions, so they match how inlining worked with previous toolkit. This makes kernels compiled by CUDA 8 render on average at the same speed as previous kernels. Some scenes are somewhat faster, some of them are somewhat slower. But slowdown is within 1% so far. On a positive side it allows us to enable newer generation cards on buildbots (so GTX 10x0 will be officially supported soon).	2016-08-01 15:54:29 +02:00
Brecht Van Lommel	9b6ed3a42b	Cycles: refactor kernel closure storage to use structs per closure type. Reviewed By: dingto, sergey Differential Revision: https://developer.blender.org/D2127	2016-07-31 02:34:43 +02:00
Mai Lavelle	c96ae81160	Cycles microdisplacement: ngons and attributes for subdivision meshes This adds support for ngons and attributes on subdivision meshes. Ngons are needed for proper attribute interpolation as well as correct Catmull-Clark subdivision. Several changes are made to achieve this: - new primitive `SubdFace` added to `Mesh` - 3 more textures are used to store info on patches from subd meshes - Blender export uses loop interface instead of tessface for subd meshes - `Attribute` class is updated with a simplified way to pass primitive counts around and to support ngons. - extra points for ngons are generated for O(1) attribute interpolation - curves are temporarily disabled on subd meshes to avoid various bugs with implementation - old unneeded code is removed from `subd/` - various fixes and improvements Reviewed By: brecht Differential Revision: https://developer.blender.org/D2108	2016-07-29 03:36:30 -04:00
Lukas Stockner	23c276832b	Cycles: Add multi-scattering, energy-conserving GGX as an option to the Glossy, Anisotropic and Glass BSDFs This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model". Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until the ray leaves it again, which ensures perfect energy conservation. In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing roughness - is solved in a physically correct and efficient way. The downside of this model is that it has no (known) analytic expression for evaluation. However, it can be evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the balance heuristic guarantee an unbiased result at the cost of slightly higher noise. Reviewers: dingto, #cycles, brecht Reviewed By: dingto, #cycles, brecht Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel Differential Revision: https://developer.blender.org/D2002	2016-06-23 22:57:26 +02:00
Lukas Stockner	7a5a02509b	Cycles: Use faster ray-quad-intersection test The original quad intersection test works by just testing against the two triangles that define the quad. However, in this case it's actually faster to use the same test that's also used for portals: Determining the distance to the plane in which the quad lies, calculating the hitpoint and checking whether it's in the quad by projecting onto the sides. Reviewers: brecht, sergey, dingto Reviewed By: dingto Differential Revision: https://developer.blender.org/D2045	2016-06-06 23:38:50 +02:00
Sergey Sharybin	3165e8740b	Fix T48139: Checker texture strange behavior in cycles Seems particular CUDA implementations have some precision issues, which made integer coordinate (which was expected to always be positive) to go negative.	2016-04-15 15:30:30 +02:00
Sergey Sharybin	b30ab24fb8	Cycles: Avoid re-definition of math constants with MSVC	2016-02-20 14:06:05 +05:00
Lukas Stockner	6995b4d8d9	Cycles: Adding Hilbert Spiral as a tile order for rendering This patch adds the "Hilbert Spiral", a custom-designed continuous space-filling curve, as a tile order for rendering in Cycles. It essentially works by dividing the tiles into tile blocks which are processed in a spiral outwards from the center. Inside each block, the tiles are processed in a regular Hilbert curve pattern. By rotating that pattern according to the spiral direction, a continuous curve is obtained, which helps with cache coherency and therefore rendering speed. The curve is a compromise between the faster-rendering Bottom-to-Top etc. orders and the Center order, which is a bit slower, but starts with the more important areas. The Hilbert Spiral also starts in the center (unless huge tiles are used) and is still marginally slower than Bottom-to-Top, but noticeably faster than Center. Reviewers: sergey, #cycles, dingto Reviewed By: #cycles, dingto Subscribers: iscream, gregzaal, sergey, mib2berlin Differential Revision: https://developer.blender.org/D1166	2016-01-10 00:13:53 +01:00
Lukas Stockner	8512e284a0	Fix T46906: Cycles syntax error while compiling OpenCL kernels The safe normalization was using a float as a condition, now the intended non-zero test is explicit.	2015-12-01 13:53:29 +01:00
Sergey Sharybin	c18e6fd87c	Cycles: Remove 32bit cuda workaround and disable cubins for buildbot Recent changes to kernel broke compilation of the kernels again, need some other kind of solution for this issue. Don't have much time for this currently, but will be addressed before the release. Meanwhile it's better to have some buildbot builds instead of totally failing one.	2015-08-04 18:50:37 +02:00
Sergey Sharybin	7973363e34	Cycles: Final-ish tweaks for 32bit cubin compilation	2015-07-27 16:55:50 +02:00
Sergey Sharybin	61e4800b45	Cycles: One more attempt to fix compilation of 32bit CUDA kernels	2015-07-27 14:18:20 +02:00
Sergey Sharybin	41d817f15d	Fix T44548: Cycles Tube Mapping off / not compatible with BI Was a typo in original implementation, probably a result of some code reshuffle happened for optimization reasons.	2015-04-30 14:27:16 +05:00
Sergey Sharybin	ae7d84dbc1	Cycles: Use native saturate function for CUDA This is more of a workaround for CUDA optimizer which can't optimize clamp(x, 0, 1) into a single instruction and uses 4 instructions instead. Original patch by @lockal with own modification: Don't make changes outside of the kernel. They don't make any difference anyway and term saturate() has a bit different meaning outside of kernel. This gives around 2% of speedup in Barcelona file, but in more complex shader setups with lots of math nodes with clamping speedup could be much nicer. Subscribers: dingto Projects: #cycles Differential Revision: https://developer.blender.org/D1224	2015-04-28 00:38:32 +05:00
Sergey Sharybin	5ff132182d	Cycles: Code cleanup, spaces around keywords This inconsistency drove me totally crazy, it's really confusing when it's inconsistent especially when you work on both Cycles and Blender sides. Shouldn't cause merge PITA, it's whitespace changes only, Git should be able to merge it nicely.	2015-03-28 00:15:15 +05:00
Sergey Sharybin	3e534833e3	Cycles: Make sphere and tube image mapping friendly with OpenCL OpenCL doesn't let you get the address of vector components, which is kinda annoying. On the other hand, maybe now compiler will have more chances to optimize something out.	2015-02-19 12:52:48 +05:00
Sergey Sharybin	bf4c44491a	Cycles: Some more constants fixes for fast math	2015-02-06 15:40:07 +05:00
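
Note on 26bf230920 (probabilistic termination of light samples): the commit message describes discarding a sample with probability (1 - (contribution / threshold)) and weighting the surviving samples more so the result stays unbiased. A minimal sketch of that rule follows. The function name `light_sample_weight` and its parameters are hypothetical and chosen for illustration; this is not the Cycles code.

```cpp
// Hypothetical helper illustrating the rule from the commit message.
//   contribution: estimated contribution of the light sample before the
//                 shadow ray is traced (assumed to be >= 0)
//   threshold:    user-defined threshold
//   rand:         uniform random number in [0, 1)
// Returns the factor to multiply the sample by; 0 means "discarded".
inline float light_sample_weight(float contribution, float threshold, float rand)
{
    // Samples at or above the threshold are always kept unchanged.
    if (threshold <= 0.0f || contribution >= threshold) {
        return 1.0f;
    }
    // Keep with probability p = contribution / threshold,
    // i.e. discard with probability 1 - p.
    const float keep_probability = contribution / threshold;
    if (rand >= keep_probability) {
        return 0.0f;
    }
    // Survivors are scaled by 1 / p, so the expected weight is p * (1 / p) = 1.
    return 1.0f / keep_probability;
}
```

Because the expected weight is 1 for every sample, the estimator stays unbiased; the price is extra variance, which matches the "a bit of noise" remark in the commit message.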
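
Note on a7f6f900f3 and 8512e284a0 (safe normalization): both commits concern normalizing vectors that may be zero, and the second one makes the non-zero test explicit instead of using a float as a condition. A minimal sketch of such a guard follows, assuming a plain `Vec3` struct and a hypothetical function name; returning the input unchanged for a zero vector is one possible choice, not necessarily what Cycles does.

```cpp
#include <cmath>

struct Vec3 {
    float x, y, z;
};

// Hypothetical helper: normalize, but leave zero-length vectors untouched
// so the result never contains NaN from a division by zero.
inline Vec3 safe_normalize_sketch(Vec3 a)
{
    const float len = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
    // Explicit comparison rather than `if (len)`; OpenCL C is stricter
    // about implicit float-to-condition conversions in some contexts.
    if (len != 0.0f) {
        const float inv_len = 1.0f / len;
        return Vec3{a.x * inv_len, a.y * inv_len, a.z * inv_len};
    }
    return a;
}
```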
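
Note on ae7d84dbc1 (native saturate for CUDA): the commit replaces clamp(x, 0, 1) inside the kernel with the native saturate operation. The sketch below shows one way such a wrapper could look. `__saturatef` is the CUDA device intrinsic; the wrapper name and the `__CUDA_ARCH__` dispatch are assumptions made for this example.

```cpp
// Hypothetical wrapper: clamp a value to [0, 1].
inline float saturate_sketch(float a)
{
#ifdef __CUDA_ARCH__
    // Device code: CUDA intrinsic that clamps to [0, 1].
    return __saturatef(a);
#else
    // Host fallback with the same result for non-NaN inputs.
    return (a < 0.0f) ? 0.0f : ((a > 1.0f) ? 1.0f : a);
#endif
}
```

The two paths differ for NaN input: the fallback returns NaN unchanged, so callers that may see NaN should not rely on identical results across devices.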
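
Note on c9451f1cff (safe_logf): the commit message names two hazards in a logarithm with an arbitrary base: log(0) is undefined, and a base of 1 gives log(1) == 0 in the denominator. The sketch below guards both cases. The function name, the argument order, and the choice to return 0 for invalid input are assumptions for illustration, not the actual `safe_logf` signature or behavior.

```cpp
#include <cmath>

// Hypothetical helper: logarithm of `a` in base `b`, returning 0 for
// inputs where the result would be undefined.
inline float safe_log_base_sketch(float a, float b)
{
    // log() is undefined at 0 and for negative arguments.
    if (a <= 0.0f || b <= 0.0f) {
        return 0.0f;
    }
    // log(1) == 0 would put a zero in the denominator.
    if (b == 1.0f) {
        return 0.0f;
    }
    return std::log(a) / std::log(b);
}
```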

121 Commits