blender-archive

Archived

Author	SHA1	Message	Date
Patrick Mours	fd9124ed6b	Fix Cycles volume render differences with NanoVDB when using linear sampling The NanoVDB sampling implementation behaves different from dense texture sampling, so this adds a small offset to the voxel indices to correct for that. Also removes the need to modify the sampling coordinates by moving all the necessary transformations into the image transform. See also T81454.	2020-11-04 15:09:06 +01:00
Patrick Mours	3df90de6c2	Cycles: Add NanoVDB support for rendering volumes NanoVDB is a platform-independent sparse volume data structure that makes it possible to use OpenVDB volumes on the GPU. This patch uses it for volume rendering in Cycles, replacing the previous usage of dense 3D textures. Since it has a big impact on memory usage and performance and changes the OpenVDB branch used for the rest of Blender as well, this is not enabled by default yet, which will happen only after 2.82 was branched off. To enable it, build both dependencies and Blender itself with the "WITH_NANOVDB" CMake option. Reviewed By: brecht Differential Revision: https://developer.blender.org/D8794	2020-10-05 15:03:30 +02:00
Brecht Van Lommel	60e5ebdd62	Fix Cycles CUDA kernels for Ampere not building with CUDA 11 Running Blender on Ampere cards was already possible with ptx, this fix is needed to support building CUDA binaries. Note the CUDA version used for official Blender builds is still 10, this is merely the change to make it possible for those using CUDA 11 and specifying the sm_8x kernels to be compiled. Found by Milan Jaros.	2020-09-30 18:29:42 +02:00
Patrick Mours	49c295813b	Merge branch 'blender-v2.83-release'	2020-05-27 15:31:03 +02:00
Patrick Mours	28d9368538	Fix T76947: Optix realtime denoiser progressively reduces brightness of very bright objects The input data to the OptiX denoiser was clamped to 0..10000 as required, but it could easily exceed that range with a high number of samples (since the data contains the overall sum). To fix that, divide by the number of samples first and multiply it back in after the denoiser ran.	2020-05-27 15:17:47 +02:00
Brecht Van Lommel	d9773edaa3	Cycles: code refactor to bake using regular render session and tiles There should be no user visible change from this, except that tile size now affects performance. The goal here is to simplify bake denoising in D3099, letting it reuse more denoising tiles and pass code. A lot of code is now shared with regular rendering, with the two main differences being that we read some render result passes from the bake API when starting to render a tile, and call the bake kernel instead of the path trace kernel. With this kind of design where Cycles asks for tiles from the bake API, it should eventually be easier to reduce memory usage, show tiles as they are baked, or bake multiple passes at once, though there's still quite some work needed for that. Reviewers: #cycles Subscribers: monio, wmatyjewicz, lukasstockner97, michaelknubben Differential Revision: https://developer.blender.org/D3108	2020-05-15 20:25:24 +02:00
Brecht Van Lommel	006025ead0	Cycles: support for different 3D transform per volume grid This is not yet fully supported by automatic volume bounds but works fine in most cases that will have mostly matching bounds. Ref T73201	2020-03-18 11:23:05 +01:00
Brecht Van Lommel	f01bc597a8	Cleanup: stop encoding image data type in slot index This is legacy code from when we had a fixed number of textures.	2020-03-11 17:07:17 +01:00
Stefan Werner	51e898324d	Adaptive Sampling for Cycles. This feature takes some inspiration from "RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and "A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination" The basic principle is as follows: While samples are being added to a pixel, the adaptive sampler writes half of the samples to a separate buffer. This gives it two separate estimates of the same pixel, and by comparing their difference it estimates convergence. Once convergence drops below a given threshold, the pixel is considered done. When a pixel has not converged yet and needs more samples than the minimum, its immediate neighbors are also set to take more samples. This is done in order to more reliably detect sharp features such as caustics. A 3x3 box filter that is run periodically over the tile buffer is used for that purpose. After a tile has finished rendering, the values of all passes are scaled as if they were rendered with the full number of samples. This way, any code operating on these buffers, for example the denoiser, does not need to be changed for per-pixel sample counts. Reviewed By: brecht, #cycles Differential Revision: https://developer.blender.org/D4686	2020-03-05 12:21:38 +01:00
Patrick Mours	38589de10c	Cycles: Add support for denoising in the viewport The OptiX denoiser can be a great help when rendering in the viewport, since it is really fast and needs few samples to produce convincing results. This patch therefore adds support for using any Cycles denoiser in the viewport also (but only the OptiX one is selectable because the NLM one is too slow to be usable currently). It also adds support for denoising on a different device than rendering (so one can e.g. render with the CPU but denoise with OptiX). Reviewed By: #cycles, brecht Differential Revision: https://developer.blender.org/D6554	2020-02-11 18:03:43 +01:00
Patrick Mours	d5ca72191c	Cycles: Add OptiX AI denoiser support This patch adds support for the OptiX denoiser as an alternative to the existing NLM denoiser in Cycles. It's re-using the same denoising architecture based on tiles and therefore implicitly also works with multiple GPUs. Reviewed By: sergey Differential Revision: https://developer.blender.org/D6395	2020-01-08 16:53:11 +01:00
Campbell Barton	c47d669f24	Cleanup: comments (long lines) in cycles	2019-05-01 21:41:07 +10:00
Campbell Barton	e12c08e8d1	ClangFormat: apply to source, most of intern Apply clang format as proposed in T53211. For details on usage and instructions for migrating branches without conflicts, see: https://wiki.blender.org/wiki/Tools/ClangFormat	2019-04-17 06:21:24 +02:00
Lukas Stockner	fccf506ed7	Cycles: animation denoising support in the kernel. This is the internal implementation, not available from the API or interface yet. The algorithm takes into account past and future frames, both to get more coherent animation and reduce noise. Ref D3889.	2019-02-06 15:18:42 +01:00
Lukas Stockner	405cacd4cd	Cycles: prefilter feature passes separate from denoising. Prefiltering of feature passes will happen during rendering, which can then be used for denoising immediately or written as a render pass for later (animation) denoising. The number of denoising data passes written is reduced because of this, leaving out the feature variance passes. The passes are now Normal, Albedo, Depth, Shadowing, Variance and Intensity. Ref D3889.	2019-02-06 15:18:29 +01:00
Brecht Van Lommel	b14ec18601	Cycles: add initial CUDA 10.0 support, but only recommend use for Turing cards. There may still be rendering errors when used for older graphics cards.	2018-12-04 16:03:18 +01:00
Stefan Werner	e58c6cf0c6	Cycles: Added Cryptomatte output. This allows for extra output passes that encode automatic object and material masks for the entire scene. It is an implementation of the Cryptomatte standard as introduced by Psyop. A good future extension would be to add a manifest to the export and to do plenty of testing to ensure that it is fully compatible with other renderers and compositing programs that use Cryptomatte. Internally, it adds the ability for Cycles to have several passes of the same type that are distinguished by their name. Differential Revision: https://developer.blender.org/D3538	2018-10-28 05:37:41 -04:00
Lukas Stockner	15e9d80375	Cycles: Use existing shared temporary memory in reconstruction step of the denoiser Previously the code allocated its own temporary memory, but it's possible to just use the existing shared one instead.	2018-10-08 22:13:40 +02:00
Campbell Barton	1daa20ad9f	Cleanup: strip trailing space for cycles	2018-07-06 10:17:58 +02:00
Stefan Werner	4d00e95ee3	Cycles: Adding native support for UINT16 textures. Textures in 16 bit integer format are sometimes used for displacement, bump and normal maps and can be exported by tools like Substance Painter. Without this patch, Cycles would promote those textures to single precision floating point, causing them to take up twice as much memory as needed. Reviewers: #cycles, brecht, sergey Reviewed By: #cycles, brecht, sergey Subscribers: sergey, dingto, #cycles Tags: #cycles Differential Revision: https://developer.blender.org/D3523	2018-07-05 13:53:34 +02:00
Lukas Stockner	9db8bdbc65	Cycles Denoising: Cleanup: Rename tiles to tile_info	2018-07-04 14:37:24 +02:00
Lukas Stockner	3ee606621c	Cycles: Query XYZ to/from Scene Linear conversion from OCIO instead of assuming sRGB I've limited it to just the RGB<->XYZ stuff for now, correct image handling is the next step. Reviewers: brecht, sergey Differential Revision: https://developer.blender.org/D3478	2018-06-14 22:21:37 +02:00
Thomas Dinges	9e717c0495	Cycles: Remove Fermi texture code. This should be the last Fermi removal commit, unless I missed something. It's been a pleasure Fermi!	2018-02-17 22:56:58 +01:00
Thomas Dinges	e1ef902058	Cycles: Remove fermi related defines from the code. Did not touch Texture related defines, that comes next.	2018-02-17 22:19:54 +01:00
Lukas Stockner	fa3d50af95	Cycles: Improve denoising speed on GPUs with small tile sizes Previously, the NLM kernels would be launched once per offset with one thread per pixel. However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown. Therefore, the kernels are now launched in a single call that handles all offsets at once. This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory. On the other hand, of course, the smaller tiles significantly reduce the size of the memory. The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum. I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere. To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now. Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.	2017-11-30 07:37:08 +01:00
Stefan Werner	58a15b2bfe	Cycles: Fixed compilation of CUDA kernels. Follow-up fix for my last commit.	2017-11-21 10:43:40 +01:00
Stefan Werner	1febc85855	Cycles: Workaround for performance loss with the CUDA 9.0 SDK. CUDA 9.0.176 apparently caused some slow down on high-end Pascal cards that can be mitigated by increasing the number of registers. See https://developer.blender.org/F1142667 for a detailed comparison.	2017-11-21 10:29:11 +01:00
Brecht Van Lommel	2e50add164	Fix OpenCL performance regression after cubic interpolation. Reorganize code to reduce register pressure.	2017-10-15 17:46:50 +02:00
Sergey Sharybin	8d73ba58b6	Cycles: Fix compilation of sm_20 and sm_21 kernels Was broken since the bicubic commit for GPU support.	2017-10-10 12:26:02 +05:00
Brecht Van Lommel	2d92988f6b	Cycles: CUDA bicubic and tricubic texture interpolation support. While cubic interpolation is quite expensive on the CPU compared to linear interpolation, the difference on the GPU is quite small.	2017-10-07 15:30:57 +02:00
Brecht Van Lommel	23098cda99	Code refactor: make texture code more consistent between devices. * Use common TextureInfo struct for all devices, except CUDA fermi. * Move image sampling code to kernels//kernel__image.h files. * Use arrays for data textures on Fermi too, so device_vector<Struct> works.	2017-10-07 14:53:14 +02:00
Brecht Van Lommel	fb99ea79f8	Code refactor: split displace/background into separate kernels, remove luma.	2017-10-05 17:57:58 +02:00
Brecht Van Lommel	6da6f8d33f	Cycles: CUDA faster rendering of small tiles, using multiple samples like OpenCL. The work size is still very conservative, and this doesn't help for progressive refine. For that we will need to render multiple tiles at the same time. But this should already help for denoising renders that require too much memory with big tiles, and just generally soften the performance dropoff with small tiles. Differential Revision: https://developer.blender.org/D2856	2017-10-04 21:58:47 +02:00
Brecht Van Lommel	12f4538205	Code refactor: use split variance calculation for mega kernels too. There is no significant difference in denoised benchmark scenes and denoising ctests, so might as well make it all consistent.	2017-10-04 21:11:14 +02:00
Brecht Van Lommel	e3e16cecc4	Code refactor: remove rng_state buffer and compute hash on the fly. A little faster on some benchmark scenes, a little slower on others, seems about performance neutral on average and saves a little memory.	2017-10-04 21:11:14 +02:00
Brecht Van Lommel	5b7d6ea54b	Code refactor: add WorkTile struct for passing work to kernel. This makes sharing some code between mega/split in following commits a bit easier, and also paves the way for rendering multiple tiles later.	2017-10-04 21:11:14 +02:00
Brecht Van Lommel	45dcd20ca9	Cycles: CUDA split performance tweaks, still far from megakernel. On Pabellon, 25.8s mega, 35.4s split before, 32.7s split after.	2017-08-05 14:32:59 +02:00
Mai Lavelle	ea846a4dfc	Cycles: Add kernel to enqueue inactive rays The queue will be used to make reuse of inactive threads to keep the GPU more busy.	2017-06-10 03:51:18 -04:00
Lukas Stockner	705c43be0b	Cycles Denoising: Merge outlier heuristic and confidence interval test The previous outlier heuristic only checked whether the pixel is more than twice as bright compared to the 75% quantile of the 5x5 neighborhood. While this detected fireflies robustly, it also incorrectly marked a lot of legitimate small highlights as outliers and filtered them away. This commit adds an additional condition for marking a pixel as a firefly: In addition to being above the reference brightness, the lower end of the 3-sigma confidence interval has to be below it. Since the lower end approximates how low the true value of the pixel might be, this test separates pixels that are supposed to be very bright from pixels that are very bright due to random fireflies. Also, since there is now a reliable outlier filter as a preprocessing step, the additional confidence interval test in the reconstruction kernel is no longer needed.	2017-06-09 03:46:11 +02:00
Sergey Sharybin	90a62404cb	Cycles: Cleanup, variable names Don't use camel case for variable names. Leave that for the structures.	2017-05-19 12:52:12 +02:00
Sergey Sharybin	de86da521c	Cycles: Cleanup, braces after function definition I wouldn't mind switching fully to Google style, but i am against of mixing two different styles in same project. So just stick to brace at the new line after function definition.	2017-05-19 12:43:26 +02:00
Sergey Sharybin	803337f3f6	\0;115;0cCycles: Cleanup, use ccl_restrict instead of ccl_restrict_ptr There were following issues with ccl_restrict_ptr: - We already had ccl_restrict for all platforms. - It was secretly adding `const` qualifier to the declaration, which is quite weird since non-const pointer can also be declared as restricted. - We never in Blender are using foo_ptr or FooPtr type definitions, so not sure why we should introduce such a thing here. - It is absolutely wrong from semantic point of view to put pointer into the restrict macro -- const is a part of type, not part of hint for compiler that some pointer is never aliased.	2017-05-19 12:41:03 +02:00
Lukas Stockner	740cd28748	Cycles Denoising: Add more robust outlier heuristic to avoid artifacts Extremely bright pixels in the rendered image cause the denoising algorithm to produce extremely noticable artifacts. Therefore, a heuristic is needed to exclude these pixels from the filtering process. The new approach calculates the 75% percentile of the 5x5 neighborhood of each pixel and flags the pixel if it is more than twice as bright. During the reconstruction process, flagged pixels are skipped. Therefore, they don't cause any problems for neighboring pixels, and the outlier pixels themselves are replaced by a prediction of their actual value based on their feature pass values and the neighboring pixels. Therefore, the denoiser now also works as a smarter despeckling filter that uses a more accurate prediction of the pixel instead of a simple average. This can be used even if denoising isn't wanted by setting the denoising radius to 1.	2017-05-18 21:55:56 +02:00
Lukas Stockner	43b374e8c5	Cycles: Implement denoising option for reducing noise in the rendered image This commit contains the first part of the new Cycles denoising option, which filters the resulting image using information gathered during rendering to get rid of noise while preserving visual features as well as possible. To use the option, enable it in the render layer options. The default settings fit a wide range of scenes, but the user can tweak individual settings to control the tradeoff between a noise-free image, image details, and calculation time. Note that the denoiser may still change in the future and that some features are not implemented yet. The most important missing feature is animation denoising, which uses information from multiple frames at once to produce a flicker-free and smoother result. These features will be added in the future. Finally, thanks to all the people who supported this project: - Google (through the GSoC) and Theory Studios for sponsoring the development - The authors of the papers I used for implementing the denoiser (more details on them will be included in the technical docs) - The other Cycles devs for feedback on the code, especially Sergey for mentoring the GSoC project and Brecht for the code review! - And of course the users who helped with testing, reported bugs and things that could and/or should work better!	2017-05-07 14:40:58 +02:00
Hristo Gueorguiev	6bf4115c13	Cycles: Split kernel - sort shaders Reduce thread divergence in kernel_shader_eval. Rays are sorted in blocks of 2048 according to shader->id. On R9 290 Classroom is ~30% faster, and Pabellon Barcelone is ~8% faster. No sorting for CUDA split kernel. Reviewers: sergey, maiself Reviewed By: maiself Differential Revision: https://developer.blender.org/D2598	2017-05-03 15:30:45 +02:00
Mai Lavelle	915766f42d	Cycles: Branched path tracing for the split kernel This implements branched path tracing for the split kernel. General approach is to store the ray state at a branch point, trace the branched ray as normal, then restore the state as necessary before iterating to the next part of the path. A state machine is used to advance the indirect loop state, which avoids the need to add any new kernels. Each iteration the state machine recreates as much state as possible from the stored ray to keep overall storage down. Its kind of hard to keep all the different integration loops in sync, so this needs lots of testing to make sure everything is working correctly. We should probably start trying to deduplicate the integration loops more now. Nonbranched BMW is ~2% slower, while classroom is ~2% faster, other scenes could use more testing still. Reviewers: sergey, nirved Reviewed By: nirved Subscribers: Blendify, bliblubli Differential Revision: https://developer.blender.org/D2611	2017-05-02 14:26:46 -04:00
Sergey Sharybin	0579eaae1f	Cycles: Make all #include statements relative to cycles source directory The idea is to make include statements more explicit and obvious where the file is coming from, additionally reducing chance of wrong header being picked up. For example, it was not obvious whether bvh.h was refferring to builder or traversal, whenter node.h is a generic graph node or a shader node and cases like that. Surely this might look obvious for the active developers, but after some time of not touching the code it becomes less obvious where file is coming from. This was briefly mentioned in T50824 and seems @brecht is fine with such explicitness, but need to agree with all active developers before committing this. Please note that this patch is lacking changes related on GPU/OpenCL support. This will be solved if/when we all agree this is a good idea to move forward. Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner Reviewed By: lukasstockner97, maiself, nirved, dingto Subscribers: brecht Differential Revision: https://developer.blender.org/D2586	2017-03-29 13:41:11 +02:00
Sergey Sharybin	1cad64900e	Cycles: Define ccl_local variables in kernel functions Declaring ccl_local in a device function is not supported by certain compilers.	2017-03-16 11:27:17 +01:00
Mai Lavelle	96868a3941	Fix T50888: Numeric overflow in split kernel state buffer size calculation Overflow led to the state buffer being too small and the split kernel to get stuck doing nothing forever.	2017-03-11 05:39:28 -05:00
Mai Lavelle	4a2cde3f0e	Cycles: Enable SSS and volumes for CUDA and Nvidia OpenCL split kernel	2017-03-10 02:09:41 -05:00

1 2

Download

What's New

Blender Studio

Manual

Developers Blog

Documentation

Benchmark

Blender Conference

Development Fund

One-time Donations

64 Commits