blender-archive

Archived

Author	SHA1	Message	Date
Michael Jones	654e1e901b	Cycles: Use local atomics for faster shader sorting (enabled on Metal) This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high". Reviewed By: brecht Differential Revision: https://developer.blender.org/D16909	2023-02-06 11:18:26 +00:00
Campbell Barton	66dee44088	CMake: quiet references to undeclared variable warnings These warnings can reveal errors in logic, so quiet them by checking if the features are enabled before using variables or by assigning empty strings in some cases. - Check CMAKE_THREAD_LIBS_INIT is set before use as CMake docs note that this may be left unset if it's not needed. - Remove BOOST/OPENVDB/VULKAN references when disabled. - Define INC_SYS even when empty. - Remove PNG_INC from freetype (not defined anywhere).	2023-01-19 17:10:42 +11:00
Michael Jones	77c3e67d3d	Cycles: Improved render start/stop responsiveness on Metal All kernel specialisation is now performed in the background regardless of kernel type, meaning that the first render will be visible a few seconds sooner. The only exception is during benchmark warm up, in which case we wait for all kernels to be cached. When stopping a render, we call a new `cancel()` method on the device which causes any outstanding compilation work to be cancelled, and we destroy the device in a detached thread so that any stale queued compilations can be safely purged without blocking the UI for longer than necessary. Reviewed By: brecht Differential Revision: https://developer.blender.org/D16371	2023-01-04 16:00:53 +00:00
Chris Blackbourn	60523ea523	Cleanup: format	2022-11-16 12:59:47 +13:00
Patrick Mours	a859837cde	Cleanup: Move OptiX denoiser code from device into denoiser class Cycles already treats denoising fairly separate in its code, with a dedicated `Denoiser` base class used to describe denoising behavior. That class has been fully implemented for OIDN (`denoiser_oidn.cpp`), but for OptiX was mostly empty (`denoiser_optix.cpp`) and denoising was instead implemented in the OptiX device. That meant denoising code was split over various files and directories, making it a bit awkward to work with. This patch moves the OptiX denoising implementation into the existing `OptiXDenoiser` class, so that everything is in one place. There are no functional changes, code has been mostly moved as-is. To retain support for potential other denoiser implementations based on a GPU device in the future, the `DeviceDenoiser` base class was kept and slightly extended (and its file renamed to `denoiser_gpu.cpp` to follow similar naming rules as `path_trace_work_*.cpp`). Differential Revision: https://developer.blender.org/D16502	2022-11-15 15:50:01 +01:00
Campbell Barton	afc091c3c4	Cleanup: spelling in comments	2022-11-01 12:24:58 +11:00
Michael Jones (Apple)	8dd7b5b26b	Cycles: Metal integrator state size tuning This patch tunes the integrator state sizing for Metal (`num_concurrent_states` and `num_concurrent_busy_states`). On all GPU architectures, we adjust the busy:total states ratio to be 1:4 which gives better rendering performance than the previous 1:16 ratio (independent of total state count). This gives a small performance uplift (e.g. 2-3% on M1 Ultra). Additionally for M2 architectures, we double the overall state size if there is available headroom. Inclusive of the first change, we can expect uplift of close to 10% in future, as this results in larger dispatch sizes and minimises work submission overheads. In order to make an accurate determination of available headroom, we defer the calculation of `num_concurrent_states` and `num_concurrent_busy_states` until the time of integrator state allocation (i.e. after all of the scene data has been allocated). We also refactor `alloc_integrator_soa` to calculate an exact single-state-size in a first pass, right before allocating the integrator SoA buffers in a second pass. Reviewed By: brecht Differential Revision: https://developer.blender.org/D16313	2022-10-24 17:14:33 +01:00
Sebastian Herholz	2006c3ed10	Fix T101529: Blender crashes when using Path Guiding	2022-10-18 13:59:12 +02:00
Lukas Stockner	95aac5df73	Fix T101651: Cycles crashes when failing to initialize render device The issue here was that PathTraceWork was set up before checking if any error occurred, and it didn't account for the dummy device so it called a non-implemented function. This fix therefore avoids creating PathTraceWork for dummy devices and checks for device creation errors earlier in the process.	2022-10-10 17:55:08 +02:00
Campbell Barton	210f4db81c	Cleanup: spelling in comments	2022-10-10 11:22:41 +11:00
Campbell Barton	6d1d1bf2b1	Cleanup: spelling in comments Also add missing task ID.	2022-09-28 09:41:31 +10:00
Hans Goudey	b145cc9d36	Cleanup: Unused variable warning with path guiding turned off	2022-09-27 15:00:37 -05:00
Sebastian Herhoz	75a6d3abf7	Cycles: add Path Guiding on CPU through Intel OpenPGL This adds path guiding features into Cycles by integrating Intel's Open Path Guiding Library. It can be enabled in the Sampling > Path Guiding panel in the render properties. This feature helps reduce noise in scenes where finding a path to light is difficult for regular path tracing. The current implementation supports guiding directional sampling decisions on surfaces, when the material contains at least one diffuse component, and in volumes with isotropic and anisotropic Henyey-Greenstein phase functions. On surfaces, the guided sampling decision is proportional to the product of the incident radiance and the normal-oriented cosine lobe and in volumes it is proportional to the product of the incident radiance and the phase function. The incident radiance field of a scene is learned and updated during rendering after each per-frame rendering iteration/progression. At the moment, path guiding is only supported by the CPU backend. Support for GPU backends will be added in future versions of OpenPGL. Ref T92571 Differential Revision: https://developer.blender.org/D15286	2022-09-27 15:56:32 +02:00
Brecht Van Lommel	3a605b23d0	Fix T100708: Cycles bake of diffuse/glossy color not outputting alpha	2022-08-31 20:51:50 +02:00
Brecht Van Lommel	6a4f4810f3	Fix T100246: Cycles GPU render error when adding AO node during viewport render	2022-08-18 20:04:22 +02:00
Patrick Mours	515a15f200	Fix syntax error introduced in previous commit	2022-08-12 16:13:09 +02:00
Patrick Mours	79787bf8e1	Cycles: Improve denoiser update performance when rendering with multiple GPUs This patch causes the render buffers to be copied to the denoiser device only once before denoising and output/display is then fed from that single buffer on the denoiser device. That way usually all but one copy (from all the render devices to the denoiser device) can be eliminated, provided that the denoiser device is also the display device (in which case interop is used to update the display). As such this patch also adds some logic that tries to ensure the chosen denoiser device is the same as the display device. Differential Revision: https://developer.blender.org/D15657	2022-08-12 16:00:54 +02:00
Brecht Van Lommel	523bbf7065	Cycles: generalize shader sorting / locality heuristic to all GPU devices This was added for Metal, but also gives good results with CUDA and OptiX. Also enable it for future Apple GPUs instead of only M1 and M2, since this has been shown to help across multiple GPUs so the better bet seems to enable rather than disable it. Also moves some of the logic outside of the Metal device code, and always enables the code in the kernel since other devices don't do dynamic compile. Time per sample with OptiX + RTX A6000 (scene: new / old): barbershop_interior: 0.0730s / 0.0727s; bmw27: 0.0047s / 0.0053s; classroom: 0.0428s / 0.0464s; fishy_cat: 0.0102s / 0.0108s; junkshop: 0.0366s / 0.0395s; koro: 0.0567s / 0.0578s; monster: 0.0206s / 0.0223s; pabellon: 0.0158s / 0.0174s; sponza: 0.0088s / 0.0100s; spring: 0.1267s / 0.1280s; victor: 0.0524s / 0.0531s; wdas_cloud: 0.0817s / 0.0816s. Ref D15331, T87836	2022-07-15 13:42:47 +02:00
Michael Jones (Apple)	4b1d315017	Cycles: Improve cache usage on Apple GPUs by chunking active indices This patch partitions the active indices into chunks prior to sorting by material in order to tradeoff some material coherence for better locality. On Apple Silicon GPUs (particularly higher end M1-family GPUs), we observe overall render time speedups of up to 15%. The partitioning is implemented by repeating the range of `shader_sort_key` for each partition, and encoding a "locator" key which distributes the indices into sorted chunks. Reviewed By: brecht Differential Revision: https://developer.blender.org/D15331	2022-07-14 14:26:18 +01:00
Xavier Hallade	a02992f131	Cycles: Add support for rendering on Intel GPUs using oneAPI This patch adds a new Cycles device with similar functionality to the existing GPU devices. Kernel compilation and runtime interaction happen via oneAPI DPC++ compiler and SYCL API. This implementation is primarily focusing on Intel® Arc™ GPUs and other future Intel GPUs. The first supported drivers are 101.1660 on Windows and 22.10.22597 on Linux. The necessary tools for compilation are: - A SYCL compiler such as oneAPI DPC++ compiler or https://github.com/intel/llvm - Intel® oneAPI Level Zero which is used for low level device queries: https://github.com/oneapi-src/level-zero - To optionally generate prebuilt graphics binaries: Intel® Graphics Compiler All are included in Linux precompiled libraries on svn: https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for Windows precompiled binaries but for the graphics compiler, available as "Intel® Graphics Offline Compiler for OpenCL™ Code" from https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html, for which path can be set as OCLOC_INSTALL_DIR. Being based on the open SYCL standard, this implementation could also be extended to run on other compatible non-Intel hardware in the future. Reviewed By: sergey, brecht Differential Revision: https://developer.blender.org/D15254 Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com> Co-authored-by: Stefan Werner <stefan.werner@intel.com>	2022-06-29 12:58:04 +02:00
Brecht Van Lommel	ff1883307f	Cleanup: renaming and consistency for kernel data * Rename "texture" to "data array". This has not used textures for a long time, there are just global memory arrays now. (On old CUDA GPUs there was a cache for textures but not global memory, so we used to put all data in textures.) * For CUDA and HIP, put globals in KernelParams struct like other devices. * Drop __ prefix for data array names, no possibility for naming conflict now that these are in a struct.	2022-06-20 12:30:48 +02:00
Brecht Van Lommel	2c1bffa286	Cleanup: add verbose logging category names instead of numbers And use them more consistently than before.	2022-06-17 14:08:14 +02:00
Brecht Van Lommel	f2cd7e08fe	Fix Cycles MNEE not working for Metal Move MNEE to own kernel, separate from shader ray-tracing. This does introduce the limitation that a shader can't use both MNEE and AO/bevel, but that seems like the better trade-off for now. We can experiment with bigger kernel organization changes later. Differential Revision: https://developer.blender.org/D15070	2022-05-31 17:24:43 +02:00
Sybren A. Stüvel	3e782bba71	Cleanup: Cycles, avoid 'parameter unused' warning Avoid 'parameter unused' warning when building Cycles without OpenImageDenoise. No functional changes. Over-the-shoulder reviewed by @sergey	2022-05-11 18:00:49 +02:00
Brecht Van Lommel	ac9ebc9de3	Fix Cycles division by zero in material preview render If the render gets cancelled before the first sample finishes.	2022-05-04 20:01:04 +02:00
Brecht Van Lommel	0c317e23bf	Cleanup: fix various Cycles build warnings with non-default options * Float/double promotion warnings were mainly meant for avoiding slow operations in the kernel. Limit it to that to avoid hard to fix warnings in Hydra. * Const warnings in Hydra iterators. * Unused variable warnings when building without glog. * Wrong camera enum comparisons in assert. * PASS_UNUSED is not a pass type, only for pass offsets.	2022-04-29 17:39:04 +02:00
Brecht Van Lommel	2cb76a6c8d	Cleanup: consistently use parallel_for without tbb namespace in Cycles	2022-04-18 19:14:36 +02:00
Brecht Van Lommel	41b3feea85	Fix Cycles build error with latest TBB after recent changes From changes in `869a46df29`, ref D14454	2022-04-18 18:49:35 +02:00
Michael Jones	869a46df29	Cycles fp consistency for Apple Silicon CPUs Propagate the fp settings from the main thread to all the worker threads (the fp settings include the FZ settings among other things). This guarantees consistent execution of floating point math regardless of whether it's executed in a tbb thread arena or on the main thread. Add FZ mode to arm64/aarch64 in parallel to the way it's been done on intel processors. Currently, compiling for the arm target does not set this mode at all, hence it potentially runs slower and with possible results mismatch with intel x86. Reviewed By: brecht Differential Revision: https://developer.blender.org/D14454	2022-04-12 19:43:47 +01:00
Lukas Stockner	ad35453cd1	Cycles: Add support for light groups Light groups are a type of pass that only contains lighting from a subset of light sources. They are created in the View layer, and light sources (lamps, objects with emissive materials and/or the environment) can be assigned to a group. Currently, each light group ends up generating its own version of the Combined pass. In the future, additional types of passes (e.g. shadowcatcher) might be getting their own per-lightgroup versions. The lightgroup creation and assignment is not Cycles-specific, so Eevee or external render engines could make use of it in the future. Note that Lightgroups are identified by their name - therefore, the name of the Lightgroup in the View Layer and the name that's set in an object's settings must match for it to be included. Currently, changing a Lightgroup's name does not update objects - this is planned for the future, along with other features such as denoising for light groups and viewing them in preview renders. Original patch by Alex Fuller (@mistaed), with some polishing by Lukas Stockner (@lukasstockner97). Differential Revision: https://developer.blender.org/D12871	2022-04-02 06:14:27 +02:00
Patrick Mours	d350976ba0	Cycles: Add Hydra render delegate This patch adds a Hydra render delegate to Cycles, allowing Cycles to be used for rendering in applications that provide a Hydra viewport. The implementation was written from scratch against Cycles X, for integration into the Blender repository to make it possible to continue developing it in step with the rest of Cycles. For this purpose it follows the style of the rest of the Cycles code and can be built with a CMake option (`WITH_CYCLES_HYDRA_RENDER_DELEGATE=1`) similar to the existing standalone version of Cycles. Since Hydra render delegates need to be built against the exact USD version and other dependencies as the target application is using, this is intended to be built separate from Blender (`WITH_BLENDER=0` CMake option) and with support for library versions different from what Blender is using. As such the CMake build scripts for Windows had to be modified slightly, so that the Cycles Hydra render delegate can e.g. be built with MSVC 2017 again even though Blender requires MSVC 2019 now, and it's possible to specify custom paths to the USD SDK etc. The codebase supports building against the latest USD release 22.03 and all the way back to USD 20.08 (with some limitations). Reviewed By: brecht, LazyDodo Differential Revision: https://developer.blender.org/D14398	2022-03-23 16:39:05 +01:00
Aaron Carlisle	91dbc28363	Cleanup: clang format	2022-03-13 00:49:41 -05:00
Brecht Van Lommel	62a0984d72	Cleanup: fix source typos homogenous->homogeneous Contributed by luzpaz. Differential Revision: https://developer.blender.org/D14306	2022-03-11 18:27:58 +01:00
Campbell Barton	8b06c524d2	Cleanup: spelling in comments, function name	2022-03-04 10:31:11 +11:00
Campbell Barton	66c0fe5b23	Cleanup: correction to repeated word removal & correct spelling	2022-02-23 20:47:14 +11:00
Campbell Barton	7393cc1db7	Cleanup: Remove repeated word in comments	2022-02-23 18:24:37 +11:00
Sergey Sharybin	303b566b10	Merge branch 'blender-v3.1-release'	2022-02-18 15:32:24 +01:00
Sergey Sharybin	e4b7d52fe4	Fix graphics interop resources leak in Cycles When new display driver is given to the PathTrace ensure that there are no GPU resources used from it by the work. This solves graphics interop descriptors leak. This also fixes Invalid graphics context in cuGraphicsUnregisterResource error when doing final render on the display GPU. Fixes T95837: Regression: GPU memory accumulation in Cycles render Fixes T95733: Cycles Cuda/Optix error message with multi GPU devices. (Invalid graphics context in cuGraphicsUnregisterResource) Fixes T95651: GPU error (Invalid graphics context in cuGraphicsUnregisterResource) Fixes T95631: VRAM is not being freed when rendering (Invalid graphics context in cuGraphicsUnregisterResource) Fixes T89747: Cycles Render - Textures Disappear then Crashes the Render Maniphest Tasks: T95837, T95733, T95651, T95631, T89747 Differential Revision: https://developer.blender.org/D14146	2022-02-18 15:26:15 +01:00
Brecht Van Lommel	9cfc7967dd	Cycles: use SPDX license headers * Replace license text in headers with SPDX identifiers. * Remove specific license info from outdated readme.txt, instead leave details to the source files. * Add list of SPDX license identifiers used, and corresponding license texts. * Update copyright dates while we're at it. Ref D14069, T95597	2022-02-11 17:47:34 +01:00
Campbell Barton	c434782e3a	File headers: SPDX License migration Use a shorter/simpler license convention, stops the header taking so much space. Follow the SPDX license specification: https://spdx.org/licenses - C/C++/objc/objc++ - Python - Shell Scripts - CMake, GNUmakefile While most of the source tree has been included - `./extern/` was left out. - `./intern/cycles` & `./intern/atomic` are also excluded because they use different header conventions. doc/license/SPDX-license-identifiers.txt has been added to list SPDX all used identifiers. See P2788 for the script that automated these edits. Reviewed By: brecht, mont29, sergey Ref D14069	2022-02-11 09:14:36 +11:00
Campbell Barton	012e41fc8b	Cleanup: use our own conventions for tags in comments	2022-01-31 10:49:59 +11:00
Sergey Sharybin	430f71fce2	Fix insufficient CPU flags checks for Cycles OIDN Sometime throughout development some checks got lost during refactor. This change makes it so that if OIDN is not supported on the current CPU Cycles will report an error and stop rendering. This behavior is similar to when an OptiX denoiser is requested and there is no OptiX compatible device available. The easiest way to verify this change is to force return false from the `openimagedenoise_supported()`. Fixes Cycles part of the T94127. Differential Revision: https://developer.blender.org/D13944	2022-01-28 14:28:04 +01:00
Brecht Van Lommel	04c3b08518	Fix T94355: Cycles wrong GPU bake with adaptive sampling	2022-01-24 19:18:11 +01:00
Brecht Van Lommel	1ac2d2dcb6	Fix T93711: Cycles diffuse/glossy baking does not write alpha With the change to use render passes internally the alpha channel got lost. Add support for these render passes to output an alpha channel for baking.	2022-01-20 22:32:35 +01:00
Campbell Barton	74c896c081	Cleanup: typos in comments, remove libnumaapi reference	2022-01-10 13:47:12 +11:00
Brecht Van Lommel	ae28d90578	Fix T93350: Cycles renders shows black during rendering huge resolutions The root of the issue is caused by Cycles ignoring OpenGL limitation on the maximum resolution of textures: Cycles was allocating texture of the final render resolution. It was exceeding limitation on certain GPUs and drivers. The idea is simple: use multiple textures for the display, each of which will fit into OpenGL limitations. There is some code which allows the display driver to know when to start the new tile. Also added some code to allow force graphics interop to be re-created. The latter one ended up not used in the final version of the patch, but it might be helpful for other driver implementations. The tile size is limited to 8K now as it is the safest size for textures on many GPUs and OpenGL drivers. This is an updated fix with a workaround for freezing with the NVIDIA driver on Linux. Differential Revision: https://developer.blender.org/D13385	2022-01-07 17:20:04 +01:00
Brecht Van Lommel	3a4952e7c2	Fix Cycles updating display unnecessarily when stopping 3D viewport Debug code accidentally committed in `466b50d`. This was found while investigating issues with D13385.	2022-01-06 19:10:50 +01:00
Brecht Van Lommel	2229179faa	Revert "Cycles-X: Add hysteresis to resolution divider algorithm" This reverts commit `d8b4275162`. It causes reduced viewport render resolution. Revert for now until I have time to look into this more closely.	2021-12-16 18:29:27 +01:00
Andrii	b8f41825e8	Fix Cycles wrong adaptive sampling render when using sample offset Sample offset was not accounted for in the adaptive sampling code and caused issues, like immediately applied adaptive filtering, with non-zero values. Differential Revision: https://developer.blender.org/D13510	2021-12-09 20:54:41 +01:00
Alaska	d8b4275162	Cycles-X: Add hysteresis to resolution divider algorithm Adds hysteresis to the resolution divider algorithm to avoid having the resolution bounce around when on the boundary of two resolutions. Reviewed By: brecht, leesonw Differential Revision: https://developer.blender.org/D12385	2021-12-09 09:18:47 +01:00
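
Note on commit 4b1d315017 (partitioned shader sorting): the message describes repeating the `shader_sort_key` range once per partition and adding a "locator" so indices land in sorted chunks. The snippet below is only a minimal sketch of that idea, not the Cycles kernel code. The names `partitioned_sort_key`, `sort_active_indices`, `partition_size` and `num_shaders` are hypothetical, and the key layout (partition index times shader count, plus shader key) is an assumption made for illustration.

```python
from typing import List


def partitioned_sort_key(state_index: int, shader_key: int,
                         partition_size: int, num_shaders: int) -> int:
    """Combine a partition ("locator") index with the shader sort key.

    Assumed layout: every partition owns its own copy of the range
    [0, num_shaders), so keys from different partitions never interleave.
    """
    partition = state_index // partition_size
    return partition * num_shaders + shader_key


def sort_active_indices(shader_keys: List[int], partition_size: int,
                        num_shaders: int) -> List[int]:
    """Stable counting sort of state indices by the partitioned key."""
    num_states = len(shader_keys)
    num_partitions = (num_states + partition_size - 1) // partition_size
    num_keys = num_partitions * num_shaders

    # Pass 1: histogram of partitioned keys.
    counts = [0] * num_keys
    for i, s in enumerate(shader_keys):
        counts[partitioned_sort_key(i, s, partition_size, num_shaders)] += 1

    # Pass 2: exclusive prefix sum gives the first output slot of each key.
    offsets = [0] * num_keys
    running = 0
    for k in range(num_keys):
        offsets[k] = running
        running += counts[k]

    # Pass 3: scatter each state index into its sorted position.
    out = [0] * num_states
    for i, s in enumerate(shader_keys):
        k = partitioned_sort_key(i, s, partition_size, num_shaders)
        out[offsets[k]] = i
        offsets[k] += 1
    return out
```

With `partition_size=4`, states 0-3 are sorted by shader among themselves and states 4-7 likewise, so each chunk of the output only references nearby states. That is the trade of some material coherence for locality that the commit message describes.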

123 Commits