blender-archive

Archived

Author	SHA1	Message	Date
Mai Lavelle	2cae58524c	Cycles: Improve memory usage of CPU split kernel by using smaller global size	2017-03-17 01:54:10 -04:00
Mai Lavelle	96868a3941	Fix T50888: Numeric overflow in split kernel state buffer size calculation Overflow led to the state buffer being too small and the split kernel to get stuck doing nothing forever.	2017-03-11 05:39:28 -05:00
Mai Lavelle	64751552f7	Cycles: Fix indentation	2017-03-08 01:31:32 -05:00
Mai Lavelle	306034790f	Cycles: Calculate size of split state buffer kernel side By calculating the size of the state buffer in the kernel rather than the host less code is needed and the size actually reflects the requested features. Will also be a little faster in some cases because of larger global work size.	2017-03-08 01:31:30 -05:00
Mai Lavelle	b78e543af9	Cycles: Add names to buffer allocations This is to help debug and track memory usage for generic buffers. We have similar for textures already since those require a name, but for buffers the name is only for debugging proposes.	2017-03-08 01:24:55 -05:00
Mai Lavelle	0892352bfe	Cycles: CPU implementation of split kernel	2017-03-08 00:52:41 -05:00
Mai Lavelle	0f56f7a811	Cycles: Allow device_memory to be used directly This is useful for when theres no host side memory attched to the buffer	2017-03-08 00:52:41 -05:00
Lukas Stockner	a2ebc5268f	Cycles: Refactor Progress system to provide better estimates The Progress system in Cycles had two limitations so far: - It just counted tiles, but ignored their size. For example, when rendering a 600x500 image with 512x512 tiles, the right 88x500 tile would count for 50% of the progress, although it only covers 15% of the image. - Scene update time was incorrectly counted as rendering time - therefore, the remaining time started very long and gradually decreased. This patch fixes both problems: First of all, the Progress now has a function to ignore time spans, and that is used to ignore scene update time. The larger change is the tile size: Instead of counting samples per tile, so that the final value is num_samplesnum_tiles, the code now counts every sample for every pixel, so that the final value is num_samplesnum_pixels. Along with that, some unused variables were removed from the Progress and Session classes. Reviewers: brecht, sergey, #cycles Subscribers: brecht, candreacchio, sergey Differential Revision: https://developer.blender.org/D2214	2016-12-03 05:02:21 +01:00
Mai Lavelle	4388b29e98	Cycles: Add human readable sizes to debug output Some of these values can get quite large and are hard to read, adding this makes it easy to read them at a glance. Reviewed By: sergey Differential Revision: https://developer.blender.org/D2039	2016-05-31 06:13:54 -04:00
Sergey Sharybin	7b356a8565	Cycles: Reduce amount of malloc() calls from the kernel This commit makes it so malloc() is only happening once per volume and once per transparent shadow query (per thread), improving scalability of the code to multiple CPU cores. Hard to measure this with a low-bottom i7 here currently, but from quick tests seems volume sampling gave about 3-5% speedup. The idea is to store allocated memory in kernel globals, which are per thread on CPU already. Reviewers: dingto, juicyfruit, lukasstockner97, maiself, brecht Reviewed By: brecht Subscribers: Blendify, nutel Differential Revision: https://developer.blender.org/D1996	2016-05-18 10:14:24 +02:00
Thomas Dinges	4a4f043bc4	Cycles: Add support for single channel float textures on CPU. Until now, single channel textures were packed into a float4, wasting 3 floats per pixel. Memory usage of such textures is now reduced by 3/4. Voxel Attributes such as density, flame and heat benefit from this, but also Bumpmaps with one channel. This commit also includes some cleanup and code deduplication for image loading. Example Smoke render from Cosmos Laundromat: http://www.pasteall.org/pic/show.php?id=102972 Memory here went down from ~600MB to ~300MB. Reviewers: #cycles, brecht Differential Revision: https://developer.blender.org/D1981	2016-05-11 21:58:34 +02:00
Thomas Dinges	d6555d936c	Cleanup: Avoid duplicative defines for CPU textures, use the ones from util_texture.h Also includes some further byte -> byte4 renaming, missed that in last commit.	2016-05-09 09:16:41 +02:00
Sergey Sharybin	f25f7c8030	Cycles: Re-implement some utilities to avoid use of boost The title says it all actually, the idea is to make Cycles only requiring Boost via 3rd party dependencies like OIIO and OSL. So now there are only few places which still uses Boost: - Foreach, function bindings and threading primitives. Those we can easily get rid with C++11 bump (which seems inevitable sooner or later if we'll want ot use newer LLVM for OSL), - Networking devices There's no quick solution for those currently, but there are some patches around which improves serialization. Reviewers: juicyfruit, mont29, campbellbarton, brecht, dingto Reviewed By: brecht, dingto Differential Revision: https://developer.blender.org/D1764	2016-02-06 19:19:20 +01:00
Dalai Felinto	9a76354585	Cycles-Bake: Custom Baking passes The combined pass is built with the contributions the user finds fit. It is useful for lightmap baking, as well as non-view dependent effects baking. The manual will be updated once we get closer to the 2.77 release. Meanwhile the new page can be found here: http://dalaifelinto.com/blender-manual/render/cycles/baking.html Reviewers: sergey, brecht Differential Revision: https://developer.blender.org/D1674	2016-01-15 13:00:56 -02:00
Sergey Sharybin	ac7aefd7c2	Cycles: Use special debug panel to fine-tune debug flags This panel is only visible when debug_value is set to 256 and has no affect in other cases. However, if debug value is not set to this value, environment variables will be used to control which features are enabled, so there's no visible changes to anyone in fact. There are some changes needed to prevent devices re-enumeration on every Cycles session create. Reviewers: juicyfruit, lukasstockner97, dingto, brecht Reviewed By: lukasstockner97, dingto Differential Revision: https://developer.blender.org/D1720	2016-01-12 16:21:30 +05:00
Sergey Sharybin	944b6322e6	Cycles: Log whch optimizations are used for CPU kernels Not fully thread-safe, but is rather harmless. Just some messages might be logged several times.	2016-01-06 20:25:19 +05:00
Sergey Sharybin	e2846c999a	Cycles: Fix stupid mistake which was assining kernel function in a loop	2016-01-06 20:05:33 +05:00
Sergey Sharybin	3918c8b9a5	Cycles: Optionally output luminance from the shader evaluation kernel This makes it possible to move some parts of evaluation from host to the device and hopefully reduce memory usage by avoid having full RGBA buffer on the host. Reviewers: juicyfruit, lukasstockner97, brecht Reviewed By: lukasstockner97, brecht Differential Revision: https://developer.blender.org/D1702	2015-12-30 19:04:04 +05:00
Sergey Sharybin	3fba620858	Cycles: Prepare for more image extension types support Basically just replace boolean periodic flag with extension type enum in the device API.	2015-07-28 14:14:24 +02:00
Sergey Sharybin	f2c54df625	Cycles: Expose image image extension mapping to the image manager Currently only two mappings are supported by API, which is Repeat (old behavior) and new Clip behavior. Internally this extension is being converted to periodic flag which was already supported but wasn't exposed. There's no support for OpenCL yet because of the way how we pack images into a single texture. Those settings are not exposed to UI or anywhere else and there should be no functional changes so far.	2015-07-21 21:58:19 +02:00
Sergey Sharybin	35812e65f4	Cycles: Fix compilation error on windows after recent logging changes	2015-04-10 22:35:10 +05:00
Sergey Sharybin	2f5dd83759	Cycles: Add some statistics logging Covers number of entities in the scene (objects, meshes etc), also reports sizes of textures being allocated.	2015-04-10 15:37:49 +05:00
Sergey Sharybin	5ff132182d	Cycles: Code cleanup, spaces around keywords This inconsistency drove me totally crazy, it's really confusing when it's inconsistent especially when you work on both Cycles and Blender sides. Shouldn;t cause merge PITA, it's whitespace changes only, Git should be able to merge it nicely.	2015-03-28 00:15:15 +05:00
Sergey Sharybin	585dd26120	Cycles: Code cleanup, prepare for strict C++ flags	2015-03-27 18:23:31 +05:00
Sergey Sharybin	7f406a53c7	Cycles: Cleanup for indentation in device_cpu.cpp Perhaps became broken after rather recent change about which entry point to kernel to use.	2015-02-19 19:05:04 +05:00
Sergey Sharybin	a922be9270	Cycles: Repot CPU and CUDA capabilities to system info operator For CPU it gives available instructions set (SSE, AVX and so). For GPU CUDA it reports most of the attribute values returned by cuDeviceGetAttribute(). Ideally we need to only use set of those which are driver-specific (so we don't clutter system info with values which we can get from GPU specifications and be sure they stay the same because driver can't affect on them).	2015-01-06 14:13:21 +05:00
Thomas Dinges	ee36e75b85	Cleanup: Fix Cycles Apache header. This was already mixed a bit, but the dot belongs there.	2014-12-25 02:50:24 +01:00
Bastien Montagne	c14d34322b	Fix typo breaking compilation with SSE2. Spotted by sybrenstuvel (Sybren Stüvel), thanks!	2014-11-02 23:01:09 +01:00
Martijn Berger	4b33667b93	Deduplicate some code by using a function pointer to the real kernel This has no performance impact what so ever and is already used in the adaptive sampling patch	2014-10-30 10:23:44 +01:00
Sergey Sharybin	cd6129d1ff	Cycles: Workaround dead-slow expf() on 64bit linux Single precision exponent on 64bit linux tends to be order of magnitude slower than double precision version even with single<->double precision conversion. Some feedback in the mailing lists also suggests that logf() is also slow, but this i didn't confirm here in the studio yet. Depending on the shader setup it gives ~3% with the secret agent shot and up to around 15% with the bmw scene here.	2014-10-06 12:36:46 +02:00
Sergey Sharybin	fbed2047c8	Fix wrong track of the memory when doing device vector resize before freeing it This is rather legit case which happens i.e. when having persistent images enabled and session is updating the lookup tables. Now device_memory keeps track of amount of memory being allocated on the device, which makes freeing using the proper allocated size, not the CPU side buffer size.	2014-09-04 17:25:12 +06:00
Dalai Felinto	8d3cc431d7	Fix T41471 Cycles Bake: Setting small tile size results in wrong bake with stripes rather than the expected noise pattern This problem was introduced in `983cbafd18` Basically the issue is that we were not getting a unique index in the baking routine for the RNG (random number generator). Reviewers: sergey Differential Revision: https://developer.blender.org/D749	2014-08-19 11:40:33 +02:00
Dalai Felinto	fc55c41bba	Cycles Bake: show progress bar during bake Baking progress preview is not possible, in parts due to the way the API was designed. But at least you get to see the progress bar while baking. Reviewers: sergey Differential Revision: https://developer.blender.org/D656	2014-07-25 11:42:53 -03:00
Thomas Dinges	866c7fb6e6	Cycles: Add an AVX2 CPU kernel. This kernel is compiled with AVX2, FMA3, and BMI compiler flags. At the moment only Intel Haswell benefits from this, but future AMD CPUs will have these instructions as well. Makes rendering on Haswell CPUs a few percent faster, only benchmarked with clang on OS X though. Part of my GSoC 2014.	2014-06-13 22:26:20 +02:00
Brecht Van Lommel	e4e58d4612	Fix T40370: cycles CUDA baking timeout with high number of AA samples. Now baking does one AA sample at a time, just like final render. There is also some code for shader antialiasing that solves T40369 but it is disabled for now because there may be unpredictable side effects.	2014-06-06 15:39:04 +02:00
Brecht Van Lommel	0075efc4d2	Fix T40306: cycles baking not distributing work among CPU cores well.	2014-05-26 13:51:11 +02:00
Dalai Felinto	eec3eaba08	Cycles Bake Expand Cycles to use the new baking API in Blender. It works on the selected object, and the panel can be accessed in the Render panel (similar to where it is for the Blender Internal). It bakes for the active texture of each material of the object. The active texture is currently defined as the active Image Texture node present in the material nodetree. If you don't want the baking to override an existent material, make sure the active Image Texture node is not connected to the nodetree. The active texture is also the texture shown in the viewport in the rendered mode. Remember to save your images after the baking is complete. Note: Bake currently only works in the CPU Note: This is not supported by Cycles standalone because a lot of the work is done in Blender as part of the operator only, not the engine (Cycles). Documentation: http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Bake Supported Passes: ----------------- Data Passes * Normal * UV * Diffuse/Glossy/Transmission/Subsurface/Emit Color Light Passes * AO * Combined * Shadow * Diffuse/Glossy/Transmission/Subsurface/Emit Direct/Indirect * Environment Review: D421 Reviewed by: Campbell Barton, Brecht van Lommel, Sergey Sharybin, Thomas Dinge Original design by Brecht van Lommel. The entire commit history can be found on the branch: bake-cycles	2014-05-02 21:19:09 -03:00
Brecht Van Lommel	a2e4ebd36a	Cycles code internals: add CPU kernel support for 3D image textures.	2014-03-29 13:03:48 +01:00
Martijn Berger	dd2dca2f7e	Add support for multiple interpolation modes on cycles image textures All textures are sampled bi-linear currently with the exception of OSL there texture sampling is fixed and set to smart bi-cubic. This patch adds user control to this setting. Added: - bits to DNA / RNA in the form of an enum for supporting multiple interpolations types - changes to the image texture node drawing code ( add enum) - to ImageManager (this needs to know to allocate second texture when interpolation type is different) - to node compiler (pass on interpolation type) - to device tex_alloc this also needs to get the concept of multiple interpolation types - implementation for doing non interpolated lookup for cuda and cpu - implementation where we pass this along to osl ( this makes OSL also do linear untill I add smartcubic to the interface / DNA/ RNA) Reviewers: brecht, dingto Reviewed By: brecht CC: dingto, venomgfx Differential Revision: https://developer.blender.org/D317	2014-03-07 23:16:33 +01:00
Thomas Dinges	de28a4d4b2	Cycles: Add an AVX kernel for CPU rendering. * AVX is available on Intel Sandy Bridge and newer and AMD Bulldozer and newer. * We don't use dedicated AVX intrinsics yet, but gcc auto vectorization gives a 3% performance improvement for Caminandes. Tested on an i5-3570, Linux x64. * No change for Windows yet, MSVC 2008 does not support AVX. Reviewed by: brecht Differential Revision: https://developer.blender.org/D216	2014-01-16 17:04:11 +01:00
Thomas Dinges	9351ac0d85	Cycles: Skip the compilation of the dedicated SSE2 kernel on x86-64, we can assume SSE2 here, so just re-use the regular one. Saves 500kb in the blender binary. Reviewed by: brecht Differential Revision: https://developer.blender.org/D199	2014-01-14 20:39:54 +01:00
Thomas Dinges	ce6dce3b13	Code cleanup / Cycles: else/if for SSE41 kernel functions.	2014-01-06 03:22:14 +01:00
Martijn Berger	85a0c5d4e1	Cycles: network render code updated for latest changes and improved This actually works somewhat now, although viewport rendering is broken and any kind of network error or connection failure will kill Blender. * Experimental WITH_CYCLES_NETWORK cmake option * Networked Device is shown as an option next to CPU and GPU Compute * Various updates to work with the latest Cycles code * Locks and thread safety for RPC calls and tiles * Refactored pointer mapping code * Fix error in CPU brand string retrieval code This includes work by Doug Gale, Martijn Berger and Brecht Van Lommel. Reviewers: brecht Differential Revision: http://developer.blender.org/D36	2013-12-07 12:26:58 +01:00
Martijn Berger	e3a79258d1	Cycles: test code for sse 4.1 kernel and alignment for some vector types. This is mostly work towards enabling the __KERNEL_SSE__ option to start using SIMD operations for vector math operations. This 4.1 kernel performes about 8% faster with that option but overall is still slower than without the option. WITH_CYCLES_OPTIMIZED_KERNEL_SSE41 is the cmake flag for testing this kernel. Alignment of int3, int4, float3, float4 to 16 bytes seems to give a slight 1-2% speedup on tested systems with the current kernel already, so is enabled now.	2013-11-22 14:42:41 +01:00
Campbell Barton	48c1e0c0fc	spelling: use American spelling for canceled	2013-10-26 01:06:19 +00:00
Brecht Van Lommel	29f6616d60	Cycles: viewport render now takes scene color management settings into account, except for curves, that's still missing from the OpenColorIO GLSL shader. The pixels are stored in a half float texture, converterd from full float with native GPU instructions and SIMD on the CPU, so it should be pretty quick. Using a GLSL shader is useful for GPU render because it avoids a copy through CPU memory.	2013-08-30 23:49:38 +00:00
Brecht Van Lommel	b9ce231060	Cycles: relicense GNU GPL source code to Apache version 2.0. More information in this post: http://code.blender.org/ Thanks to all contributes for giving their permission!	2013-08-18 14:16:15 +00:00
Thomas Dinges	9732c6283e	Cycles / CPU Rendering: * "Auto Detect" now again uses the umber of cores, instead number of cores + 1. This was added before we had Tile rendering and benchmarks on several systems showed that there is no gain with this now. There might be some slight difference (0.5% or so) slower/faster depending on the scene, but this is negligible.	2013-07-20 00:40:03 +00:00
Thomas Dinges	11707119de	Cycles: * Code cleanup, remove unused "resolution" variable from the DeviceTask class, was never used.	2013-05-14 21:18:20 +00:00
Thomas Dinges	a239700f43	Cycles: * Code cleanup, remove deprecated support_advanced_shading() functions. Left over from r43734.	2013-02-21 17:10:14 +00:00

1 2 3 4

Download

What's New

Blender Studio

Manual

Developers Blog

Documentation

Benchmark

Blender Conference

Development Fund

One-time Donations

177 Commits