blender-archive

Archived

Author	SHA1	Message	Date
Mai Lavelle	8c4d28cdb9	Fix T54400: Some GCN 1 cards available to select for use with Cycles Hainan was missing from the list of GCN 1 cards.	2018-04-03 23:15:07 -04:00
Brecht Van Lommel	ce3e0afe59	Fix T54001: AMD OpenCL fails with certain resolutions, after recent changes. We should actually be using CL_DEVICE_MEM_BASE_ADDR_ALIGN for sub buffers, previous change in this code was incorrect. Renamed the function now to make the specific purpose of this alignment clear, it's not required for data types in general.	2018-02-05 22:19:49 +01:00
Sergey Sharybin	8e1dd7ed81	Cycles: Remove unneeded include statements Also try to move them from headers to implementation files as much as possible.	2018-01-19 15:19:45 +01:00
Brecht Van Lommel	0fe41009f0	Fix T53830: Cycles OpenCL debug assert on macOS, This was probably harmless besides some unnecessary memory usage due to aligning allocations too much.	2018-01-19 11:35:07 +01:00
Lukas Stockner	fa3d50af95	Cycles: Improve denoising speed on GPUs with small tile sizes Previously, the NLM kernels would be launched once per offset with one thread per pixel. However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown. Therefore, the kernels are now launched in a single call that handles all offsets at once. This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory. On the other hand, of course, the smaller tiles significantly reduce the size of the memory. The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum. I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere. To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now. Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.	2017-11-30 07:37:08 +01:00
Lukas Stockner	40f528a7da	Cycles: Add per-tile render time debug pass Reviewers: sergey, brecht Differential Revision: https://developer.blender.org/D2920	2017-11-17 16:40:24 +01:00
Brecht Van Lommel	bd4bea3e98	Cycles: avoid reallocating tile denoising memory many times during render.	2017-11-09 20:28:00 +01:00
Brecht Van Lommel	5801ef71e4	Code refactor: device memory cleanups, preparing for mapped host memory.	2017-11-05 15:22:04 +01:00
Mai Lavelle	5cb8730689	Cycles: Add another limit to OpenCL memory usage Some drivers may report very large allocation sizes, which could cause unnecessary memory usage. This is now limited to 2gb which should still be enough to get the needed performance benefits without waste.	2017-11-02 08:14:21 -04:00
Brecht Van Lommel	34fe3f9c06	Code refactor: remove MEM_WRITE_ONLY, always use MEM_READ_WRITE. It's unlikely the driver can do useful optimizations with this, and if we sum multiple samples we are reading from the memory anyway.	2017-10-24 23:53:09 +02:00
Sergey Sharybin	eccd18a91f	Cycles: Fix compilation error without C++11	2017-10-24 11:14:01 +02:00
Brecht Van Lommel	070a668d04	Code refactor: move more memory allocation logic into device API. * Remove tex_* and pixels_* functions, replace by mem_. Add MEM_TEXTURE and MEM_PIXELS as memory types recognized by devices. * No longer create device_memory and call mem_* directly, always go through device_only_memory, device_vector and device_pixels.	2017-10-24 01:25:19 +02:00
Brecht Van Lommel	aa8b4c5d81	Code refactor: use device_only_memory and device_vector in more places.	2017-10-24 01:25:13 +02:00
Brecht Van Lommel	7ad9333fad	Code refactor: store device/interp/extension/type in each device_memory.	2017-10-24 01:03:59 +02:00
Brecht Van Lommel	57a0cb797d	Code refactor: avoid some unnecessary device memory copying.	2017-10-21 20:58:28 +02:00
Brecht Van Lommel	dc9eb8234f	Cycles: combined CPU + GPU rendering support. CPU rendering will be restricted to a BVH2, which is not ideal for raytracing performance but can be shared with the GPU. Decoupled volume shading will be disabled to match GPU volume sampling. The number of CPU rendering threads is reduced to leave one core dedicated to each GPU. Viewport rendering will also only use GPU rendering still. So along with the BVH2 usage, perfect scaling should not be expected. Go to User Preferences > System to enable the CPU to render alongside the GPU. Differential Revision: https://developer.blender.org/D2873	2017-10-21 20:13:44 +02:00
Brecht Van Lommel	92611dada6	Fix T53098, T53079: OpenCL world texture errors after recent changes.	2017-10-18 03:13:25 +02:00
Brecht Van Lommel	23098cda99	Code refactor: make texture code more consistent between devices. * Use common TextureInfo struct for all devices, except CUDA fermi. * Move image sampling code to kernels//kernel__image.h files. * Use arrays for data textures on Fermi too, so device_vector<Struct> works.	2017-10-07 14:53:14 +02:00
Brecht Van Lommel	fb99ea79f8	Code refactor: split displace/background into separate kernels, remove luma.	2017-10-05 17:57:58 +02:00
Brecht Van Lommel	12f4538205	Code refactor: use split variance calculation for mega kernels too. There is no significant difference in denoised benchmark scenes and denoising ctests, so might as well make it all consistent.	2017-10-04 21:11:14 +02:00
Brecht Van Lommel	e3e16cecc4	Code refactor: remove rng_state buffer and compute hash on the fly. A little faster on some benchmark scenes, a little slower on others, seems about performance neutral on average and saves a little memory.	2017-10-04 21:11:14 +02:00
Brecht Van Lommel	85ad248c36	Code cleanup: fix warning and improve terminology.	2017-08-12 13:18:05 +02:00
Sergey Sharybin	176ad9ecdd	Cycles: Remove ulong usage This is a bit confusing, especially when one mixes OpenCL code where ulong equals to uint64_t with CPU side code where ulong is expected to be something else from the naming. This commit makes it so we use explicit name, common on all platforms.	2017-08-09 14:08:58 +02:00
Mai Lavelle	55d28e604e	Cycles: Proper fix for recent OpenCL image crash Problem was that some code checks to see if device_pointer is null or not and the new allocator wasn't even setting the pointer to anything as it tracks memory location separately. Setting the pointer to non null keeps all users of device_pointer happy.	2017-08-09 04:27:39 -04:00
Sergey Sharybin	99c13519a1	Cycles: More fixes for Windows 32 bit - Apparently MSVC does not support compound literals in C++ (at least by the looks of it). - Not sure how opencl_device_assert was managing to set protected property of the Device class.	2017-08-08 22:32:51 +02:00
Sergey Sharybin	0e57282999	Cycles: Fix compilation error without C++11 Common folks, nobody considered master a C++11 only branch. Such decision is to be done officially and will involve changes in quite a few infrastructure related areas.	2017-08-08 17:02:26 +02:00
Mai Lavelle	ec8ae4d5e9	Cycles: Pack kernel textures into buffers for OpenCL Image textures were being packed into a single buffer for OpenCL, which limited the amount of memory available for images to the size of one buffer (usually 4gb on AMD hardware). By packing textures into multiple buffers that limit is removed, while simultaneously reducing the number of buffers that need to be passed to each kernel. Benchmarks were within 2%. Fixes T51554. Differential Revision: https://developer.blender.org/D2745	2017-08-08 07:12:04 -04:00
Sergey Sharybin	580741b317	Cycles: Cleanup, space after keyword	2017-08-07 14:47:51 +02:00
Sergey Sharybin	e26f61a2b5	Cycles: Disable OpenCL clFlush workarounds This is something which was reported to work fine by Mai, Benjamin and confirmed by myself. Disabling this workaround gains us some speedup: Before Now bmw27 04:28.42 04:07.79 classroom 09:26.48 08:54.53 fishy_cat 08:44.01 08:18.70 koro 09:17.98 08:57.18 pavillon_barcelone 12:26.64 11:52.81 Test environment is: - Ubuntu 16.04, with all updates installed - AMD RX 480 GPU - amdgpu pro driver version 17.10-450821	2017-07-11 12:16:58 +02:00
Sergey Sharybin	fee7f688c3	Cycles: Fix ambiguity in call of min() function	2017-07-07 10:40:19 +02:00
Mai Lavelle	9c3f1ad003	Cycles: Add artificial memory limit debug option for OpenCL	2017-07-06 05:25:46 -04:00
Mai Lavelle	f9963f29e8	Cycles: Dont allow global size to fall to zero	2017-07-05 20:19:15 -04:00
Mai Lavelle	222b96e5c7	Cycles: Detect out of memory before buffer allocation in OpenCL devices	2017-07-05 20:19:12 -04:00
Sergey Sharybin	d37dd97e45	Cycles: Pass string by const reference rather than by value Some of the functions might have been inlined, but others i don't see how that was possible (don't think virtual functions can be inlined here). In any case, better be explicitly optimal in the code.	2017-07-05 12:27:41 +02:00
Mai Lavelle	56dcfcce05	Cycles: Disable baking in mega kernel when not in use to improve build times	2017-06-29 23:07:18 -04:00
Hristo Gueorguiev	04530c9383	Cycles: adjust supported driver version for AMD GPUs On Windows 17.Q1 and 17.Q2 return driver version 2236.10.	2017-06-11 23:17:46 +02:00
Mai Lavelle	eb293f59f2	Cycles: Pass all buffers to each kernel call for OpenCL Technically not passing all buffers used by a kernel is undefined behavior. We haven't had any issues with this so far on AMD or Nvidia, but it's known to be a problem with Intel and we received a report from AMD that this is a problem on newer hardware, so we need to make this change at some point. Unfortunately there a cost to being correct, about 5% for the benchmark scenes. For low sample counts it's even worse, I've seen up to 50% slowdown. For the latter case I think adjusting tile updating logic can help, but not sure what that would look like yet (it would be just a few lines change however).	2017-06-10 04:08:49 -04:00
Hristo Gueorguiev	1f0998baa7	Cycles: Blacklist unsupported OpenCL devices Due to various driver issues with AMD GCN 1 cards we can no longer support these GPUs. This patch makes them unavailable to select for Cycles rendering. GCN cards 2 and higher are still supported. Please use the most recent drivers available to ensure proper functionality. See here for a list to check which GPUs are supported: https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units	2017-06-10 03:51:18 -04:00
Lukas Stockner	705c43be0b	Cycles Denoising: Merge outlier heuristic and confidence interval test The previous outlier heuristic only checked whether the pixel is more than twice as bright compared to the 75% quantile of the 5x5 neighborhood. While this detected fireflies robustly, it also incorrectly marked a lot of legitimate small highlights as outliers and filtered them away. This commit adds an additional condition for marking a pixel as a firefly: In addition to being above the reference brightness, the lower end of the 3-sigma confidence interval has to be below it. Since the lower end approximates how low the true value of the pixel might be, this test separates pixels that are supposed to be very bright from pixels that are very bright due to random fireflies. Also, since there is now a reliable outlier filter as a preprocessing step, the additional confidence interval test in the reconstruction kernel is no longer needed.	2017-06-09 03:46:11 +02:00
Sergey Sharybin	78c0f09d4f	Cycles: Cleanup, indentation	2017-06-08 12:03:08 +02:00
Sergey Sharybin	38a2bf665b	Cycles: Cleanup, style and unused arguments - Some arguments were inapproriatry tagged as unused using (void)foo semantic. Only use such semantic in tricky casses, when something needs to be ignored in release builds or something is dependent on tricky ifndef policy. For rest of the cases just use void foo(int /bar*/) semantic, which ensures variable is not used. Solves confusion and code running out of sync with later development. - Used proper unused semantic to some arguments. - Added braces to make code easier to follow, tricky indentation with ifdef, uh.	2017-05-20 05:21:27 -07:00
Lukas Stockner	740cd28748	Cycles Denoising: Add more robust outlier heuristic to avoid artifacts Extremely bright pixels in the rendered image cause the denoising algorithm to produce extremely noticable artifacts. Therefore, a heuristic is needed to exclude these pixels from the filtering process. The new approach calculates the 75% percentile of the 5x5 neighborhood of each pixel and flags the pixel if it is more than twice as bright. During the reconstruction process, flagged pixels are skipped. Therefore, they don't cause any problems for neighboring pixels, and the outlier pixels themselves are replaced by a prediction of their actual value based on their feature pass values and the neighboring pixels. Therefore, the denoiser now also works as a smarter despeckling filter that uses a more accurate prediction of the pixel instead of a simple average. This can be used even if denoising isn't wanted by setting the denoising radius to 1.	2017-05-18 21:55:56 +02:00
Lukas Stockner	43b374e8c5	Cycles: Implement denoising option for reducing noise in the rendered image This commit contains the first part of the new Cycles denoising option, which filters the resulting image using information gathered during rendering to get rid of noise while preserving visual features as well as possible. To use the option, enable it in the render layer options. The default settings fit a wide range of scenes, but the user can tweak individual settings to control the tradeoff between a noise-free image, image details, and calculation time. Note that the denoiser may still change in the future and that some features are not implemented yet. The most important missing feature is animation denoising, which uses information from multiple frames at once to produce a flicker-free and smoother result. These features will be added in the future. Finally, thanks to all the people who supported this project: - Google (through the GSoC) and Theory Studios for sponsoring the development - The authors of the papers I used for implementing the denoiser (more details on them will be included in the technical docs) - The other Cycles devs for feedback on the code, especially Sergey for mentoring the GSoC project and Brecht for the code review! - And of course the users who helped with testing, reported bugs and things that could and/or should work better!	2017-05-07 14:40:58 +02:00
Hristo Gueorguiev	b9fda4480f	Cycles: Show samples progress for OpenCL split kernel	2017-05-05 13:37:21 +02:00
Mai Lavelle	d187014675	Cycles: Remove extra clFinish from driver workaround These were causing problems with Nvidia OpenCL.	2017-05-02 14:26:46 -04:00
Hristo Gueorguiev	e91dc3a97c	Cycles: use safe compiler flags for OpenCL. Using -cl-fast-relaxed-math assumes no NaN/Inf values in any expression. This causes problems on overflow, division by zero, square root of negative number. Comparisons with NaN or infinite value are affected as well. This patch causes <2% slowdown on benchmark scenes. Fix T50985: Rendering volume scatter with GPU OpenCL comes to an halt after a few seconds	2017-04-25 20:10:51 +02:00
Sergey Sharybin	f970e859cf	Cycles: Cleanup, style	2017-04-18 11:39:21 +02:00
Sergey Sharybin	9539cfacca	Cycles: Apparently board name could be an empty string	2017-04-10 15:31:21 +02:00
Sergey Sharybin	867d311307	Cycles: Fix warning with MSVC	2017-04-07 18:28:38 +02:00
Mai Lavelle	5b45fff136	Cycles: Add missing flush	2017-04-07 06:06:08 -04:00

1 2

Download

What's New

Blender Studio

Manual

Developers Blog

Documentation

Benchmark

Blender Conference

Development Fund

One-time Donations

84 Commits