blender-archive

Archived

Author	SHA1	Message	Date
Brecht Van Lommel	ce3e0afe59	Fix T54001: AMD OpenCL fails with certain resolutions, after recent changes. We should actually be using CL_DEVICE_MEM_BASE_ADDR_ALIGN for sub buffers, previous change in this code was incorrect. Renamed the function now to make the specific purpose of this alignment clear, it's not required for data types in general.	2018-02-05 22:19:49 +01:00
Sergey Sharybin	8e1dd7ed81	Cycles: Remove unneeded include statements Also try to move them from headers to implementation files as much as possible.	2018-01-19 15:19:45 +01:00
Lukas Stockner	fa3d50af95	Cycles: Improve denoising speed on GPUs with small tile sizes Previously, the NLM kernels would be launched once per offset with one thread per pixel. However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown. Therefore, the kernels are now launched in a single call that handles all offsets at once. This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory. On the other hand, of course, the smaller tiles significantly reduce the size of the memory. The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum. I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere. To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now. Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.	2017-11-30 07:37:08 +01:00
Brecht Van Lommel	bd4bea3e98	Cycles: avoid reallocating tile denoising memory many times during render.	2017-11-09 20:28:00 +01:00
Brecht Van Lommel	5801ef71e4	Code refactor: device memory cleanups, preparing for mapped host memory.	2017-11-05 15:22:04 +01:00
Brecht Van Lommel	34fe3f9c06	Code refactor: remove MEM_WRITE_ONLY, always use MEM_READ_WRITE. It's unlikely the driver can do useful optimizations with this, and if we sum multiple samples we are reading from the memory anyway.	2017-10-24 23:53:09 +02:00
Brecht Van Lommel	070a668d04	Code refactor: move more memory allocation logic into device API. * Remove tex_* and pixels_* functions, replace by mem_. Add MEM_TEXTURE and MEM_PIXELS as memory types recognized by devices. * No longer create device_memory and call mem_* directly, always go through device_only_memory, device_vector and device_pixels.	2017-10-24 01:25:19 +02:00
Brecht Van Lommel	aa8b4c5d81	Code refactor: use device_only_memory and device_vector in more places.	2017-10-24 01:25:13 +02:00
Brecht Van Lommel	7ad9333fad	Code refactor: store device/interp/extension/type in each device_memory.	2017-10-24 01:03:59 +02:00
Brecht Van Lommel	57a0cb797d	Code refactor: avoid some unnecessary device memory copying.	2017-10-21 20:58:28 +02:00
Brecht Van Lommel	92611dada6	Fix T53098, T53079: OpenCL world texture errors after recent changes.	2017-10-18 03:13:25 +02:00
Brecht Van Lommel	23098cda99	Code refactor: make texture code more consistent between devices. * Use common TextureInfo struct for all devices, except CUDA fermi. * Move image sampling code to kernels//kernel__image.h files. * Use arrays for data textures on Fermi too, so device_vector<Struct> works.	2017-10-07 14:53:14 +02:00
Brecht Van Lommel	fb99ea79f8	Code refactor: split displace/background into separate kernels, remove luma.	2017-10-05 17:57:58 +02:00
Brecht Van Lommel	12f4538205	Code refactor: use split variance calculation for mega kernels too. There is no significant difference in denoised benchmark scenes and denoising ctests, so might as well make it all consistent.	2017-10-04 21:11:14 +02:00
Mai Lavelle	55d28e604e	Cycles: Proper fix for recent OpenCL image crash Problem was that some code checks to see if device_pointer is null or not and the new allocator wasn't even setting the pointer to anything as it tracks memory location separately. Setting the pointer to non null keeps all users of device_pointer happy.	2017-08-09 04:27:39 -04:00
Sergey Sharybin	99c13519a1	Cycles: More fixes for Windows 32 bit - Apparently MSVC does not support compound literals in C++ (at least by the looks of it). - Not sure how opencl_device_assert was managing to set protected property of the Device class.	2017-08-08 22:32:51 +02:00
Sergey Sharybin	0e57282999	Cycles: Fix compilation error without C++11 Common folks, nobody considered master a C++11 only branch. Such decision is to be done officially and will involve changes in quite a few infrastructure related areas.	2017-08-08 17:02:26 +02:00
Mai Lavelle	ec8ae4d5e9	Cycles: Pack kernel textures into buffers for OpenCL Image textures were being packed into a single buffer for OpenCL, which limited the amount of memory available for images to the size of one buffer (usually 4gb on AMD hardware). By packing textures into multiple buffers that limit is removed, while simultaneously reducing the number of buffers that need to be passed to each kernel. Benchmarks were within 2%. Fixes T51554. Differential Revision: https://developer.blender.org/D2745	2017-08-08 07:12:04 -04:00
Sergey Sharybin	fee7f688c3	Cycles: Fix ambiguity in call of min() function	2017-07-07 10:40:19 +02:00
Mai Lavelle	9c3f1ad003	Cycles: Add artificial memory limit debug option for OpenCL	2017-07-06 05:25:46 -04:00
Mai Lavelle	222b96e5c7	Cycles: Detect out of memory before buffer allocation in OpenCL devices	2017-07-05 20:19:12 -04:00
Mai Lavelle	56dcfcce05	Cycles: Disable baking in mega kernel when not in use to improve build times	2017-06-29 23:07:18 -04:00
Lukas Stockner	705c43be0b	Cycles Denoising: Merge outlier heuristic and confidence interval test The previous outlier heuristic only checked whether the pixel is more than twice as bright compared to the 75% quantile of the 5x5 neighborhood. While this detected fireflies robustly, it also incorrectly marked a lot of legitimate small highlights as outliers and filtered them away. This commit adds an additional condition for marking a pixel as a firefly: In addition to being above the reference brightness, the lower end of the 3-sigma confidence interval has to be below it. Since the lower end approximates how low the true value of the pixel might be, this test separates pixels that are supposed to be very bright from pixels that are very bright due to random fireflies. Also, since there is now a reliable outlier filter as a preprocessing step, the additional confidence interval test in the reconstruction kernel is no longer needed.	2017-06-09 03:46:11 +02:00
Sergey Sharybin	38a2bf665b	Cycles: Cleanup, style and unused arguments - Some arguments were inapproriatry tagged as unused using (void)foo semantic. Only use such semantic in tricky casses, when something needs to be ignored in release builds or something is dependent on tricky ifndef policy. For rest of the cases just use void foo(int /bar*/) semantic, which ensures variable is not used. Solves confusion and code running out of sync with later development. - Used proper unused semantic to some arguments. - Added braces to make code easier to follow, tricky indentation with ifdef, uh.	2017-05-20 05:21:27 -07:00
Lukas Stockner	740cd28748	Cycles Denoising: Add more robust outlier heuristic to avoid artifacts Extremely bright pixels in the rendered image cause the denoising algorithm to produce extremely noticable artifacts. Therefore, a heuristic is needed to exclude these pixels from the filtering process. The new approach calculates the 75% percentile of the 5x5 neighborhood of each pixel and flags the pixel if it is more than twice as bright. During the reconstruction process, flagged pixels are skipped. Therefore, they don't cause any problems for neighboring pixels, and the outlier pixels themselves are replaced by a prediction of their actual value based on their feature pass values and the neighboring pixels. Therefore, the denoiser now also works as a smarter despeckling filter that uses a more accurate prediction of the pixel instead of a simple average. This can be used even if denoising isn't wanted by setting the denoising radius to 1.	2017-05-18 21:55:56 +02:00
Lukas Stockner	43b374e8c5	Cycles: Implement denoising option for reducing noise in the rendered image This commit contains the first part of the new Cycles denoising option, which filters the resulting image using information gathered during rendering to get rid of noise while preserving visual features as well as possible. To use the option, enable it in the render layer options. The default settings fit a wide range of scenes, but the user can tweak individual settings to control the tradeoff between a noise-free image, image details, and calculation time. Note that the denoiser may still change in the future and that some features are not implemented yet. The most important missing feature is animation denoising, which uses information from multiple frames at once to produce a flicker-free and smoother result. These features will be added in the future. Finally, thanks to all the people who supported this project: - Google (through the GSoC) and Theory Studios for sponsoring the development - The authors of the papers I used for implementing the denoiser (more details on them will be included in the technical docs) - The other Cycles devs for feedback on the code, especially Sergey for mentoring the GSoC project and Brecht for the code review! - And of course the users who helped with testing, reported bugs and things that could and/or should work better!	2017-05-07 14:40:58 +02:00
Hristo Gueorguiev	e91dc3a97c	Cycles: use safe compiler flags for OpenCL. Using -cl-fast-relaxed-math assumes no NaN/Inf values in any expression. This causes problems on overflow, division by zero, square root of negative number. Comparisons with NaN or infinite value are affected as well. This patch causes <2% slowdown on benchmark scenes. Fix T50985: Rendering volume scatter with GPU OpenCL comes to an halt after a few seconds	2017-04-25 20:10:51 +02:00
Sergey Sharybin	0579eaae1f	Cycles: Make all #include statements relative to cycles source directory The idea is to make include statements more explicit and obvious where the file is coming from, additionally reducing chance of wrong header being picked up. For example, it was not obvious whether bvh.h was refferring to builder or traversal, whenter node.h is a generic graph node or a shader node and cases like that. Surely this might look obvious for the active developers, but after some time of not touching the code it becomes less obvious where file is coming from. This was briefly mentioned in T50824 and seems @brecht is fine with such explicitness, but need to agree with all active developers before committing this. Please note that this patch is lacking changes related on GPU/OpenCL support. This will be solved if/when we all agree this is a good idea to move forward. Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner Reviewed By: lukasstockner97, maiself, nirved, dingto Subscribers: brecht Differential Revision: https://developer.blender.org/D2586	2017-03-29 13:41:11 +02:00
Sergey Sharybin	7780a108b3	Cycles: Simplify some extra OpenCL query code	2017-03-21 12:01:03 +01:00
Mai Lavelle	96868a3941	Fix T50888: Numeric overflow in split kernel state buffer size calculation Overflow led to the state buffer being too small and the split kernel to get stuck doing nothing forever.	2017-03-11 05:39:28 -05:00
Sergey Sharybin	97c4c2689f	Cycles: Make it more obvious message which initialization failed	2017-03-08 13:57:21 +01:00
Sergey Sharybin	ecfbfe478b	Cycles: Log which device kernels are being loaded for	2017-03-08 12:33:51 +01:00
Mai Lavelle	64751552f7	Cycles: Fix indentation	2017-03-08 01:31:32 -05:00
Mai Lavelle	b78e543af9	Cycles: Add names to buffer allocations This is to help debug and track memory usage for generic buffers. We have similar for textures already since those require a name, but for buffers the name is only for debugging proposes.	2017-03-08 01:24:55 -05:00
Mai Lavelle	230c00d872	Cycles: OpenCL split kernel refactor This does a few things at once: - Refactors host side split kernel logic into a new device agnostic class `DeviceSplitKernel`. - Removes tile splitting, a new work pool implementation takes its place and allows as many threads as will fit in memory regardless of tile size, which can give performance gains. - Refactors split state buffers into one buffer, as well as reduces the number of arguments passed to kernels. Means there's less code to deal with overall. - Moves kernel logic out of OpenCL kernel files so they can later be used by other device types. - Replaced OpenCL specific APIs with new generic versions - Tiles can now be seen updating during rendering	2017-03-08 00:52:41 -05:00
Mai Lavelle	520b53364c	Cycles: Add OpenCL kernel for zeroing memory buffers Transferring memory to the device was very slow and there's really no need when only zeroing a buffer.	2017-03-08 00:52:41 -05:00
Mai Lavelle	0f56f7a811	Cycles: Allow device_memory to be used directly This is useful for when theres no host side memory attched to the buffer	2017-03-08 00:52:41 -05:00
Sergey Sharybin	48997d2e40	Cycles: Cleanup, style	2016-10-24 12:26:12 +02:00
Lukas Stockner	f7ce482385	Cycles: Fix another OpenCL logging issue Previously an error message would be printed whenever the OpenCL build produced output. However, some frameworks seem to print extra information even if the build succeeded, so now the actual returned error is checked as well. When --debug-cycles is activated, the build output will always be printed, otherwise it only gets printed if there was an error.	2016-10-21 02:49:00 +02:00
Lukas Stockner	cd843409d3	Fix T49630: Cycles: Swapped shader and bake kernels The problem here was, as the title says, that the two kernels were swapped. Since shader evaluation is only used for building the samling map when World MIS is enabled, rendering without it would still work fine, although baking also was broken.	2016-10-17 12:28:01 +02:00
Lukas Stockner	d5dd12e56c	Cycles: Improve OpenCL kernel compilation logging The previous refactor changed the code to use a separate logging mechanism to support multithreaded compilation. However, since that's not supported by any frameworks yes, it just resulted in bad logging behaviour. So, this commit changes the logging to go diectly to stdout/stderr once again by default.	2016-10-17 11:51:18 +02:00
Lukas Stockner	9ea71bc674	Cycles: Split device_opencl.cpp into multiple files for easier maintenance There are no user-visible changes, just some internal restructuring. Differential Revision: https://developer.blender.org/D2231	2016-10-09 15:49:50 +02:00

Download

What's New

Blender Studio

Manual

Developers Blog

Documentation

Benchmark

Blender Conference

Development Fund

One-time Donations

42 Commits