blender-archive

Archived

Author	SHA1	Message	Date
Brecht Van Lommel	85ad248c36	Code cleanup: fix warning and improve terminology.	2017-08-12 13:18:05 +02:00
Sergey Sharybin	2e25754ecd	Cycles: Clarify new argument in PathRadiance	2017-08-11 13:49:50 +02:00
Sergey Sharybin	bd069a89aa	Fix T52229: Shadow Catcher artifacts when under transparency Added some extra trickery to avoid background being tinted dark with transparent surface. Maybe a bit hacky, but seems to work fine.	2017-08-11 13:49:50 +02:00
Sergey Sharybin	176ad9ecdd	Cycles: Remove ulong usage This is a bit confusing, especially when one mixes OpenCL code where ulong equals to uint64_t with CPU side code where ulong is expected to be something else from the naming. This commit makes it so we use explicit name, common on all platforms.	2017-08-09 14:08:58 +02:00
Sergey Sharybin	c961737d0f	Cycles: Fix compilation error of filter kernels on 32 bit Windows We don't enable global SSE optimizations in regular kernel, and we keep those disabled on Linux 32bit. One possible workaround would be to pass arguments by ccl_ref, but that is quite a lot of code, which had better be done carefully.	2017-08-08 22:01:17 +02:00
Sergey Sharybin	19d19add1e	Cycles: Cleanup, de-duplicate function parameter list Was only needed to use const reference on CPU. Now it is done using ccl_ref.	2017-08-08 15:27:25 +02:00
Sergey Sharybin	fd397a7d28	Cycles: Add utility macro ccl_ref It is defined to & for CPU side compilation, and defined to empty for any GPU platform. The idea here is to use this macro instead of an #ifdef block with a bunch of duplicated lines just to make it so CPU code is efficient. Eventually we might switch to references on CUDA as well, but that would require some intensive testing.	2017-08-08 15:27:25 +02:00
Mai Lavelle	ec8ae4d5e9	Cycles: Pack kernel textures into buffers for OpenCL Image textures were being packed into a single buffer for OpenCL, which limited the amount of memory available for images to the size of one buffer (usually 4gb on AMD hardware). By packing textures into multiple buffers that limit is removed, while simultaneously reducing the number of buffers that need to be passed to each kernel. Benchmarks were within 2%. Fixes T51554. Differential Revision: https://developer.blender.org/D2745	2017-08-08 07:12:04 -04:00
Sergey Sharybin	451ccf7396	Cycles: Cleanup, move curve intersection functions to own file This way curve file becomes much shorter and it's also easier to write a benchmark application to check performance before/after future changes.	2017-08-07 20:53:30 +02:00
Sergey Sharybin	77a7a7f455	Cycles: Cleanup, trailing whitespace	2017-08-07 20:53:30 +02:00
Sergey Sharybin	95fe9b2617	Cycles: Cleanup, remove bvh prefix from curve functions Those are nothing to do with BVH, and can be used separately.	2017-08-07 20:53:30 +02:00
Sergey Sharybin	a4bbce8949	Cycles: Fix compilation error on NVidia OpenCL after recent refactor Still need to verify this is proper thing to do for AMD OpenCL. At least now I can compile OpenCL kernel on my laptop with sm21 card.	2017-08-07 20:52:24 +02:00
Brecht Van Lommel	fc38276d74	Fix Cycles shadow catcher objects influencing each other. Since all the shadow catchers are already assumed to be in the footage, the shadows they cast on each other are already in the footage too. So don't just let shadow catchers skip self, but all shadow catchers. Another justification is that it should not matter if the shadow catcher is modeled as one object or multiple separate objects, the resulting render should be the same. Differential Revision: https://developer.blender.org/D2763	2017-08-07 17:54:26 +02:00
Sergey Sharybin	580741b317	Cycles: Cleanup, space after keyword	2017-08-07 14:47:51 +02:00
Brecht Van Lommel	ee77c1e917	Code refactor: use float4 instead of intrinsics for CPU denoise filtering. Differential Revision: https://developer.blender.org/D2764	2017-08-07 14:01:24 +02:00
Brecht Van Lommel	a24fbf3323	Code refactor: add, remove, optimize various SSE functions. * Remove some unnecessary SSE emulation defines. * Use full precision float division so we can enable it. * Add sqrt(), sqr(), fabs(), shuffle variations, mask(). * Optimize reduce_add(), select(). Differential Revision: https://developer.blender.org/D2764	2017-08-07 14:01:24 +02:00
Brecht Van Lommel	a8cc0d707e	Code refactor: split defines into separate header, changes to SSE type headers. I need to use some macros defined in util_simd.h for float3/float4, to emulate SSE4 instructions on SSE2. But due to issues with order of header includes this was not possible, this does some refactoring to make it work. Differential Revision: https://developer.blender.org/D2764	2017-08-07 14:01:24 +02:00
Brecht Van Lommel	2a74f36dac	Fix Cycles CUDA adaptive megakernel build error.	2017-08-07 00:27:08 +02:00
Brecht Van Lommel	45dcd20ca9	Cycles: CUDA split performance tweaks, still far from megakernel. On Pabellon, 25.8s mega, 35.4s split before, 32.7s split after.	2017-08-05 14:32:59 +02:00
Brecht Van Lommel	cd023b6cec	Cycles: remove min bounces, modify RR to terminate less. Differential Revision: https://developer.blender.org/D2766	2017-08-04 23:11:03 +02:00
Sergey Sharybin	a280697e77	Cycles: Support "precompiled" headers in include expansion algorithm The idea here is that it is possible to mark certain include statements as "precompiled" which means all subsequent includes of that file will be replaced with an empty string. This is a way to deal with tricky include pattern happening in single program OpenCL split kernel which was including bunch of headers about 10 times. This brings preprocessing time from ~1sec to ~0.1sec on my laptop.	2017-08-02 20:59:19 +02:00
Brecht Van Lommel	9e929c911e	Fix Cycles multi scatter GGX different render results with Clang and GCC. The order of evaluation of function arguments is undefined, and the order was reversed between these compilers. This was causing regression tests to give different results between Linux and macOS.	2017-07-23 23:25:12 +02:00
Brecht Van Lommel	e982ebd6d4	Fix T52152: allow zero roughness for Cycles principled BSDF, don't clamp.	2017-07-22 23:58:51 +02:00
Brecht Van Lommel	ec831ee7d1	Fix Cycles denoising NaNs with 1 sample renders. This was causing different render results with different compilers. We can't do much useful with 1 sample, but better for debugging.	2017-07-22 23:58:51 +02:00
Brecht Van Lommel	3b12a71972	Fix T52125: principled BSDF missing with macOS OpenCL.	2017-07-20 15:15:43 +02:00
Stefan Werner	c1ca3c8038	Cycles: fixed the SM_2x CUDA kernel build that I broke in my previous commit	2017-07-20 13:28:34 +02:00
Stefan Werner	4bc6faf9c8	Fix T52107: Color management difference when using multiple and different GPUs together This commit unifies the flattened texture slot names for bindless and regular CUDA textures. Texture indices are now identical across all CUDA architectures, where before Fermi used different indices, which led to problems when rendering on multi-GPU setups mixing Fermi with newer hardware.	2017-07-20 10:03:27 +02:00
Sergey Sharybin	5f35682f3a	Fix T52021: Shadow catcher renders wrong when catcher object is behind transparent object Tweaked the path radiance summing and alpha to accommodate for possible contribution of light by transparent surface bounces happening prior to shadow catcher intersection. This commit will change how the shadow catcher result looks when the catcher is behind a semi-transparent object, but the old result seemed to be fully wrong: there were big artifacts when alpha-overing the result on some actual footage.	2017-07-18 09:46:21 +02:00
Sergey Sharybin	d8906f30d3	Cycles: Remove meaningless camera ray check In branched path tracing main loop is always a camera ray, with varying number of transparent bounces.	2017-07-18 09:27:36 +02:00
Mai Lavelle	1f933c94a7	Cycles: Fix comparison in principled BSDF Could have led to black pixels.	2017-07-11 23:41:22 -04:00
Brecht Van Lommel	29ec0b1162	Fix T52027: OSL getattribute() crash, when optimizer calls it before rendering.	2017-07-11 22:39:51 +02:00
Lukas Stockner	15fd758bd6	Fix T51950: Abnormally long Cycles OpenCL GPU render times with certain panoramic camera settings The problem here was that when an "invalid" path is generated by the panoramic camera, it was tagged as RAY_TO_REGENERATE with the intention of generating a new path in kernel_buffer_update. However, since that state was not handled in kernel_queue_enqueue, kernel_buffer_update did not process the path which resulted in an infinite loop.	2017-07-03 18:26:19 +02:00
Brecht Van Lommel	29c8c50442	Fix T51956: color noise with principled sss, radius 0 and branched path.	2017-07-02 19:21:08 +02:00
Brecht Van Lommel	52b9516e03	Fix principled BSDF incorrectly missing subsurface component with base color black.	2017-07-02 18:22:24 +02:00
Lukas Stockner	1f3fd8e60a	Fix T51909: Cycles: Uninitialized closure normals for the Hair BSDF As the title says, the normal wasn't set for the Hair BSDF because it wasn't needed before. However, the denoiser uses it to store the feature passes, so it needs to be set now.	2017-06-28 21:32:02 +02:00
Lukas Stockner	1979176088	Cycles: Fix excessive sampling weight of glossy Principled BSDF components If there was any specularity in the Principled BSDF, it would get a sampling weight of one regardless of its actual impact. This commit makes Cycles estimate the contribution of the component and adjust the weighting accordingly, which greatly improves the noise characteristics of the Principled BSDF in many cases. Note that this commit might slightly change the brightness of areas when using MultiGGX and high roughnesses, but the new brightness is more accurate and closer to the result of Branched Path Tracing. See T51836 for details. Differential Revision: https://developer.blender.org/D2677	2017-06-22 00:09:56 +02:00
Lukas Stockner	8cb741a598	Fix T51836: Cycles: Fix incorrect PDF approximations of the MultiGGX closures The PDF of the MultiGGX sampling is approximated by the singlescattering GGX term as well as a scaled diffuse term that makes up for the energy in the multiscattering component that's missed by GGX. However, there were two problems with the glossy terms: The diffuse term missed a normalization factor, and the singlescattering term was not properly scaled down based on the albedo estimate. The glass term was completely wrong and has been rewritten. It uses the fresnel factor to weight reflection vs. refraction and uses the glossy MultiGGX model for reflection. For refraction, the correct singlescattering term is now used, and a new albedo approximation is used that was derived by evaluating GGX albedo for roughnesses from 0 to 1 and IORs from 1 to 3 and fitting numerical approximations to it. The resulting model has a mean relative error of 9e-5, but could probably be simplified without losing noticeable accuracy in the final render. The improved PDFs help with glossy highlights (due to better light sampling vs. closure sampling MIS) and fix the situation described in T51836 where mixing MultiGGX with other closures (as it happens in e.g. the Principled BSDF) causes incorrect darkening.	2017-06-22 00:09:56 +02:00
Brecht Van Lommel	14ea0c5fcc	Fix T51849: change Cycles clearcoat gloss to roughness. This is compatible with UE4 and more consistent with specular and transmission roughness, even if it deviates from the original Disney BRDF.	2017-06-21 19:55:20 +02:00
Sergey Sharybin	64aa0cff89	Cycles: Fix typo in comment	2017-06-14 09:54:07 +02:00
Sergey Sharybin	40c04dd649	Cycles: Cleanup, indentation	2017-06-13 10:28:38 +02:00
Sergey Sharybin	0aa5431998	Cycles: Fix compilation error of OpenCL mega kernel Was some mismatch in address space. Seems to be caused by recent additions. Additionally, moved decoupled ray marching functions under ifdef, so they don't try to use malloc() functions. Thanks Mai for testing the patch!	2017-06-13 10:26:45 +02:00
Lukas Stockner	558bea2252	Cycles Denoising: Add more failsafes for invalid pixels Now, when there is no usable neighboring pixel for denoising, the noisy value is preserved instead of producing a NaN. Also, negative results are clamped to zero. Note that these are just workarounds that don't fix the underlying problems, but these issues are very rare and I'm not sure if it's even possible to fix the underlying problems without introducing a significant slowdown or quality decrease in other situations. Because of that and since 2.79 is happening very soon, I just went for these workarounds for now.	2017-06-11 01:51:39 +02:00
Sergey Sharybin	e097fc4aa6	Cycles: Selectively include denoising in kernel	2017-06-10 04:45:13 -04:00
Mai Lavelle	eb293f59f2	Cycles: Pass all buffers to each kernel call for OpenCL Technically not passing all buffers used by a kernel is undefined behavior. We haven't had any issues with this so far on AMD or Nvidia, but it's known to be a problem with Intel and we received a report from AMD that this is a problem on newer hardware, so we need to make this change at some point. Unfortunately there is a cost to being correct, about 5% for the benchmark scenes. For low sample counts it's even worse, I've seen up to 50% slowdown. For the latter case I think adjusting tile updating logic can help, but not sure what that would look like yet (it would be just a few lines change however).	2017-06-10 04:08:49 -04:00
Mai Lavelle	6238214159	Cycles: Faster split branched path tracing by sharing samples with inactive threads Unlike regular path tracing, branched path tracing is usually used with lower sample counts, at least for primary rays. This means that there are fewer samples for the GPU to work on in parallel and rendering is slower. As there is less work overall there are also more inactive threads during rendering with BPT. This patch makes use of those inactive rays to render branched samples in parallel with other samples. Each thread that is preparing for a branched sample will attempt to find an inactive thread and if one is found the state for the sample is copied to that thread. Potentially, if there are enough inactive threads, 100s of branched samples could be generated from the same originating thread and run in parallel giving large speed ups. Gives 70% faster render for pavillion midday scene. 20-60% faster on BMW with car paint replaced with SSS/volumes.	2017-06-10 04:08:49 -04:00
Mai Lavelle	32299d32e7	Cycles: Modify path_radiance_accum_sample to use atomics for split kernel Samples run in parallel need a safe way to accumulate their results with the results of other threads.	2017-06-10 04:08:02 -04:00
Mai Lavelle	6995b50e41	Cycles: Add function to dequeue a ray	2017-06-10 03:51:18 -04:00
Mai Lavelle	ea846a4dfc	Cycles: Add kernel to enqueue inactive rays The queue will be used to make reuse of inactive threads to keep the GPU more busy.	2017-06-10 03:51:18 -04:00
Lukas Stockner	0a898e2405	Cleanup Cycles Denoising platform-specific defines	2017-06-09 22:38:16 +02:00
Lukas Stockner	7dc51f87ed	Cycles Denoising: Speedup reconstruction by skipping near-zero weights	2017-06-09 22:38:16 +02:00
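
The ccl_ref entry in the log (fd397a7d28) describes a macro that expands to `&` for CPU compilation and to nothing on GPU platforms, so a single parameter list can serve both. A minimal sketch of that idea follows. The guard name `SKETCH_GPU_BUILD`, the `Matrix4` type and both functions are hypothetical names chosen for illustration; they are not taken from the Cycles source.

```cpp
#include <cassert>

// Hypothetical build switch standing in for the real per-platform check.
#ifndef SKETCH_GPU_BUILD
#  define ccl_ref &   // CPU: parameters are passed by reference
#else
#  define ccl_ref     // GPU: parameters are passed by value
#endif

// Illustrative type that is large enough for pass-by-reference to matter.
struct Matrix4 {
    float m[16];
};

// One declaration for all platforms, with no #ifdef around the parameter list.
inline float matrix_trace(const Matrix4 ccl_ref mat)
{
    return mat.m[0] + mat.m[5] + mat.m[10] + mat.m[15];
}

// Helper that builds an identity matrix for the usage example.
inline Matrix4 matrix_identity()
{
    Matrix4 result = {};
    for (int i = 0; i < 4; ++i) {
        result.m[i * 5] = 1.0f;  // diagonal entries sit at indices 0, 5, 10, 15
    }
    return result;
}
```

Calling `matrix_trace(matrix_identity())` compiles unchanged with or without `SKETCH_GPU_BUILD` defined; only the argument-passing convention differs.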

1960 Commits