blender-archive

Archived

Author	SHA1	Message	Date
Sergey Sharybin	9f63cbf4a7	Fix T45333: Volume Scatter crash blender	2015-07-13 18:54:26 +05:00
Sergey Sharybin	a95b0e0e9d	Cycles: Another fix for OSX, sm_50 experimental actually also fails to compile Didn't notice it originally because compilation was threaded.	2015-06-20 19:40:23 +02:00
Sergey Sharybin	5e2835037a	Cycles: Tweak to previous commit, experimental sm_52 works on Linux but not OSX	2015-06-20 19:01:24 +02:00
Sergey Sharybin	34d665a4a4	Cycles: Un-inline triangle_intersect_precalc() on Apple OpenCL This gives much the same problems as experimental CUDA kernels, and until a root cause of the problem is found we just explicitly uninline the function.	2015-06-20 18:00:30 +02:00
Sergey Sharybin	845854959f	Cycles: Cleanup, make it more obvious which platform requires workaround for triangle intersection Should be no functional changes.	2015-06-20 17:01:21 +02:00
Sergey Sharybin	34c3beb339	Cycles: Fix missing node distance update when only two child intersected in QBVH	2015-06-12 10:06:46 +02:00
Sergey Sharybin	596eadf0e1	Cycles: Add debug pass which shows number of instance pushes during camera ray intersection TODO: We might want to refactor debug passes into PASS_DEBUG and some debug_type (similar to Blender's side passes) to avoid issue of running out of bits.	2015-06-12 00:12:03 +02:00
Sergey Sharybin	b3cc602adc	Cycles: Remove meaningless debug traversal steps increment from QBVH volume code	2015-06-11 23:54:57 +02:00
Sergey Sharybin	2ab909a88c	Cycles: Make experimental kernel build option more generic Previously it was explicitly mentioning it's NVidia kernel related option, but in fact it's also handy for the OpenCL kernel.	2015-05-15 13:22:47 +05:00
Sergey Sharybin	3d3d805b64	Cycles: Prepare code for OpenCL camera/motion blur The kernels are now compiling just fine, but there're some issues during rendering. This is still to be investigated.	2015-05-14 18:48:56 +05:00
Sergey Sharybin	5a63edb929	Cycles: Use special _auto versions of transform function in motion blur code Doing this as a separate commit so it's easier to revert in the future, once OpenCL 2.0 is becoming our requirement.	2015-05-14 18:48:56 +05:00
Sergey Sharybin	79aa50dc53	Cycles: Enable hair for split kernels when using Intel or NVidia drivers Apart from simply enabling this feature, the needed changes to the code were made. Technical change, replacing SD access from "simple" structure to SOA.	2015-05-14 18:48:56 +05:00
Sergey Sharybin	583fd3af65	Cycles: Fix typo in global space version of normal transform It was using direction transform, which is obviously wrong.	2015-05-10 00:53:32 +05:00
George Kyriazis	7f4479da42	Cycles: OpenCL kernel split This commit contains all the work related to the AMD megakernel split work which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely someone else which we're forgetting to mention. Currently only AMD cards are enabled for the new split kernel, but it is possible to force split opencl kernel to be used by setting the following environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1. Not all the features are supported yet, and that being said no motion blur, camera blur, SSS and volumetrics for now. Also transparent shadows are disabled on AMD device because of some compiler bug. This kernel also only implements regular path tracing, and supporting the branched one will take a bit. Branched path tracing is exposed to the interface still, which is a bit misleading and will be hidden there soon. More features will be enabled once they're ported to the split kernel and tested. Neither regular CPU nor CUDA has any difference, they're generating the same exact code, which means no regressions/improvements there. Based on the research paper: https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf Here's the documentation: https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit Design discussion of the patch: https://developer.blender.org/T44197 Differential Revision: https://developer.blender.org/D1200	2015-05-09 19:52:40 +05:00
Thomas Dinges	4eab0e72b3	Cleanup: Update some comments and add ToDo.	2015-04-29 23:56:46 +02:00
Thomas Dinges	b3def11f5b	Cycles: Record all possible volume intersections for SSS and camera checks This replaces sequential ray moving followed with scene intersection with single BVH traversal, which gives us all possible intersections. Only implemented for CPU, due to qsort and a bigger memory usage on GPU which we rather avoid. GPU still uses the regular bvh volume intersection code, while CPU now uses the new code. This improves render performance for scenes with: a) Camera inside volume mesh b) SSS mesh intersecting a volume mesh/domain In simple volume files (not much geometry) performance is roughly the same (slightly faster). In files with a lot of geometry, the performance increase is larger. bmps.blend with a volume shader and camera inside the mesh, it renders ~10% faster here. Patch by Sergey and myself. Differential Revision: https://developer.blender.org/D1264	2015-04-29 23:31:06 +02:00
Sergey Sharybin	ae7d84dbc1	Cycles: Use native saturate function for CUDA This is more of a workaround for the CUDA optimizer, which can't optimize clamp(x, 0, 1) into a single instruction and uses 4 instructions instead. Original patch by @lockal with own modification: Don't make changes outside of the kernel. They don't make any difference anyway and the term saturate() has a slightly different meaning outside of the kernel. This gives around 2% of speedup in Barcelona file, but in more complex shader setups with lots of math nodes with clamping speedup could be much nicer. Subscribers: dingto Projects: #cycles Differential Revision: https://developer.blender.org/D1224	2015-04-28 00:38:32 +05:00
Sergey Sharybin	828abaf11c	Cycles: Split BVH nodes storage into inner and leaf nodes This way we can get rid of inefficient memory usage caused by BVH boundbox part being unused by leaf nodes but still being allocated for them. Doing such split allows to save 6 of float4 values for QBVH per leaf node and 3 of float4 values for regular BVH per leaf node. This translates into following memory save using 01.01.01.G rendered without hair (device memory size / device memory peak / global memory peak): before the patch: 4957 / 5051 / 7668; with the patch: 4467 / 4562 / 7332. The measurements are done against current master. Still need to run speed tests and it's hard to predict if it's faster or not: on the one hand leaf nodes are now much more coherent in cache, on the other hand they're not so much coherent with regular nodes anymore. Reviewers: brecht, juicyfruit Subscribers: venomgfx, eyecandy Differential Revision: https://developer.blender.org/D1236	2015-04-20 17:29:51 +05:00
Sergey Sharybin	bf11e362c5	Fix T44046: Cycles speed regression in 2.74 (CPU only) Issue was caused by MSVC not being able to optimize some code out in the same way as GCC/Clang does, so now those parts of code are explicitly unfolded in order to help compilers out. This makes speed loss much less drastic on my laptop. That's probably as good as we can do with MSVC without investing an infinite amount of time trying to work around the optimizer.	2015-04-08 18:47:25 +05:00
Sergey Sharybin	09a746b857	Cycles: Cleanup, typos	2015-04-08 01:15:38 +05:00
Sergey Sharybin	858f54f16e	Cycles: Cleanup, indentation	2015-04-07 22:41:08 +05:00
Sergey Sharybin	e2354e64d2	Cycles: Cleanup, spaces around assignment operator Did some bad spacing in recent commits, better to get rid of those so they do not confuse those who're working on sources.	2015-04-07 00:25:54 +05:00
Sergey Sharybin	ab2d05d958	Fix T44269: Typo in volume_attribute_float:geom_volume.h Was rather harmless typo since we either pass both dx,dy or pass both NULL.	2015-04-05 19:07:45 +05:00
Sergey Sharybin	f1494edf78	Cycles: Make SSS intersection closer to regular triangle intersection	2015-04-01 21:20:04 +05:00
Sergey Sharybin	394b947a50	Cycles: Remove unused direction from triangle intersection functions This argument was unused and got nicely optimized out. But once it starts being used, registers get heavily stressed, causing a slowdown of the render.	2015-04-01 21:08:12 +05:00
Sergey Sharybin	79918e0577	Cycles: Avoid float/int conversion in few places	2015-03-31 19:52:14 +05:00
Sergey Sharybin	7da4c2637d	Cycles: Fix typo in distance heuristic for shadow rays It's not that bad because this typo could only cause inefficient BVH traversal, causing higher render times. Not as if it was causing render artifacts.	2015-03-31 19:52:14 +05:00
Sergey Sharybin	5ff132182d	Cycles: Code cleanup, spaces around keywords This inconsistency drove me totally crazy, it's really confusing when it's inconsistent especially when you work on both Cycles and Blender sides. Shouldn't cause merge PITA, it's whitespace changes only, Git should be able to merge it nicely.	2015-03-28 00:15:15 +05:00
Sergey Sharybin	dce16d57dc	Revert "Fix T43865: Cycles: Watertight rendering produces artifacts on a huge plane" The fix was really flaky: during speed benchmarks I had abort() in the fallback block to be sure it never runs in production scenes, but that affected the optimization as well. Without this abort there's quite a bad slowdown of 5-7% on the renders even though the Plücker fallback was never run. This is all weird and for now reverting the change which affects all the production scenes and will look into alternative fixes for the original issue with precision loss on huge planes. This reverts commit `9489205c5c`.	2015-03-12 18:24:53 +05:00
Sv. Lockal	c8fb488b08	Fix T41066: An actual fix for curve intersection on FMA-enabled CPUs	2015-03-07 16:20:34 +00:00
Sergey Sharybin	9489205c5c	Fix T43865: Cycles: Watertight rendering produces artifacts on a huge plane The issue was caused by numerical instability when having ray origin close to a huge triangle, which could have caused bad ray distance check. Watertight Woop intersection isn't really addressing such cases, it's dealing with small triangles far away from the ray origin instead, so it's a bit tricky to make it work reliably. While we're quite close to the release it's safer to do check in Plücker coordinates if ray close to a huge triangle. Likely this additional check combined with some other tweaks to the code doesn't cause measurable slowdown in the scenes tested here. After the release we can play a bit more with this code in order to make it more stable without Plücker fallback.	2015-03-05 18:55:30 +05:00
Sergey Sharybin	d544bc5cd5	Cycles: Fix embarrassing type remained after getting rid of utility SWAP()	2015-03-04 00:16:21 +05:00
Sergey Sharybin	edb7195f27	Cycles: Bring back distance check in re-intersection From more investigation of the numeric failures in the kernel it appears the check was rather correct. But in theory it's also needed for the motion triangles.	2015-02-10 19:07:55 +05:00
Sergey Sharybin	298d8681a0	Fix T43596: Refraction BSDF crashes blender on pre-sse4 CPU This is the same issue T43475: SSE4 code is more robust to non-finite values in the ray origin/direction. So for now added a check before doing BVH traversal for pre-SSE4 CPUs. For sure actual root of the issue is a bit different and much more tricky to solve, especially without disturbing render results too much. Still looking into this. In any case, it's kinda fine to have such a check, we might later make it to be a kernel_assert() instead of just a return.	2015-02-10 17:36:05 +05:00
Sergey Sharybin	b83d851901	Cycles: Another attempt to solve 32bit CUDA kernel Previous fix didn't quite work well. For some reason everything worked fine when using native nvcc in 32bit environment, but cross-compiling from 64bit platform it was still running out of memory. For now just made it so all the kernels are slower on 32bit CUDA as a temporary solution. Either it'll be solved in next CUDA releases (by dropping 32bit? =\) or we'll find better workaround.	2015-02-09 16:14:44 +05:00
Sergey Sharybin	da06dab4e5	Cycles: Use pre-aligned triangle vertex coordinates for subsurface intersection This gives small speedup (around 2% in quick tests) for ray scattering.	2015-02-04 14:49:19 +05:00
Sergey Sharybin	432e478f43	Cycles: Further tweaks to T43511 to solve compilation error on 32bit platforms	2015-02-02 22:09:02 +05:00
Sergey Sharybin	31263192bb	Fix T43511: Major slow down with many instanced objects in cycles GPU Slowdown was caused by watertight intersection commit and follow-up workaround for compiler crash which uninlined utility function which rotates the ray. Now it's only uninlined for sm_50 and sm_52 experimental kernels which are the only ones which failed to compile. Rendering still might be a bit slower but at least shouldn't be that dramatic.	2015-02-02 17:35:57 +05:00
Sergey Sharybin	03cb146afa	Fix T43496: Infinite loop in kernel when using surface attribute for volume The issue was caused by the optimization in surface attributes for cases when there's only a volume shader used. Some attributes don't make sense in that case and were skipped from calculation. However, it is possible that kernel would still try to access them (because of the shader setup etc). Prevented an infinite loop in the kernel now, which should not have much effect on regular renders.	2015-01-31 14:39:19 +05:00
Sergey Sharybin	3f5771475d	Cycles: Don't perform re-intersection if ray distance is zero It is possible that ray distance will be zero which would make intersection refinement return NaN as the refined position which would later lead to all sort of mathematical issues. Don't think there are ways to improve intersection accuracy for such rays so just return original intersection coordinate. This should fix T43475. TODO: Need to look into possible issues in Ashikhmin BSDF which might return zero-length reflected/transmitted ray?	2015-01-31 01:49:48 +05:00
Sergey Sharybin	09ac6cae09	Cycles: Cleanup and optimization comment update	2015-01-17 00:15:47 +05:00
Sergey Sharybin	5719ed1225	Cycles: Add leaf primitives sanity check asserts to the kernel This way we'll notice that leaf splitting didn't happen correctly pretty easily in debug builds. There'll be absolutely no impact on release builds.	2015-01-12 15:05:14 +05:00
Sergey Sharybin	bc7ff3c2b4	Cycles: Enable leaf split by primitive type and adopt BVH traversal for this This commit enables BVH leaf nodes split by the primitive type and makes it so BVH traversal code is now aware and benefits from this. As was mentioned in original commit, this change is crucial to be able to do single ray to multiple triangle intersection. But it also appears to give barely visible speedup in some scenes. In any case there should be no noticeable slowdown, and this change is what we need to have anyway.	2015-01-12 15:04:52 +05:00
Sergey Sharybin	2a8a56929b	Cycles: Fix unneeded int/float conversion happened in previous commit	2015-01-02 17:21:24 +05:00
Sergey Sharybin	4f2583ee13	Fix T43027: OpenCL kernel compilation broken after QBVH OpenCL apparently does not support templates, so the idea of generic function for swapping is a bit of a failure. Now it is either inlined into the code (in triangle intersection) or has specific implementation for QBVH. This is probably even better, because we can't create QBVH-specific function in util_math anyway.	2015-01-02 14:58:01 +05:00
Sergey Sharybin	7778f0ff20	Cycles: Fix MSVC which doesn't like condition to be split by preprocessor	2014-12-29 21:10:37 +05:00
Sergey Sharybin	4088fad6dd	Cycles: Add asserts around BVH stack pushes This way we're kind of safer to troubleshoot possible stack overflow issues.	2014-12-29 14:02:15 +05:00
Sergey Sharybin	40517283ca	Cycles: Bump stack size for QBVH traversal code Traversal now can push up to 2x of nodes to the stack, so need some tweaks to the stack size.	2014-12-29 13:37:18 +05:00
Sergey Sharybin	9c4aba11c9	Cycles: Add some sanity check asserts in the traversal code This way we'll be sure (in debug builds) that regular BVH traversal is not used for QBVH tree (could happen because of mismatch of logic in kernel and render).	2014-12-29 13:35:31 +05:00
Sergey Sharybin	91bbaaa271	Cycles: Fix visibility check for instanced nodes The issue is that only instance node contains proper visibility flags, nodes from instanced BVH are not correct.	2014-12-27 23:33:50 +05:00

113 Commits