Commit Graph

93 Commits

Author SHA1 Message Date
12b7850d4f Cycles: Fix for transmissive microfacet sampling
This is an alternate fix for T40964 which resolves bad handling of
caustics reported in T45609.

There were too much transmission rays being discarded by the original
fix, which caused by caustic light being totally disabled. There is
still some room for investigation why exactly original paper didn't
work that well, could be caused by the way how the pdf is calculated.

In any case current results seems rather correct now.
2015-07-31 13:46:58 +02:00
6a0a205cb4 Cycles: Simplify volume_phase_eval().
This simplification is safe, as the call to volume_phase_eval() is guarded behind a CLOSURE_IS_PHASE check, which is equal to
CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID. I don't think we will add more phase functions anytime soon, if at all.
2015-06-11 15:18:33 +02:00
7f4479da42 Cycles: OpenCL kernel split
This commit contains all the work related on the AMD megakernel split work
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else which we're forgetting to mention.

Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.

Not all the features are supported yet, and that being said no motion blur,
camera blur, SSS and volumetrics for now. Also transparent shadows are
disabled on AMD device because of some compiler bug.

This kernel is also only implements regular path tracing and supporting
branched one will take a bit. Branched path tracing is exposed to the
interface still, which is a bit misleading and will be hidden there soon.

More feature will be enabled once they're ported to the split kernel and
tested.

Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.

Based on the research paper:

  https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf

Here's the documentation:

  https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit

Design discussion of the patch:

  https://developer.blender.org/T44197

Differential Revision: https://developer.blender.org/D1200
2015-05-09 19:52:40 +05:00
ae7d84dbc1 Cycles: Use native saturate function for CUDA
This more a workaround for CUDA optimizer which can't optimize clamp(x, 0, 1)
into a single instruction and uses 4 instructions instead.

Original patch by @lockal with own modification:

  Don't make changes outside of the kernel. They don't make any difference
  anyway and term saturate() has a bit different meaning outside of kernel.

This gives around 2% of speedup in Barcelona file, but in more complex shader
setups with lots of math nodes with clamping speedup could be much nicer.

Subscribers: dingto

Projects: #cycles

Differential Revision: https://developer.blender.org/D1224
2015-04-28 00:38:32 +05:00
bc160d8a85 Cleanup: Code style. 2015-04-26 00:42:26 +02:00
e2354e64d2 Cycles: Cleanup, spaces around assignment operator
Did some bad spacing in recent commits, better to get rid of those so
they does not confuse those who're working on sources.
2015-04-07 00:25:54 +05:00
a9bb8d8a73 Cycles: de-duplicate fast/approximate erf function calculation
Our own implementation is in fact the same performance as in fast_math from
OpenShadingLanguage, but implementation from fast_math is using explicit madd
function, which increases chance of compiler deciding to use intrinsics.
2015-04-06 12:49:44 +05:00
b06962fcfe Cycles: Avoid using lookup table for Beckmann slopes on GPU
This patch is based on some work done in D788 and re-formulation from Beckmann
implementation in OpenShadingLanguage.

Skipping texture lookup helps a lot on GPUs where it's more expensive to access
texture memory than to do some extra calculation in threads.

CPU code still uses lookup-table based approach since this seems to be still
faster (at least on computers i've got access to).

This change gives about 2% speedup on BMW scene with GTX560TI.
2015-04-05 19:07:45 +05:00
252b36ce77 Cycles: Remove unused Beckmann slope sampling code
It did not preserve stratification too well and lookup-table approach was
working much better. There are now also some more interesting forumlation
from Wenzel and OpenShadingLanguage which should work better than old code.
2015-04-05 19:07:44 +05:00
af399884e1 Fix T44113: Ashikhmin-Shirley distribution of glossy shader at 0 roughness causes artifacts when background uses MIS
Was a division by zero error, solved in the same way as beckmann/ggx
deals with small roughness values.
2015-04-01 14:21:21 +05:00
5ff132182d Cycles: Code cleanup, spaces around keywords
This inconsistency drove me totally crazy, it's really confusing
when it's inconsistent especially when you work on both Cycles and
Blender sides.

Shouldn;t cause merge PITA, it's whitespace changes only, Git should
be able to merge it nicely.
2015-03-28 00:15:15 +05:00
87cff57207 Fix T44123: Cycles SSS renders black in recent builds
Issue was introduced in 01ee21f where i didn't notice *_setup()
function only doing partial initialization, and some of parameters
are expected to be initialized by callee function.

This was hitting only some setups, so tests with benchmark scenes
didn't unleash issues. Now it should all be fine.

This is to go to the 2.74 branch and we actually might re-AHOY.
2015-03-25 02:33:49 +05:00
ed7e593a4b Fix T43926: Volume scatter: intersecting objects GPU rendering artifacts
Fix T44007: Cycles Volumetrics: block artifacts with overlapping volumes

The issue was caused by uninitialized parameters of some closures, which
lead to unpredictable behavior of shader_merge_closures().
2015-03-23 12:48:33 +05:00
6f3500db05 Cleanup: Remove unused SD_PHASE_HAS_EVAL flag.
We only have a non-singular volume closure and therefore no need to distinguish it.
2015-02-18 16:33:31 +01:00
a2366a3a2e Cleanup for Cycles hair shader ifdefs.
sc->T and sc->data2 were behind __HAIR__ ifdef, now they are not anymore, so we can always assign the correct value.
2015-02-18 15:57:39 +01:00
f3e831f02d Cycles: Fix for hair transmission BSDF not returning proper label 2015-02-17 22:40:00 +05:00
a0d7db503d Cycles: Small tweaks for Henyey Greenstein closure code.
* Avoid duplicative fabs(g) check in sample code.
* Avoid dot product in eval code.

Helps like ~1% when Scatter Anisotropy is 0.
2015-02-17 17:48:18 +01:00
bf878d3c3d Cycles: Remove empty closure blur code and the corresponding entries in the switch.
Most compilers will probably optimize that out, but I still don't see a reason to keep it.
2015-02-17 13:44:25 +01:00
fd4f0ed39e Cleanup: Remove unused code from hair BSDF. 2015-02-16 15:23:01 +01:00
7c3d5a3337 Cycles: Use some more bools in microfacet code. 2015-02-16 12:32:42 +01:00
b03ac83843 Cycles: Correction to glossy shaders not handling total internal reflection
The issue was caused by lack of check for whether fresnel term is actually
giving total internal reflection in refraction BSDFs. This lead to usage of
arbitrary vector of (0, 0, 0) as reflection, giving numeric issues in other
areas of the kernel.

This gives some visual changes of sharp reflection but it seems to be rather
proper now. Which also corresponds with rough glossy reflection with sharpness
set to 0.001 (previously it was totally different from sharpness of 0.0, which
is just weird).
2015-02-10 18:20:36 +05:00
d632ef7c66 Cycles: Use fast math functions in hair BSDF
Precision of the fast functions seems to be enough in there and
since the code was heavily using inverse trigonometric functions
this change gives few percent speedup on Victor's hair.

From the tests files from ctests storage doesn't have any meaningful
difference, hair on Victor is all below 4% absolute error and only
few pixels are exceeding 1% absolute difference.

In any case, let it be as it is currently so it allows us to have
fast math file in sources for it's further evaluation and possible
usage in other areas as well.
2015-01-31 01:49:41 +05:00
a3c13fa9e8 Cycles: Remove confusing labels usage in hair BSDF
BSDF sampler function shouldn't give labels it's not intended to do.
That said reflection shouldn't give transmission ray and transmission
give reflection ray.

Added an assert in the transmission sampling but reflection still
needs some investigation because even after recent fixes the check
for projection onto the reflected ray could give both positive and
negative values.

It shouldn't have any affect on renders just makes internal logic
consistent and unleashes an issue to be investigate further.
2015-01-30 14:00:24 +05:00
a1f4821b94 Fix T42212: Singular reflection pass is incorrect in regular path tracer
Issue seems to be caused by not totally proper pdf and eval values for this
closure. Changed it so they reflect to ggx/beckmann reflection with roughness
set to 0, which is effectively the same as the sharp reflection.
2015-01-20 03:03:45 +05:00
ee36e75b85 Cleanup: Fix Cycles Apache header.
This was already mixed a bit, but the dot belongs there.
2014-12-25 02:50:24 +01:00
0756fe2684 Cleanup: Remove SD_BSDF_GLOSSY flag, unused. 2014-11-20 07:53:22 +01:00
6d1c0260bb Cleanup: Style fixes for closures, mainly bitflags and conditions. 2014-10-29 09:56:21 +01:00
13ca9873c9 Cleanup: Remove unused function in Translucent BSDF. 2014-10-29 09:42:19 +01:00
0f1c3958da Fix typo breaking compilation with rather strict flags (does not like implicit double to float conversion). 2014-10-10 15:11:23 +02:00
5711025765 Cycles: Use a bit better approach for erfinv()
Also reduce number of branching and multiplications a bit by inlining the branches.

This gives an unmeasurable speedup, which is in case of BMW is about 2% here.
2014-10-10 13:40:09 +02:00
4b2fadeaba Cycles: Remove Westin closure.
Was hooked up last year for testing purposes, as we already had some code for it, but the closure itself is not really good nor really useful, so let's remove it.
2014-10-03 16:03:49 +02:00
02f58ac623 Cleanup: Spelling. 2014-10-03 15:28:52 +02:00
1e4d99368b Cycles: Use more accurate implementation of erf() and erfinv()
This functions are  orders of magnitude more accurate than the old ones,
and they're around the same complexity to compute.
2014-10-03 18:28:44 +06:00
da3be518b6 Comment out SVM fresnel_conductor() function for now, still unused. 2014-09-07 21:13:00 +02:00
aa8d25bf09 Cycles / OSL: Add a conductive fresnel shader template.
This adds a fresnel conductive OSL preset to the Text Editor. Based on a patch by Lukas Stockner.
Differential revision: https://developer.blender.org/D145

See the differential for details.
2014-09-07 18:28:59 +02:00
161815576f Cleanup: Remove __ANISOTROPIC__ define.
That was only needed in the beginning, when we did not had support for tangents. It's time to clean some of the defines up, it's getting a bit too much.
2014-08-20 23:23:14 +02:00
7bc87a372e Fix T40962: Ashikhmen Shirley shader fireflies 2014-08-19 20:58:58 +06:00
9c3025cd26 Spelling 2014-08-02 16:53:52 +10:00
a378f8d2d8 Fix T40964: Massive shading failures with glass node mixing, whiteouts and blackouts 2014-07-15 15:59:00 +06:00
e929dc2d8c Fix part of T40964: Glass shader was giving wrong results with OSL. 2014-07-06 13:07:35 +02:00
8fbd71e5f2 Cycles: improved Beckmann sampling using precomputed data
It turns out that the new Beckmann sampling function doesn't work well with
Quasi Monte Carlo sampling, mainly near normal incidence where it can be worse
than the previous sampler. In the new sampler the random number pattern gets
split in two, warped and overlapped, which hurts the stratification, see the
visualization in the differential revision.

Now we use a precomputed table, which is much better behaved. GGX does not seem
to benefit from using a precomputed table.

Disadvantage is that this table adds 1MB of memory usage and 0.03s startup time
to every render (on my quad core CPU).

Differential Revision: https://developer.blender.org/D614
2014-06-21 22:31:44 +02:00
b12151eceb Cycles: glossy and anisotropic BSDF changes
* Anisotropic BSDF now supports GGX and Beckmann distributions, Ward has been
  removed because other distributions are superior.
* GGX is now the default distribution for all glossy and anisotropic nodes,
  since it looks good, has low noise and is fast to evaluate.
* Ashikhmin-Shirley is now available in the Glossy BSDF.
2014-06-14 13:49:57 +02:00
ceb68e809e Cycles: internal code support for anisotropic Beckmann and GGX reflection
Based on:

Understanding the Masking-Shadowing Function in Microfacet-Based BRDFs
E. Heitz, Research Report 2014
2014-06-14 13:49:57 +02:00
Karsten Schwenk
8ce1090d4e Cycles: Ashikhmin-Shirley anisotropic BSDF
* Ashikhmin-Shirley anisotropic BSDF was added as closure
* Anisotropic BSDF node now has two distributions

Reviewers: brecht, dingto

Differential Revision: https://developer.blender.org/D549
2014-06-14 13:49:57 +02:00
f5cb0cf1a5 Cycles: improved importance sampling for Beckmann and GGX glossy
Samples render slower than before, but hopefully this is made up for with
reduced noise in most cases. The main slowdown comes from samples that would
previously be wasted and turn out black, which are now continued.

GGX sampling is about the same speed as before, while for Beckmann it is slower
still. Perhaps optimizations are still possible there, but didn't find anything
easy.

Code from this paper, which comes with sample code:

Importance Sampling Microfacet-Based BSDFs using the Distribution of Visible Normals.
E. Heitz and E. d'Eon, EGSR 2014

Differential Revision: https://developer.blender.org/D572
2014-06-14 13:49:56 +02:00
dc13969e48 Style cleanup: indentation, braces 2014-05-05 02:19:08 +10:00
8d16869d83 Code cleanup: Add -Werror=float-conversion to Cycles 2014-05-03 07:31:46 +10:00
8547d17739 Fix T38615: cycles rendering beckmann/GGX refraction wrong with IOR equal to 1. 2014-02-12 22:50:31 +01:00
a41648c1dc Cycles: add pass alpha threshold value to render layers.
Z, Index, normal, UV and vector passes are only affected by surfaces with alpha
transparency equal to or higher than this threshold. With value 0.0 the first
surface hit will always write to these passes, regardless of transparency. With
higher values surfaces that are mostly transparent can be skipped until an opaque
surface is encountered.
2014-02-06 15:24:15 +01:00
7b0a46b1ff Fix CUDA/OpenCL compile errors in scattering commit. 2014-01-07 15:48:04 +01:00