Commit Graph

158 Commits

Author SHA1 Message Date
23cc453975 Fix T48732: New GGX breaks OpenCL kernel
Make sure we don't perform any implicit address space conversion.

A bit annoying, but less intrusive approaches (like using temp private
variable in .cl kernel) do not work correct here.

Using generic address space will help from code side here, but will
be somewhat slower due to extra things happening as far as i know.
2016-06-28 17:15:35 +05:00
2a69b09b62 Fix T48732 v2: New GGX breaks OpenCL kernel
As far as I can see, the second issue there was that the functions receive a pointer to a member variable of the
ShaderData, which is stored in global memory. However, this means that the pointer points to global memory as well,
therefore OpenCL requires the ccl_addr_space "keyword" in front of the pointer.
With this commit, the OpenCL kernels build on Linux with the Intel CPU OpenCL runtime - however, they already did
without the change and I don't have an AMD card, so I can't really test whether the AMD runtime is happy as well now.
2016-06-26 00:51:16 +02:00
99088f8b55 Fix T48732, OpenCL compile failure after Multiscatter GGX commit.
Use OpenCL "all" builtin type for conversion, according to OpenCL 1.1 spec 6.3e.
2016-06-25 11:14:06 +02:00
23c276832b Cycles: Add multi-scattering, energy-conserving GGX as an option to the Glossy, Anisotropic and Glass BSDFs
This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the
multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model".

Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes
the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until
the ray leaves it again, which ensures perfect energy conservation.

In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing
roughness - is solved in a physically correct and efficient way.

The downside of this model is that it has no (known) analytic expression for evalation. However, it can be
evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the
balance heuristic guarantee an unbiased result at the cost of slightly higher noise.

Reviewers: dingto, #cycles, brecht

Reviewed By: dingto, #cycles, brecht

Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel

Differential Revision: https://developer.blender.org/D2002
2016-06-23 22:57:26 +02:00
7ac126e728 Fix T46492: GGX distribution produces black pixels
The issue was caused by some numerical instability.
2016-06-17 16:30:29 +02:00
84c68dcb3f Cycles: Minor cleanup, whitespace around keyword and preprocessor indent 2016-04-13 08:58:52 +02:00
700722f686 Cycles: Cleanup, indent nested preprocessor directives
Quite straightforward, main trick is happening in path_source_replace_includes().

Reviewers: brecht, dingto, lukasstockner97, juicyfruit

Differential Revision: https://developer.blender.org/D1794
2016-03-25 13:55:42 +01:00
69dc0c3192 Cycles: Fixes for Burley BSSRDF
There are several fixes in here, which hopefully will make the shader
working correct without too much magic in there.

First of all, this commit brings BURLEY_TRUNCATE down from 30 to 16
which reduces noise a lot. It's still higher than original truncate
from Brecht, but this reduces PDF value at a cutoff distance by an
order of magnitude (now it's 0.008387, previously it was 0.063521
for the albedo of 0.8 and radius 1.0). This should converge to a
proper result faster and don't have artifacts.

This kind of reverts fix for T47356, but after additional thinking
came to conclusion Burley is not being totally smooth, it is about
giving less waxy results which it's kind of doing in the file.

Second of all, this commit fixes burley_eval() to use normalized
diffusion reflectance. This matches the way we calculate CDF and
solves numeric instability close to 0, making PDF profile looking
closer to other SSS profiles:

  https://developer.blender.org/F282355
  https://developer.blender.org/F282356
  https://developer.blender.org/F282357

Reviewers: brecht

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D1792
2016-02-13 13:29:13 +01:00
2ac88328b0 Cycles: Fix Burley's CDF truncation after recent radius fix
This is all not really ideal, but good enough for tonight.
More thoughts and investigation tomorrow!
2016-02-08 21:50:38 +01:00
dae8326d1e Fix T47356: Too sharp falloww with Burley BSSRDF
After the clamping commit we need to bump BURLEY_TRUNCATE
constant a bit, otherwise mean free path does not really
match the disk radius needed for importance sampling.
2016-02-08 14:54:11 +01:00
4443bb8922 Fix Burley BSSRDF NaNs and fireflies.
Explicitly truncate to Rm same way as the Gaussian BSSRDF, and use safe_sqrtf()
to be sure in case of float precision issues.
2016-02-06 11:52:43 +01:00
e688a62712 Cycles: Fix for initial guess of the radius for Burley BSSRDF
The value was too high, causing bad Newton iteration step.
Now the value is not so good, but it's still within 9 iterations
and those high number of iterations are only happening in
approx 1% of input values.
2016-02-05 10:06:08 +01:00
3e7389eaf2 Cycles: Speedup of Christensen-Burley SSS falloff function
The idea is simply to pre-compute fitting and parameterization
in the bssrdf_setup() function and re-use the values in both
sample() and eval().

The only trick is where to store the pre-calculated values and
the answer is inside of ShaderClosure->custom{1,2,3}. There's
no memory bump here because we now simply re-use padding fields
for the pre-calculated values. Similar trick we can do for other
BSDFs.

Seems to give nice speedup up to 7% here on my desktop with
Core i7 CPU, SSE4.1 kernel.
2016-02-04 15:29:58 +01:00
ad26407b52 Cycles: Implement approximate reflectance profiles
Using this paper:

  http://graphics.pixar.com/library/ApproxBSSRDF/paper.pdf

This model gives less blurry results than the Cubic and Gaussian
we had implemented:

- Cubic: https://developer.blender.org/F279670
- Burley: https://developer.blender.org/F279671

The model is called "Christensen-Burley" in the interface, which
actually should be read as "Physically based" or "Realistic".

Reviewers: juicyfruit, dingto, lukasstockner97, brecht

Reviewed By: brecht, dingto

Subscribers: robocyte

Differential Revision: https://developer.blender.org/D1759
2016-02-04 13:27:23 +05:00
e42852a339 Cycles: Cleanup and reference actual paper used for BSSRDF sampling 2016-02-02 18:06:29 +01:00
12b7850d4f Cycles: Fix for transmissive microfacet sampling
This is an alternate fix for T40964 which resolves bad handling of
caustics reported in T45609.

There were too much transmission rays being discarded by the original
fix, which caused by caustic light being totally disabled. There is
still some room for investigation why exactly original paper didn't
work that well, could be caused by the way how the pdf is calculated.

In any case current results seems rather correct now.
2015-07-31 13:46:58 +02:00
6a0a205cb4 Cycles: Simplify volume_phase_eval().
This simplification is safe, as the call to volume_phase_eval() is guarded behind a CLOSURE_IS_PHASE check, which is equal to
CLOSURE_VOLUME_HENYEY_GREENSTEIN_ID. I don't think we will add more phase functions anytime soon, if at all.
2015-06-11 15:18:33 +02:00
7f4479da42 Cycles: OpenCL kernel split
This commit contains all the work related on the AMD megakernel split work
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else which we're forgetting to mention.

Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.

Not all the features are supported yet, and that being said no motion blur,
camera blur, SSS and volumetrics for now. Also transparent shadows are
disabled on AMD device because of some compiler bug.

This kernel is also only implements regular path tracing and supporting
branched one will take a bit. Branched path tracing is exposed to the
interface still, which is a bit misleading and will be hidden there soon.

More feature will be enabled once they're ported to the split kernel and
tested.

Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.

Based on the research paper:

  https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf

Here's the documentation:

  https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit

Design discussion of the patch:

  https://developer.blender.org/T44197

Differential Revision: https://developer.blender.org/D1200
2015-05-09 19:52:40 +05:00
ae7d84dbc1 Cycles: Use native saturate function for CUDA
This more a workaround for CUDA optimizer which can't optimize clamp(x, 0, 1)
into a single instruction and uses 4 instructions instead.

Original patch by @lockal with own modification:

  Don't make changes outside of the kernel. They don't make any difference
  anyway and term saturate() has a bit different meaning outside of kernel.

This gives around 2% of speedup in Barcelona file, but in more complex shader
setups with lots of math nodes with clamping speedup could be much nicer.

Subscribers: dingto

Projects: #cycles

Differential Revision: https://developer.blender.org/D1224
2015-04-28 00:38:32 +05:00
bc160d8a85 Cleanup: Code style. 2015-04-26 00:42:26 +02:00
e2354e64d2 Cycles: Cleanup, spaces around assignment operator
Did some bad spacing in recent commits, better to get rid of those so
they does not confuse those who're working on sources.
2015-04-07 00:25:54 +05:00
a9bb8d8a73 Cycles: de-duplicate fast/approximate erf function calculation
Our own implementation is in fact the same performance as in fast_math from
OpenShadingLanguage, but implementation from fast_math is using explicit madd
function, which increases chance of compiler deciding to use intrinsics.
2015-04-06 12:49:44 +05:00
b06962fcfe Cycles: Avoid using lookup table for Beckmann slopes on GPU
This patch is based on some work done in D788 and re-formulation from Beckmann
implementation in OpenShadingLanguage.

Skipping texture lookup helps a lot on GPUs where it's more expensive to access
texture memory than to do some extra calculation in threads.

CPU code still uses lookup-table based approach since this seems to be still
faster (at least on computers i've got access to).

This change gives about 2% speedup on BMW scene with GTX560TI.
2015-04-05 19:07:45 +05:00
252b36ce77 Cycles: Remove unused Beckmann slope sampling code
It did not preserve stratification too well and lookup-table approach was
working much better. There are now also some more interesting forumlation
from Wenzel and OpenShadingLanguage which should work better than old code.
2015-04-05 19:07:44 +05:00
af399884e1 Fix T44113: Ashikhmin-Shirley distribution of glossy shader at 0 roughness causes artifacts when background uses MIS
Was a division by zero error, solved in the same way as beckmann/ggx
deals with small roughness values.
2015-04-01 14:21:21 +05:00
5ff132182d Cycles: Code cleanup, spaces around keywords
This inconsistency drove me totally crazy, it's really confusing
when it's inconsistent especially when you work on both Cycles and
Blender sides.

Shouldn;t cause merge PITA, it's whitespace changes only, Git should
be able to merge it nicely.
2015-03-28 00:15:15 +05:00
87cff57207 Fix T44123: Cycles SSS renders black in recent builds
Issue was introduced in 01ee21f where i didn't notice *_setup()
function only doing partial initialization, and some of parameters
are expected to be initialized by callee function.

This was hitting only some setups, so tests with benchmark scenes
didn't unleash issues. Now it should all be fine.

This is to go to the 2.74 branch and we actually might re-AHOY.
2015-03-25 02:33:49 +05:00
ed7e593a4b Fix T43926: Volume scatter: intersecting objects GPU rendering artifacts
Fix T44007: Cycles Volumetrics: block artifacts with overlapping volumes

The issue was caused by uninitialized parameters of some closures, which
lead to unpredictable behavior of shader_merge_closures().
2015-03-23 12:48:33 +05:00
6f3500db05 Cleanup: Remove unused SD_PHASE_HAS_EVAL flag.
We only have a non-singular volume closure and therefore no need to distinguish it.
2015-02-18 16:33:31 +01:00
a2366a3a2e Cleanup for Cycles hair shader ifdefs.
sc->T and sc->data2 were behind __HAIR__ ifdef, now they are not anymore, so we can always assign the correct value.
2015-02-18 15:57:39 +01:00
f3e831f02d Cycles: Fix for hair transmission BSDF not returning proper label 2015-02-17 22:40:00 +05:00
a0d7db503d Cycles: Small tweaks for Henyey Greenstein closure code.
* Avoid duplicative fabs(g) check in sample code.
* Avoid dot product in eval code.

Helps like ~1% when Scatter Anisotropy is 0.
2015-02-17 17:48:18 +01:00
bf878d3c3d Cycles: Remove empty closure blur code and the corresponding entries in the switch.
Most compilers will probably optimize that out, but I still don't see a reason to keep it.
2015-02-17 13:44:25 +01:00
fd4f0ed39e Cleanup: Remove unused code from hair BSDF. 2015-02-16 15:23:01 +01:00
7c3d5a3337 Cycles: Use some more bools in microfacet code. 2015-02-16 12:32:42 +01:00
b03ac83843 Cycles: Correction to glossy shaders not handling total internal reflection
The issue was caused by lack of check for whether fresnel term is actually
giving total internal reflection in refraction BSDFs. This lead to usage of
arbitrary vector of (0, 0, 0) as reflection, giving numeric issues in other
areas of the kernel.

This gives some visual changes of sharp reflection but it seems to be rather
proper now. Which also corresponds with rough glossy reflection with sharpness
set to 0.001 (previously it was totally different from sharpness of 0.0, which
is just weird).
2015-02-10 18:20:36 +05:00
d632ef7c66 Cycles: Use fast math functions in hair BSDF
Precision of the fast functions seems to be enough in there and
since the code was heavily using inverse trigonometric functions
this change gives few percent speedup on Victor's hair.

From the tests files from ctests storage doesn't have any meaningful
difference, hair on Victor is all below 4% absolute error and only
few pixels are exceeding 1% absolute difference.

In any case, let it be as it is currently so it allows us to have
fast math file in sources for it's further evaluation and possible
usage in other areas as well.
2015-01-31 01:49:41 +05:00
a3c13fa9e8 Cycles: Remove confusing labels usage in hair BSDF
BSDF sampler function shouldn't give labels it's not intended to do.
That said reflection shouldn't give transmission ray and transmission
give reflection ray.

Added an assert in the transmission sampling but reflection still
needs some investigation because even after recent fixes the check
for projection onto the reflected ray could give both positive and
negative values.

It shouldn't have any affect on renders just makes internal logic
consistent and unleashes an issue to be investigate further.
2015-01-30 14:00:24 +05:00
a1f4821b94 Fix T42212: Singular reflection pass is incorrect in regular path tracer
Issue seems to be caused by not totally proper pdf and eval values for this
closure. Changed it so they reflect to ggx/beckmann reflection with roughness
set to 0, which is effectively the same as the sharp reflection.
2015-01-20 03:03:45 +05:00
ee36e75b85 Cleanup: Fix Cycles Apache header.
This was already mixed a bit, but the dot belongs there.
2014-12-25 02:50:24 +01:00
0756fe2684 Cleanup: Remove SD_BSDF_GLOSSY flag, unused. 2014-11-20 07:53:22 +01:00
6d1c0260bb Cleanup: Style fixes for closures, mainly bitflags and conditions. 2014-10-29 09:56:21 +01:00
13ca9873c9 Cleanup: Remove unused function in Translucent BSDF. 2014-10-29 09:42:19 +01:00
0f1c3958da Fix typo breaking compilation with rather strict flags (does not like implicit double to float conversion). 2014-10-10 15:11:23 +02:00
5711025765 Cycles: Use a bit better approach for erfinv()
Also reduce number of branching and multiplications a bit by inlining the branches.

This gives an unmeasurable speedup, which is in case of BMW is about 2% here.
2014-10-10 13:40:09 +02:00
4b2fadeaba Cycles: Remove Westin closure.
Was hooked up last year for testing purposes, as we already had some code for it, but the closure itself is not really good nor really useful, so let's remove it.
2014-10-03 16:03:49 +02:00
02f58ac623 Cleanup: Spelling. 2014-10-03 15:28:52 +02:00
1e4d99368b Cycles: Use more accurate implementation of erf() and erfinv()
This functions are  orders of magnitude more accurate than the old ones,
and they're around the same complexity to compute.
2014-10-03 18:28:44 +06:00
da3be518b6 Comment out SVM fresnel_conductor() function for now, still unused. 2014-09-07 21:13:00 +02:00
aa8d25bf09 Cycles / OSL: Add a conductive fresnel shader template.
This adds a fresnel conductive OSL preset to the Text Editor. Based on a patch by Lukas Stockner.
Differential revision: https://developer.blender.org/D145

See the differential for details.
2014-09-07 18:28:59 +02:00