This is also an important mathematical operation that can be folded
if it is known that one argument is a certain constant. For colors
the operation is provided as a Gamma node.
The SVM Gamma node needs a small fix to make it follow the 0 ^ 0 == 1
rule, same as the Power node, or the Gamma node itself in OSL mode.
Reviewers: #cycles
Differential Revision: https://developer.blender.org/D2263
Using ones complement for detecting if transform has been applied was confusing
and led to several bugs. With this proper checks are made.
Also added a few transforms where they were missing, mostly affecting baking
and displacement when `P` is used in the shader (previously `P` was in the
wrong space for these shaders)
Also removed `TIME_INVALID` as this may have resulted in incorrect
transforms in some cases.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2192
Bump mapping was happening in world space while displacement happens in object
space, causing shading errors when displacement type was used with bump mapping.
To fix this the proper transforms are added to bump nodes. This is only done
for automatic bump mapping however, to avoid visual changes from other uses of
bump mapping. It would be nice to do this for all bump mapping to be consistent
but that will have to wait till we can break compatibility.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2191
The two SVM nodes added with e7ea1ae78c caused a slowdown on AMD cards when rendering with OpenCL, whether displacement was used or not.
In the Barcelona Pavillon scene on a RX480, this would cause a 12% slowdown.
Therefore, this commit adds a additional flag for feature-adaptive compilation so that the new SVM nodes are only enabled when they are needed (Node tree connected to the Displacement output and Displacement type set to Both).
Also, the nodes were also added to shaders when the Displacement Type was set to Bump (the default), which was unneccessary and is fixed now.
Thanks to linda2 on IRC for reporting and testing and to maiself for help with the displacement shader code.
This fix might be relevant for 2.78, but it should be tested further before including it.
Object coordinates can now be used in the displacement shader and will give
correct results, where as before bump mapping was calculated from the displace
positions and resulted in incorrect shading.
This works by evaluating the shader in two parts, first bump then surface, and
setting the shader state to match what it would be if the surface was
undisplaced for the bump shader evaluation. Currently only `P` is set as if
undisplaced, but other shader variables could be set as well, such as `I` or
`time`. Since these aren't set to anything meaningful for displacement I left
them out of this patch, we can decide what to do with them separately.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2156
There basically are two issues here: in smooth mode (and all non-tangent
normal map types) it doesn't invert the normal for backfacing polys;
on the other hand for flat shaded tangent type it is inverted too soon.
This fix does a brute force correction by checking the backfacing flag.
Reviewers: #cycles, brecht
Reviewed By: #cycles, brecht
Differential Revision: https://developer.blender.org/D2181
Changes from microdisplacement work broke previous support for subdivision
meshes, sometimes leading to crashes; this makes things work again. Files
that contain "patch" nodes will need to be updated to use meshes instead, as
specifying patches was both inefficient and completely unsupported by the new
subdivision code.
This way OpenCL devices can also benefit from a smaller memory footprint, when using e.g. bumpmaps (greyscale, 1 channel).
Additional target for my GSoC 2016.
Now we have the 4 component ones first (float4, byte4, half4) followed by the 1 component ones (float, byte, half).
Makes code a bit more consistent and also reduces code a bit when enabling half support on GPU in next commit.
This also exposed a typo in half CPU images for 3D textures, which wasn't used yet, but good to have that one fixed anyway.
Adds a descriptor for attributes that can easily be passed around and extended
to contain more data. Will be used for attributes on subdivision meshes.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2110
All the changes are mainly giving explicit tips on inlining functions,
so they match how inlining worked with previous toolkit.
This make kernel compiled by CUDA 8 render in average with same speed
as previous kernels. Some scenes are somewhat faster, some of them are
somewhat slower. But slowdown is within 1% so far.
On a positive side it allows us to enable newer generation cards on
buildbots (so GTX 10x0 will be officially supported soon).
These are complex nodes, and it's conceivable they may end up constant
in some circumstances within node groups, so folding support is useful.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2084
This adds support for ngons and attributes on subdivision meshes. Ngons are
needed for proper attribute interpolation as well as correct Catmull-Clark
subdivision. Several changes are made to achieve this:
- new primitive `SubdFace` added to `Mesh`
- 3 more textures are used to store info on patches from subd meshes
- Blender export uses loop interface instead of tessface for subd meshes
- `Attribute` class is updated with a simplified way to pass primitive counts
around and to support ngons.
- extra points for ngons are generated for O(1) attribute interpolation
- curves are temporally disabled on subd meshes to avoid various bugs with
implementation
- old unneeded code is removed from `subd/`
- various fixes and improvements
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2108
- In fresnel_dielectric, the differentials calculation sometimes divided by zero.
- When the normal map was (0.5, 0.5, 0.5), the code would try to normalize a zero vector. Now, it just uses the regular normal as a fallback.
- The approximate error function used in Beckmann sampling sometimes overflowed to inf while calculating r^16. The final value is 1 - 1/r^16, however,
so now it just returns 1 if the computation would overflow otherwise.
This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the
multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model".
Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes
the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until
the ray leaves it again, which ensures perfect energy conservation.
In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing
roughness - is solved in a physically correct and efficient way.
The downside of this model is that it has no (known) analytic expression for evalation. However, it can be
evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the
balance heuristic guarantee an unbiased result at the cost of slightly higher noise.
Reviewers: dingto, #cycles, brecht
Reviewed By: dingto, #cycles, brecht
Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel
Differential Revision: https://developer.blender.org/D2002
Some closures were missing from calculation, leading to an array
under-allocation, presumable causing memory corruption issues with
emission shaders on OpenCL and was causing issues with Volume 3D
textures with CUDA.
The issue was identified by Thomas Dinges, the patch is different
from the original D2006. See the brief discussion there. Current
approach is similar (or the same) as Brecht suggested.
This adds support for CUDA Texture objects (also known as Bindless textures) for Kepler GPUs (Geforce 6xx and above).
This is used for all 2D/3D textures, data still uses arrays as before.
User benefits:
* No more limits of image textures on Kepler.
We had 5 float4 and 145 byte4 slots there before, now we have 1024 float4 and 1024 byte4.
This can be extended further if we need to (just change the define).
* Single channel textures slots (byte and float) are now supported on Kepler as well (1024 slots for each type).
ToDo / Issues:
* 3D textures don't work yet, at least don't show up during render. I have no idea whats wrong yet.
* Dynamically allocate bindless_mapping array?
I hope Fermi still works fine, but that should be tested on a Fermi card before pushing to master.
Part of my GSoC 2016.
Reviewers: sergey, #cycles, brecht
Subscribers: swerner, jtheninja, brecht, sergey
Differential Revision: https://developer.blender.org/D1999
Title says it all, this adds OpenCL float4 texture support.
There is a bug in the code still, I get a "Out of ressources error" on nvidia hardware here, not sure whats wrong yet.
Will investigate further, but maybe someone else has an idea. :)
Reviewers: #cycles, brecht
Subscribers: brecht, candreacchio
Differential Revision: https://developer.blender.org/D1983
Instead of treating Fermi GPU limits as default,
and overriding them for other devices,
we now nicely set them for each platform.
* Due to setting values for all platforms,
we don't have to offset the slot id for OpenCL anymore,
as the image manager wont add float images for OpenCL now.
* Bugfix: TEX_NUM_FLOAT_IMAGES was always 5, even for CPU,
so the code in svm_image.h clamped float textures with alpha on CPU after the 5th slot.
Reviewers: #cycles, brecht
Reviewed By: #cycles, brecht
Subscribers: brecht
Differential Revision: https://developer.blender.org/D1925
Seems particular CUDA implementations has some precision issues,
which made integer coordinate (which was expected to always be
positive) to go negative.
The improved Hosek / Wilkie model was added during my GSoC 2013 and the default since then.
The older model was kinda kept for compatibility, but after more than 2 years it's time to remove it.
The Hosek / Wilkie model is more realistic anyway, and people who really want a day / night transition can mix the Sky Shader with another one (e.g. color) and fade between the two.
The issue here was actually somewhere else - the attached scene from the report used a light falloff node in a sunlamp (aka distant light).
However, since distant lamps set the ray length to FLT_MAX and the light falloff node squares this value, it overflows and produces a NaN
weight, which propagates and leads to a NaN intensity, which is then clamped to zero and produces the black pixels.
To fix that issue, the smoothing part of the light falloff is just ignored if the smoothing term isn't finite (which makes sense since
the term should converge to 1 as the distance increases).
The reason for the different results on CPUs and GPUs is not perfectly clear, but probably can be explained with different handling of
Inf/NaN edge cases.
Also, to notice issues like these faster in the future, kernel_asserts were added that evaluate as false as soon as a non-finite intensity is produced.
Supports both smoke/fire and point density textures now.
Reduces number of textures available for sm_20 and sm_21, but you have
to compromise somewhere on such a limited hardware.
Currently limited to linear interpolation only, and decoupled ray
marching is not supported yet. Think those could be considered just a
further improvement.
Some quick example:
https://developer.blender.org/F282934
Code is minimal and we can fully consider it a fix for missing
support of 3D textures with CUDA.
Reviewers: lukasstockner97, brecht, juicyfruit, dingto
Reviewed By: brecht, juicyfruit, dingto
Subscribers: mib2berlin
Differential Revision: https://developer.blender.org/D1806