* Added support for uchar4 attributes to Cycles' attribute system.
* This is used for Vertex Colors now, which saves some memory (4 unsigned characters, instead of 4 floats).
* GPU Texture Limit on sm_20 and sm_21 decreased from 95 to 94, because we need a new texture for the uchar4 attributes. This is no problem for sm_30 or newer.
Part of my GSoC 2014.
This kernel is compiled with AVX2, FMA3, and BMI compiler flags. At the moment only Intel Haswell benefits from this, but future AMD CPUs will have these instructions as well.
Makes rendering on Haswell CPUs a few percent faster, only benchmarked with clang on OS X though.
Part of my GSoC 2014.
Instead of pre-calculation and storage, we now calculate the face normal during render.
This gives a small slowdown (~1%) but decreases memory usage, which is especially important for GPUs,
where you have limited VRAM.
Part of my GSoC 2014.
This makes the code a bit easier to understand, and might come in handy
if we want to reuse more Embree code.
Differential Revision: https://developer.blender.org/D482
Code by Brecht, with fixes by Lockal, Sergey and myself.
This means packed images and movies are now supported when using OSL
backend for material shading.
Uses special file name to distinguish whether image is builtin or not.
This part might become a bit smarted or optimized a bit, but it's good
enough with this implementation already.
Now baking does one AA sample at a time, just like final render. There is
also some code for shader antialiasing that solves T40369 but it is disabled
for now because there may be unpredictable side effects.
it's possible that runtime optimizer would call get_attribute
with NULL renderstate. As per documentation, it's valid to
return false in that cases and in worst case we'll just miss
some possible optimization.
Supporting such cases would require some bigger changes to
Cycles since attributes are only set to up for the kernel
after shader compilation.
Thanks Brecht for review!
Transparent objects could become subtly visible by the different sampling
patterns for pixels covered and not covered by the object. It still converged
to the right solution but that can take a while. Now we try to use the same
sampling pattern here.
This reverts commit 12abe94de8.
After a long discussion in the bug tracker we decided baking should use
the faces normals for glossy (and combined). This is what Blender
Internal is doing, and one of the more predictable way of yielding
predictable results.
That also means the result will not match the render perfectly, but this
is preferrable over the alternatives at hand.
Conflicts:
intern/cycles/kernel/kernel_bake.h
Comments from Brecht Van Lommel:
"""
Currently the viewing direction for each pixel is set to the normal, so
at every pixel glossy is evaluated as if you're looking straight at it.
Blender Internal works the same.
"""
This patch makes baking glossy as viewed from the camera.
Reviewers: brecht
CC: zanqdo
Differential Revision: https://developer.blender.org/D555
The kernel for baking the world texture was the same as the one used for
baking. Now that's separate which allows the kernel to reserve much less
memory.
CMake had this --fast-math flag but scons not, makes a big difference on some
files. Slightly slower rendering might still happen though, but it should not
be this much.
If we are using a mix node we still need to evaluate the BSDF lighting
even if scattering is successful.
Note: this was working for branched path (probably an oversight when
branched path support was introduced for baking, a good oversight though
;)
Instead of 95, we can use 145 images now. This only affects Kepler and above (sm30, sm_35 and sm_50).
This can be increased further if needed, but let's first test if this does not come with a performance impact.
Originally developed during my GSoC 2013.
This fixes the SSS Direct/Indirect passes as well as the Combined pass.
Patch reviewed and with fixes and contributions from Brecht van Lommel.
Note: displacement/bump map (related to the report) will be handled separately
Reviewers: brecht
Differential Revision: https://developer.blender.org/D503
Now the COMBINED pass includes the Ambient Occlusion.
This was not reported anywhere, but while working in the Subsurface Scattering I realize we needed this fix for combined.
This was faster for my AMD system but slower for Intel.
However with gcc4.9,-O3 I was able to get roughly the same speed before/after.
Revert since this isnt giving such clear benefits on most systems.