This patch adds a new Vertex Color node. The node also returns the alpha
of the vertex color layer as an output.
Reviewers: brecht
Differential Revision: https://developer.blender.org/D5767
Float2 are now a new type for attributes in Cycles. Before, the choices
for attribute storage were float and float3, the latter padded to
float4. This meant that UV maps were inflated to twice the size
necessary.
Reviewers: brecht, sergey
Reviewed By: brecht
Subscribers: #cycles
Tags: #cycles
Differential Revision: https://developer.blender.org/D4409
This commit contains the first part of the new Cycles denoising option,
which filters the resulting image using information gathered during rendering
to get rid of noise while preserving visual features as well as possible.
To use the option, enable it in the render layer options. The default settings
fit a wide range of scenes, but the user can tweak individual settings to
control the tradeoff between a noise-free image, image details, and calculation
time.
Note that the denoiser may still change in the future and that some features
are not implemented yet. The most important missing feature is animation
denoising, which uses information from multiple frames at once to produce a
flicker-free and smoother result. These features will be added in the future.
Finally, thanks to all the people who supported this project:
- Google (through the GSoC) and Theory Studios for sponsoring the development
- The authors of the papers I used for implementing the denoiser (more details
on them will be included in the technical docs)
- The other Cycles devs for feedback on the code, especially Sergey for
mentoring the GSoC project and Brecht for the code review!
- And of course the users who helped with testing, reported bugs and things
that could and/or should work better!
We started to run out of bits there, so now we separate flags
which came from __object_flags and which are either runtime or
coming from __shader_flags.
Rule now is: SD_OBJECT_* flags are to be tested against new
object_flags field of ShaderData, all the rest flags are to
be tested against flags field of ShaderData.
There should be no user-visible changes, and time difference
should be minimal. In fact, from tests here can only see hardly
measurable difference and sometimes the new code is somewhat
faster (all within a noise floor, so hard to tell for sure).
Reviewers: brecht, dingto, juicyfruit, lukasstockner97, maiself
Differential Revision: https://developer.blender.org/D2428
Some of the files were wrongly attributing code to some other
organizations and in few places proper attribution was missing.
This is mainly either a copy-paste error (when new file was
created from an existing one and header wasn't updated) or due
to some refactor which split non-original-BF code with purely
BF code.
Should solve some confusion around.
Using ones complement for detecting if transform has been applied was confusing
and led to several bugs. With this proper checks are made.
Also added a few transforms where they were missing, mostly affecting baking
and displacement when `P` is used in the shader (previously `P` was in the
wrong space for these shaders)
Also removed `TIME_INVALID` as this may have resulted in incorrect
transforms in some cases.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2192
Adds a descriptor for attributes that can easily be passed around and extended
to contain more data. Will be used for attributes on subdivision meshes.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2110
There are several internal changes for this:
First idea is to make __tri_verts to behave similar to __tri_storage,
meaning, __tri_verts array now contains all vertices of all triangles
instead of just mesh vertices. This saves some lookup when reading
triangle coordinates in functions like triangle_normal().
In order to make it efficient needed to store global triangle offset
somewhere. So no __tri_vindex.w contains a global triangle index which
can be used to read triangle vertices.
Additionally, the order of vertices in that array is aligned with
primitives from BVH. This is needed to keep cache as much coherent as
possible for BVH traversal. This causes some extra tricks needed to
fill the array in and deal with True Displacement but those trickery
is fully required to prevent noticeable slowdown.
Next idea was to use this __tri_verts instead of __tri_storage in
intersection code. Unfortunately, this is quite tricky to do without
noticeable speed loss. Mainly this loss is caused by extra lookup
happening to access vertex coordinate.
Fortunately, tricks here and there (i,e, some types changes to avoid
casts which are not really coming for free) reduces those losses to
an acceptable level. So now they are within couple of percent only,
On a positive site we've achieved:
- Few percent of memory save with triangle-only scenes. Actual save
in this case is close to size of all vertices.
On a more fine-subdivided scenes this benefit might become more
obvious.
- Huge memory save of hairy scenes. For example, on koro.blend
there is about 20% memory save. Similar figure for bunny.blend.
This memory save was the main goal of this commit to move forward
with Hair BVH which required more memory per BVH node. So while
this sounds exciting, this memory optimization will become invisible
by upcoming Hair BVH work.
But again on a positive side, we can add an option to NOT use Hair
BVH and then we'll have same-ish render times as we've got currently
but will have this 20% memory benefit on hairy scenes.
This commit contains all the work related on the AMD megakernel split work
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else which we're forgetting to mention.
Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.
Not all the features are supported yet, and that being said no motion blur,
camera blur, SSS and volumetrics for now. Also transparent shadows are
disabled on AMD device because of some compiler bug.
This kernel is also only implements regular path tracing and supporting
branched one will take a bit. Branched path tracing is exposed to the
interface still, which is a bit misleading and will be hidden there soon.
More feature will be enabled once they're ported to the split kernel and
tested.
Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.
Based on the research paper:
https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf
Here's the documentation:
https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit
Design discussion of the patch:
https://developer.blender.org/T44197
Differential Revision: https://developer.blender.org/D1200
Using this paper: Sven Woop, Watertight Ray/Triangle Intersection
http://jcgt.org/published/0002/01/05/paper.pdf
This change is expected to address quite reasonable amount of reports from the
bug tracker, plus it might help reducing the noise in some scenes.
Unfortunately, it's currently about 7% slower than the previous solution with
pre-computed triangle plane equations, but maybe with some smart tweaks to the
code (tests reshuffle, using SIMD in a nice way or so) we can avoid the speed
regression.
But perhaps smartest thing to do here would be to change single triangle / ray
intersection with multiple triangles / ray intersections. That's how Embree does
this and it's watertight single ray intersection is not any faster that this.
Currently only triangle intersection is modified accordingly to the paper, in
the future we would also want to modify the node / ray intersection.
Reviewers: brecht, juicyfruit
Subscribers: dingto, ton
Differential Revision: https://developer.blender.org/D819
This way extending intersection routines with some pre-calculation step wouldn't
explode the single file size, hopefully keeping them all in a nice maintainable
state.
Not as if it gives noticeable changes render-time, but it's just weird to
convert float4 to float 3 to just access individual x/y/z components.
Plus some compilers might be more stupid than GCC and don't optimize this
out well.
tri_shader does no longer need to a float.
Reviewers: dingto, sergey
Reviewed By: dingto, sergey
Subscribers: dingto
Projects: #cycles
Differential Revision: https://developer.blender.org/D789
Root of the issue goes back to the on-fly normals commit and the
latest fix for it wasn't actually correct. I've mixed two fixes
in there.
So the idea here goes back to storing negative scaled object flag
and flip runtime-calculated normal if this flag is set, which is
pretty much the same as the original fix for the issue from me.
The issue with motion blur wasn't caused by the rumtime normals
patch and it had issues before, because it already did runtime
normals calculation. Now made it so motion triangles takes the
negative scale flag into account.
This actually makes code more clean imo and avoids rather confusing
flipping code in mesh.cpp.
Fix T41115: Motion Blur renders Objects Black - But not in Viewport Preview
This actually extends previous fix to normals and makes it all much nicer now.
Worth doing some intense testing, quick one worked just fine but there always
could be some corner cases.
Fix T41079: Solid black render of object with negative scale and smooth shading
In both cases the issue was caused by negative scaled objects with single mesh
users for which scale gets applied when using static BVH.
Since the on-fly normals calculation land normals for such cases weren't flipped
leading them to point to a wrong direction.
Added a special object flag for this, which is a bit of a bummer because now
we've got less bits for real useful things, but this is the only way to get
proper normals without adding more complexity in the on-fly calculations.
* Added support for uchar4 attributes to Cycles' attribute system.
* This is used for Vertex Colors now, which saves some memory (4 unsigned characters, instead of 4 floats).
* GPU Texture Limit on sm_20 and sm_21 decreased from 95 to 94, because we need a new texture for the uchar4 attributes. This is no problem for sm_30 or newer.
Part of my GSoC 2014.
Instead of pre-calculation and storage, we now calculate the face normal during render.
This gives a small slowdown (~1%) but decreases memory usage, which is especially important for GPUs,
where you have limited VRAM.
Part of my GSoC 2014.