Quite striaghtforward change, and in theory we can even try supporting motion
blur for the corner pin node (which is tricky because coordinates actually
coming from sockets, but with some black magic should be doable).
The issue was caused by the conflict between preview render which would set
R_NO_IMAGE_LOAD flag on the renderer and texture samplers called outside of
the render pipeline trying to use this flag.
Now the sampler functions accepts extra argument so render pipeline can
still skip image load, but calls outside of the pipeline will nicely load
all the images.
Not cleanest change in the world but good enough to unlock gooseberry team,
and assuming we already had pool passed all over the place it should be all
fine.
Will need to reshuffle arguments into SamplerOptions structure later.
Different interpolation methods in compositor could lead to 0.5 pixel offset in
final renders. This is because of some inconsistency in integer coordinates
which might mean pixel corner or pixel center.
Should be all fine now.
The compostor used a fixed size of 4 floats to hold pixel data. this
patch will select size of a pixel based on its type.
It uses 1 float for Value, 3 float for vector and 4 floats for color
data types.
When benchmarking on shots (opening shot of caminandes) we get a
reduction of memory of 30% and a tiny speedup as less data
transformations needs to take place (but these are negligable.
More information of the patch can be found on
https://developer.blender.org/D627 and
http://wiki.blender.org/index.php/Dev:Ref/Proposals/Compositor2014_p1.1_TD
Developers: jbakker & mdewanchand
Thanks for Sergey for his indept review.
Buffers should actually be cleared before running operations on them,
but this doesn't work for some reason.
Note also that the sunbeams node can show some creases and hard aliasing
when the source point is close to a bright area with strong gradient.
To fix this a better filtering algorithm, dithering or ray sampling
would need to be implemented. In the meantime simply blurring the
sunbeams result a bit should help (or simply avoid putting the source on
a bright spot).
Pretty much straightforward change which gives around 30%
speedup on my laptop and around 2x speedup on desktop in
the BI (which uses gts580). Tested with huge blurs (like
10% of blur) which was rather common during Caminandes.
For now OpenCL is only limited for blur size more than
100 pixels.
This is a bit experimental still, feedback is welcome.
Reviewers: jbakker, lukastoenne
Subscribers: ton
Differential Revision: https://developer.blender.org/D576
The UV values includes the image width/height now. To restore the
previous method as close as possible (even though it is not documented
anywhere how this is supposed to work), we have to ignore this scaling.
This ensures that the beams color does not darken along borders,
by using the last valid color of the ray as the border color (extending
colors in the direction of the source point).
The sunbeams node was clamping the range of influence to start at 1
pixel distance from the source. This was a poor fix for artifacts caused
by an off set in buffer coordinates. Since the u coordinate starts at
ceil(umax) the v coordinate also has to use ceil. This also fixes some
discontinuities that became visible when the source point is close to
a sharp line in the input image.
For now it was mainly about OpenCL wrangler being duplicated
between Cycles and Compositor, but with OpenSubdiv work those
wranglers were gonna to be duplicated just once again.
This commit makes it so Cycles and Compositor uses wranglers
from this repositories:
- https://github.com/CudaWrangler/cuew
- https://github.com/OpenCLWrangler/clew
This repositories are based on the wranglers we used before
and they'll be likely continued maintaining by us plus some
more players in the market.
Pretty much straightforward change with some tricks in the
CMake/SCons to make this libs being passed to the linker
after all other libraries in order to make OpenSubdiv linked
against those wranglers in the future.
For those who're worrying about Cycles being less standalone,
it's not truth, it's rather more flexible now and in the future
different wranglers might be used in Cycles. For now it'll
just mean those libs would need to be put into Cycles repository
together with some other libs from Blender such as mikkspace.
This is mainly platform maintenance commit, should not be any
changes to the user space.
Reviewers: juicyfruit, dingto, campbellbarton
Reviewed By: juicyfruit, dingto, campbellbarton
Differential Revision: https://developer.blender.org/D707
This allows adding a "fake" sun beam effect, simulating crepuscular rays
from light being scattered in a medium like the atmosphere or deep water.
Such effects can be created also by renderers using volumetric lighting,
but the compositor feature is a lot cheaper and is independent from 3D
rendering. This makes it ideally suited for motion graphics.
The implementation uses am optimized accumulation method for gathering
color values along a line segment. The inner buffer loop uses fixed
offset increments to avoid unnecessary multiplications and avoids
variables by using compile-time specialization (see inline comments
for further details).
Proxy operations from muted nodes would still create conversion
operations where the datatypes don't match, which creates unexpected
behavior. Arguably datatype conversion could still happen even when the
main operation is muted, but this would be a design change and so is
disabled now.
This gives around 30% of speedup for gaussian blur node.
Pretty much straightforward implementation inside the node
itself, but needed to implement some additional things:
- Aligned malloc. It's needed to load data onto SSE registers
faster. based on the aligned_malloc() from Libmv with
some additional trickery going on to support arbitrary
alignment (this magic is needed because of MemHead).
In the practice only 16bit alignment is supported because
of the lack of aligned malloc with arbitrary alignment
for OSX. Not a bit deal for now because we need 16 bytes
alignment at this moment only. Could be tweaked further
later.
- Memory buffers in compositor are now aligned to 16 bytes.
Should be harmless for non-SSE cases too. just mentioning.
Reviewers: campbellbarton, lukastoenne, jbakker
Reviewed By: campbellbarton
CC: lockal
Differential Revision: https://developer.blender.org/D564
The node was using sampler from the callee node and passed
it to the input nodes. Since the fact that compositor output
node uses NEAREST interpolation (why it uses nearest is the
whole separate story) it's not possible to have subpixel
precision in such cases:
<image> -> <translate> -> <output>
For now solving by hard-coding translate node to use BILINEAR
interpolation. It can't become worse in this node anyway and
the sampling pipeline is to be re-visited from scratch.
This commit pretty much reverts all the changes related on tile-ability
of the fast gaussian blur. It's not tilable by definition and would almost
always give you seams on the tile boundaries.
Atmind already met the issue and tried to solve it by increasing some
magic constant, which is pretty much likely simply made it so compositor
switched to full-frame calculation in that particular .blend file.
Fast gaussian is really not a production thing and need to be avoided.
We're to improve speed of normal gaussian blur instead.
The formula was not consistent across Blender and behaved strangely, now it is
a simple linear blend between color1 and min(color1, color2).
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D489
This issue is because of a somewhat "special" behavior in old code, which got lost during rB09874df:
There was a variant of the `relinkConnections` function which would leave the socket completely unconnected. This is not a valid state really (given that each unconnected input must otherwise connected to a constant `Set` type node), but was used as a way to distinguish connected alpha/depth sockets in composite and viewer output nodes.
https://developer.blender.org/diffusion/B/browse/master/source/blender/compositor/intern/COM_InputSocket.cpp;28a829893c702918afc5ac1945a06eaefa611594$69
After the large cleanup patch ({D309}) every socket is now automatically connected to a constant, such that `getInputSocketReader` will never return a NULL pointer. This breaks the previous test method, which needs to be replaced by more explicit flags. Luckily this was done only for very few output nodes (Composite, Viewer, Output-File). These now use the regular SetValueOperation default in case "use alpha" is disabled, but set this to an explicit 1.0 value instead of mapping to the node socket.
Many parts of the compositor are unnecessarily complicated. This patch
aims at reducing the complexity of writing nodes and making the code
more transparent.
== Separating Nodes and Operations ==
Currently these are both mixed in the same graph, even though they have
very different purposes and are used at distinct stages in the
compositing process. The patch introduces dedicated graph classes for
nodes and for operations.
This removes the need for a lot of special case checks (isOperation etc.)
and explicit type casts. It simplifies the code since it becomes clear
at every stage what type of node we are dealing with. The compiler can
use static typing to avoid common bugs from mixing up these types and
fewer runtime sanity checks are needed.
== Simplified Node Conversion ==
Converting nodes to operations was previously based on "relinking", i.e.
nodes would start with by mirroring links in the Blender DNA node trees,
then add operations and redirect these links to them. This was very hard
to follow in many cases and required a lot of attention to avoid invalid
states.
Now there is a helper class called the NodeConverter, which is passed to
nodes and implements a much simpler API for this process. Nodes can add
operations and explicit connections as before, but defining "external"
links to the inputs/outputs of the original node now uses mapping
instead of directly modifying link data. Input data (node graph) and
result (operations graph) are cleanly separated.
== Removed Redundant Data Structures ==
A few redundant data structures have been removed, notably the
SocketConnection. These are only needed temporarily during graph
construction. For executing the compositor operations it is perfectly
sufficient to store only the direct input link pointers. A common
pointer indirection is avoided this way (which might also give a little
performance improvement).
== Avoid virtual recursive functions ==
Recursive virtual functions are evil. They are very hard to follow
during debugging. At least in the parts this patch is concerned with
these functions have been replaced by a non-virtual recursive core
function (which might then call virtual non-recursive functions if
needed). See for example NodeOperationBuilder::group_operations.
which tiles are selected for input there was always a constant for correcting
the accuracy.
It seems that the constant was not enough and has been adjusted. (2 => 3).