The title says it all: move the EWA filter to BLI (currently in
math_interp.c) and use the function from both the BI renderer and the
compositor.
This gives the algorithm a single central place, so that
fixes and optimizations stay synchronized across the two usages.
This also fixes T41440: Displacement in compositing creates holes
Reviewers: campbellbarton, lukastoenne
Reviewed By: lukastoenne
Maniphest Tasks: T41440
Differential Revision: https://developer.blender.org/D748
Until now this was mainly about the OpenCL wrangler being duplicated
between Cycles and the Compositor, but with the OpenSubdiv work those
wranglers were going to be duplicated once again.
This commit makes Cycles and the Compositor use wranglers
from these repositories:
- https://github.com/CudaWrangler/cuew
- https://github.com/OpenCLWrangler/clew
These repositories are based on the wranglers we used before,
and they will likely continue to be maintained by us plus some
more players in the market.
A pretty straightforward change, with some tricks in
CMake/SCons to pass these libs to the linker
after all other libraries, so that OpenSubdiv can be linked
against those wranglers in the future.
For those who are worried about Cycles becoming less standalone:
that is not the case. It is rather more flexible now, and in the future
different wranglers might be used in Cycles. For now it just
means those libs need to be put into the Cycles repository
together with some other libs from Blender such as mikkspace.
This is mainly a platform maintenance commit; there should not be any
user-visible changes.
Reviewers: juicyfruit, dingto, campbellbarton
Reviewed By: juicyfruit, dingto, campbellbarton
Differential Revision: https://developer.blender.org/D707
This allows adding a "fake" sun beam effect, simulating crepuscular rays
from light being scattered in a medium like the atmosphere or deep water.
Such effects can be created also by renderers using volumetric lighting,
but the compositor feature is a lot cheaper and is independent from 3D
rendering. This makes it ideally suited for motion graphics.
The implementation uses an optimized accumulation method for gathering
color values along a line segment. The inner buffer loop uses fixed
offset increments to avoid unnecessary multiplications and avoids
variables by using compile-time specialization (see inline comments
for further details).
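As a rough illustration of the accumulation idea (not the actual operation code), the sketch below walks a run of pixels in a flat RGBA float buffer using one fixed element offset per step, and picks the step direction through template parameters so that each direction gets its own specialized inner loop. The name `accumulate_line`, its parameters and the buffer layout are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch: sum RGBA values along a straight run of pixels.
// StepX/StepY are compile-time constants in {-1, 0, 1}, so the compiler
// generates one specialized loop per direction.
// The caller must make sure all visited pixels lie inside the buffer.
template<int StepX, int StepY>
void accumulate_line(const float *buffer, int width, int x, int y, int num_steps, float result[4])
{
    result[0] = result[1] = result[2] = result[3] = 0.0f;
    if (num_steps <= 0) {
        return;
    }

    // Fixed offset between consecutive samples, computed once.
    const std::ptrdiff_t increment = (static_cast<std::ptrdiff_t>(StepY) * width + StepX) * 4;
    const float *pixel = buffer + (static_cast<std::ptrdiff_t>(y) * width + x) * 4;

    for (int c = 0; c < 4; ++c) {
        result[c] += pixel[c];
    }
    for (int i = 1; i < num_steps; ++i) {
        pixel += increment;  // pointer increment only, no index multiplication
        for (int c = 0; c < 4; ++c) {
            result[c] += pixel[c];
        }
    }
}
```

A caller would pick the specialization once per segment, for example `accumulate_line<1, 1>(...)` for a diagonal run, rather than branching on the direction for every sample.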
Proxy operations from muted nodes would still create conversion
operations where the datatypes don't match, which creates unexpected
behavior. Arguably datatype conversion could still happen even when the
main operation is muted, but this would be a design change and so is
disabled now.
Viewers were activated both inside the active group as well as the top
level tree (the latter being a quick fix for getting a fallback viewer).
This caused a race condition on the shared viewer image.
Now the active viewer is defined at node conversion time in the converter
so that only one can be active at a time without each node having to
follow complicated rules for exclusion.
Blender's current temporary data handling suffers from one major issue: the default 'temp' dir on Windows is never
automatically cleaned up, and can end up being quite big when used by Blender, especially when we have
to store per-process data (using getpid() in file names).
To address this, this patch:
* Divides tempdir paths in two: one for the 'base' temp dir (the same as the previous unique tempdir path),
the other a mkdtemp-generated sub-dir, specific to each Blender instance.
* Only uses the base tempdir when we need some shallow persistence across Blender sessions, and then we always
reuse the same filename (quit.blend...) or generate a small file (crash reports...).
* Uses temp sub-dir for heavy files like pointcache or renderEXRs (Save Buffer option).
* Erases temp sub-dir on quit or crash.
To get this working it also adds a working 'recursive delete' to BLI_delete() under Windows.
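The base/session split can be sketched roughly as follows. This is only an illustration, assuming a POSIX system (for mkdtemp()) and C++17 (for std::filesystem); the function names and the "blender_XXXXXX" pattern are hypothetical and not taken from the patch.

```cpp
#include <cassert>
#include <stdlib.h>  // mkdtemp() on glibc
#include <unistd.h>  // mkdtemp() on some other POSIX systems
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical sketch: create a unique per-process directory inside `base`.
// Returns an empty path on failure.
fs::path create_session_dir(const fs::path &base)
{
    std::string pattern = (base / "blender_XXXXXX").string();
    // mkdtemp() replaces the trailing XXXXXX in place and creates the directory.
    if (mkdtemp(&pattern[0]) == nullptr) {
        return fs::path();
    }
    return fs::path(pattern);
}

// Recursively remove the session directory; the base directory is untouched.
void remove_session_dir(const fs::path &session)
{
    std::error_code ec;
    fs::remove_all(session, ec);  // errors are ignored in this sketch
}
```

Heavy files would go below the session directory, so a single recursive delete on quit cleans them all up, while small persistent files stay in the base directory.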
Note that, as in current code, the 'recover render result' hack-feature that was possible
with SaveBuffer option is still removed. A real renderresult cache feature will be added
soon, though.
Reviewers: campbellbarton, brecht, sergey
Reviewed By: campbellbarton, sergey
CC: sergey
Differential Revision: https://developer.blender.org/D531
This gives a speedup of around 30% for the Gaussian blur node.
Pretty much straightforward implementation inside the node
itself, but needed to implement some additional things:
- Aligned malloc. It's needed to load data into SSE registers
faster. Based on the aligned_malloc() from Libmv with
some additional trickery going on to support arbitrary
alignment (this magic is needed because of MemHead).
In practice only 16-byte alignment is supported because
of the lack of an aligned malloc with arbitrary alignment
on OSX. Not a big deal for now because we only need 16-byte
alignment at the moment. Could be tweaked further
later.
- Memory buffers in the compositor are now aligned to 16 bytes.
Should be harmless for non-SSE cases too; just mentioning it.
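For readers unfamiliar with the general technique, here is a minimal sketch of an aligned allocation built on plain malloc(): over-allocate, round the address up, and stash the original pointer just before the aligned block. This is a generic illustration and not the guarded-allocator code from the patch; the function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical sketch: return a block whose address is a multiple of
// `alignment` (which must be a power of two). The pointer returned by
// malloc() is stored right before the aligned block for later freeing.
void *aligned_alloc_sketch(std::size_t size, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    // Room for the payload, the stored pointer, and worst-case padding.
    void *raw = std::malloc(size + sizeof(void *) + alignment - 1);
    if (raw == nullptr) {
        return nullptr;
    }
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void *);
    const std::uintptr_t aligned = (start + mask) & ~mask;
    reinterpret_cast<void **>(aligned)[-1] = raw;
    return reinterpret_cast<void *>(aligned);
}

// Free a block obtained from aligned_alloc_sketch().
void aligned_free_sketch(void *ptr)
{
    if (ptr != nullptr) {
        std::free(static_cast<void **>(ptr)[-1]);
    }
}
```

Storing a small header in front of the returned pointer is the same reason the real allocator needs extra care: any bookkeeping placed before the payload shifts where the aligned address can start.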
Reviewers: campbellbarton, lukastoenne, jbakker
Reviewed By: campbellbarton
CC: lockal
Differential Revision: https://developer.blender.org/D564
A node group can have multiple input nodes. In the compositor that means
each of the input sockets has to be connected to the linked outputs,
which is represented by a single link on the outside of the group.
This issue is because of a somewhat "special" behavior in old code, which got lost during rB09874df:
There was a variant of the `relinkConnections` function which would leave the socket completely unconnected. This is not really a valid state (given that each unconnected input must otherwise be connected to a constant `Set` type node), but was used as a way to distinguish connected alpha/depth sockets in composite and viewer output nodes.
https://developer.blender.org/diffusion/B/browse/master/source/blender/compositor/intern/COM_InputSocket.cpp;28a829893c702918afc5ac1945a06eaefa611594$69
After the large cleanup patch ({D309}) every socket is now automatically connected to a constant, such that `getInputSocketReader` will never return a NULL pointer. This breaks the previous test method, which needs to be replaced by more explicit flags. Luckily this was done only for very few output nodes (Composite, Viewer, Output-File). These now use the regular SetValueOperation default in case "use alpha" is disabled, but set this to an explicit 1.0 value instead of mapping to the node socket.
Many parts of the compositor are unnecessarily complicated. This patch
aims at reducing the complexity of writing nodes and making the code
more transparent.
== Separating Nodes and Operations ==
Currently these are both mixed in the same graph, even though they have
very different purposes and are used at distinct stages in the
compositing process. The patch introduces dedicated graph classes for
nodes and for operations.
This removes the need for a lot of special case checks (isOperation etc.)
and explicit type casts. It simplifies the code since it becomes clear
at every stage what type of node we are dealing with. The compiler can
use static typing to avoid common bugs from mixing up these types and
fewer runtime sanity checks are needed.
== Simplified Node Conversion ==
Converting nodes to operations was previously based on "relinking", i.e.
nodes would start by mirroring links in the Blender DNA node trees,
then add operations and redirect these links to them. This was very hard
to follow in many cases and required a lot of attention to avoid invalid
states.
Now there is a helper class called the NodeConverter, which is passed to
nodes and implements a much simpler API for this process. Nodes can add
operations and explicit connections as before, but defining "external"
links to the inputs/outputs of the original node now uses mapping
instead of directly modifying link data. Input data (node graph) and
result (operations graph) are cleanly separated.
== Removed Redundant Data Structures ==
A few redundant data structures have been removed, notably the
SocketConnection. These are only needed temporarily during graph
construction. For executing the compositor operations it is perfectly
sufficient to store only the direct input link pointers. A common
pointer indirection is avoided this way (which might also give a little
performance improvement).
== Avoid virtual recursive functions ==
Recursive virtual functions are evil. They are very hard to follow
during debugging. At least in the parts this patch is concerned with
these functions have been replaced by a non-virtual recursive core
function (which might then call virtual non-recursive functions if
needed). See for example NodeOperationBuilder::group_operations.
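The pattern can be sketched as below. This is not the code of NodeOperationBuilder::group_operations; the `Operation` type, the `is_complex` hook and the function names are hypothetical and only illustrate a non-virtual traversal that calls a virtual, non-recursive query.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical operation type: the virtual hook is deliberately non-recursive.
struct Operation {
    std::vector<Operation *> inputs;
    virtual ~Operation() {}
    // Answers a question about this operation only; never walks the graph.
    virtual bool is_complex() const { return false; }
};

struct BlurOperation : Operation {
    bool is_complex() const override { return true; }
};

// Non-virtual recursive core: the graph traversal lives in exactly one place.
static void collect_complex_recursive(Operation *op,
                                      std::set<Operation *> &visited,
                                      std::vector<Operation *> &result)
{
    if (!visited.insert(op).second) {
        return;  // already handled through another link
    }
    for (Operation *input : op->inputs) {
        collect_complex_recursive(input, visited, result);
    }
    if (op->is_complex()) {
        result.push_back(op);
    }
}

// Entry point: collect every complex operation reachable from `output`.
std::vector<Operation *> collect_complex(Operation *output)
{
    std::set<Operation *> visited;
    std::vector<Operation *> result;
    collect_complex_recursive(output, visited, result);
    return result;
}
```

With this split a debugger only ever steps through one recursive function, and subclasses cannot accidentally change how the graph is walked.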
The issue was caused by readEWA spending loads of time trying
to sample regions outside of the buffer. Solved by adding an early
exit check.
We could also clamp the sampling region to the rect, but it's
not clear whether the weights would be correct in that case,
so this is left for the future.
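An early exit of this kind could look roughly like the sketch below. The `Rect` type, the half-open convention and the function names are assumptions for illustration; the real check in readEWA may be formulated differently.

```cpp
#include <cassert>

// Hypothetical rectangle, half-open: [xmin, xmax) x [ymin, ymax).
struct Rect {
    int xmin, xmax, ymin, ymax;
};

// True when the two rectangles share at least one pixel.
bool rects_overlap(const Rect &a, const Rect &b)
{
    return a.xmin < b.xmax && b.xmin < a.xmax && a.ymin < b.ymax && b.ymin < a.ymax;
}

// Hypothetical filtered read: returns false and writes transparent black
// without visiting any pixel when the filter footprint misses the buffer.
bool read_filtered(const Rect &buffer_rect, const Rect &footprint, float result[4])
{
    if (!rects_overlap(buffer_rect, footprint)) {
        result[0] = result[1] = result[2] = result[3] = 0.0f;
        return false;
    }
    // The expensive per-pixel accumulation would run here (omitted in this sketch).
    return true;
}
```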
This was suggested by Christopher Barrett (terrachild). Corner pin is a common feature in compositing.
The corners for the plane warping can be defined by using vector node inputs to allow using perspective plane transformations without having to go via the MovieClip editor tracking data.
Uses the same math as the PlaneTrack node, but without the link to MovieClip and Object.
{F78199}
The code for PlaneTrack operations has been restructured a bit to share it with the CornerPin node.
* PlaneDistortCommonOperation.h/.cpp: Shared generic code for warping images based on 4 plane corners and a perspective matrix generated from these. Contains operation base classes for both the WarpImage and Mask operations.
* PlaneTrackOperation.h/.cpp: Current plane track node operations, based on the common code above. These add pointers to MovieClip and Object which define the track data from which to read the corners.
* PlaneCornerPinOperation.h/.cpp: New corner pin variant, using explicit input sockets for the plane corners.
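To make the "perspective matrix from 4 corners" idea concrete, here is a sketch of the textbook unit-square-to-quad construction. It is not necessarily the routine the shared operation code calls; the types and function names are hypothetical, and the corner order (0,0), (1,0), (1,1), (0,1) is an assumption of this example.

```cpp
#include <cassert>
#include <cmath>

struct Point {
    double x, y;
};

// Row-major 3x3 matrix mapping homogeneous (u, v, 1) to (x*w, y*w, w).
struct Homography {
    double m[3][3];
};

// Standard square-to-quad construction. Corners are given in the order
// (0,0), (1,0), (1,1), (0,1) of the unit square. Assumes a non-degenerate quad.
Homography square_to_quad(const Point c[4])
{
    const double dx1 = c[1].x - c[2].x, dy1 = c[1].y - c[2].y;
    const double dx2 = c[3].x - c[2].x, dy2 = c[3].y - c[2].y;
    const double sx = c[0].x - c[1].x + c[2].x - c[3].x;
    const double sy = c[0].y - c[1].y + c[2].y - c[3].y;
    const double det = dx1 * dy2 - dx2 * dy1;
    const double g = (sx * dy2 - dx2 * sy) / det;
    const double h = (dx1 * sy - sx * dy1) / det;
    Homography H = {{
        {c[1].x - c[0].x + g * c[1].x, c[3].x - c[0].x + h * c[3].x, c[0].x},
        {c[1].y - c[0].y + g * c[1].y, c[3].y - c[0].y + h * c[3].y, c[0].y},
        {g, h, 1.0},
    }};
    return H;
}

// Apply the matrix to a point of the unit square, including the perspective divide.
Point apply_homography(const Homography &H, double u, double v)
{
    const double w = H.m[2][0] * u + H.m[2][1] * v + H.m[2][2];
    Point p = {(H.m[0][0] * u + H.m[0][1] * v + H.m[0][2]) / w,
               (H.m[1][0] * u + H.m[1][1] * v + H.m[1][2]) / w};
    return p;
}
```

For warping an image the inverse direction is usually needed (output pixel back to input coordinates), which would mean inverting this matrix or constructing the quad-to-square mapping instead.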
One downside of the current compositor design is that there is no concept of invariables (constants) that don't vary over the image space. This has already been an issue for Blur nodes (size input is usually constant except when "variable size" is enabled) and a few others. For the corner pin node it is necessary that the corner input sockets are also invariant. They have to be evaluated for each tile now, otherwise the data is not available. This in turn makes it necessary to make the operation "complex" and request full input buffers, which adds unnecessary overhead.
nodes (Blur) causes crash due to chained read/write buffer operations.
The way read/write buffer operations are created for both the wrapped
translate node and then the "complex" blur node creates a chain of
buffers in the same ExecutionGroup. This leaves the later write buffer
operations without a proper "executor" group and fails on assert.
Solution for now is to check for existing output buffer operations like
it already happens for inputs. This is extremely ugly code, but should
become a lot more transparent after compositor cleanup ({D309}).
As discussed in T38340 the solution is to use the current scene from
context whenever feasible.
Composite does not use node->id at all now, the scene which owns the
compositing node tree is retrieved from context instead.
Defocus node->id is made editable by the user. By default it is not set,
which also will make it use the contextual scene and camera info.
The node->id pointer in Defocus is **not** cleared in older blend files.
This is done for backward compatibility: the node will then behave as
before in untouched scenes.
File Output nodes also don't store scene in node->id. This is only needed
when creating a new node for initializing the file format.
Reviewers: brecht, jbakker, mdewanchand
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D290
EWA sampling is designed for downsampling images, i.e. scaling down the size of
input image pixels, which happens regularly in compositing. While the standard
sampling methods (linear, cubic) work reasonably well for linear
transformations, they don't yield good results in non-linear cases like
perspective projection or arbitrary displacement. EWA sampling is comparable to
mipmapping, but avoids problems with discontinuities.
To work correctly the EWA algorithm needs partial derivatives of the mapping
functions which convert output pixel coordinates back into the input image
space (2x2 Jacobian matrix). With these derivatives the EWA algorithm
projects ellipses into the input space and accumulates colors over their
area. This calculation was not done correctly in the compositor: only the
derivatives du/dx and dv/dy were calculated, which basically means it only
worked for non-rotated input images.
The patch introduces full derivative calculations du/dx, du/dy, dv/dx, dv/dy for
the 3 nodes which use EWA sampling currently: PlaneTrackWarp, MapUV and
Displace. In addition the calculation of the ellipse area and axis-aligned
bounding boxes has been fixed.
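The following sketch shows why all four derivatives matter for the bounding box. It uses the plain geometric fact that the unit pixel circle maps to an ellipse under the 2x2 Jacobian; the struct and function names are hypothetical and this is not the compositor's actual code.

```cpp
#include <cassert>
#include <cmath>

struct Jacobian {
    float dudx, dudy, dvdx, dvdy;
};

struct Extent {
    float half_u, half_v;
};

// The unit circle x^2 + y^2 = 1 in output space maps to the ellipse
// (u, v) = (dudx*x + dudy*y, dvdx*x + dvdy*y) in input space.
// Its axis-aligned half-extents are the lengths of the Jacobian rows.
Extent ellipse_half_extent(const Jacobian &j)
{
    Extent e = {std::sqrt(j.dudx * j.dudx + j.dudy * j.dudy),
                std::sqrt(j.dvdx * j.dvdx + j.dvdy * j.dvdy)};
    return e;
}
```

For a 90 degree rotation the diagonal terms du/dx and dv/dy are both zero, so a calculation that looks only at those two would collapse the footprint to nothing, while the full Jacobian still yields a one-pixel extent.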
For the MapUV and Displace nodes the derivatives have to be estimated by
evaluating the UV/displacement inputs with 1-pixel offsets, which can still have
problems on discontinuities and sub-pixel variations. These potential problems
can only be alleviated by more radical design changes in the compositor
functions, which are out of scope for now. Basically the values passed to the
UV/Displacement inputs would need to be associated with their 1st order
derivatives, which requires a general approach to derivatives in all nodes.
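A minimal sketch of the 1-pixel-offset estimate, assuming forward differences and a callable that returns the UV value at a pixel position. The types and names are hypothetical; the actual nodes read their inputs through the compositor's own reader interface.

```cpp
#include <cassert>
#include <functional>

struct UV {
    float u, v;
};

struct Derivatives {
    float dudx, dudy, dvdx, dvdy;
};

// Estimate the Jacobian of `mapping` at (x, y) with forward differences
// over one pixel. This is exact for linear mappings and only an
// approximation otherwise, in particular across discontinuities.
Derivatives estimate_derivatives(const std::function<UV(float, float)> &mapping, float x, float y)
{
    const UV center = mapping(x, y);
    const UV right = mapping(x + 1.0f, y);
    const UV up = mapping(x, y + 1.0f);
    Derivatives d = {right.u - center.u, up.u - center.u,
                     right.v - center.v, up.v - center.v};
    return d;
}
```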
Changes for VC2013
Now I can build Blender with VC2013 with Cycles, Collada, OpenEXR and OpenImageIO disabled. Also, you need VC2008 SP1 installed to make the old libs compatible.
Distinguish the 3 different methods for acquiring pixel color values (executePixel, executePixelSampled, executePixelFiltered).
This makes it easier to keep track of the different sampling methods (and works nicer with IDEs that do code parsing).
Differential Revision: http://developer.blender.org/D7
This was my own error in r60049, which fixed the chunk number calculation. It mixed int and unsigned int values from ExecutionGroup, which leads to huge chunk numbers that are then skipped.
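The general class of bug can be reproduced with a few lines. The functions below are made up for illustration and are not the expressions from ExecutionGroup; they only show how an unsigned operand turns a small negative intermediate into a huge value.

```cpp
#include <algorithm>
#include <cassert>

// Buggy variant: `offset` is unsigned, so (x - offset) is evaluated as
// unsigned and wraps around to a huge value whenever x < offset.
unsigned int chunk_index_mixed(int x, unsigned int offset, unsigned int chunk_size)
{
    return (x - offset) / chunk_size;
}

// Fixed variant: keep the arithmetic in signed int and clamp at zero.
int chunk_index_signed(int x, int offset, int chunk_size)
{
    return std::max(x - offset, 0) / chunk_size;
}
```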
Debug code for graphviz output moved to a dedicated file COM_Debug.h/cpp.
The DebugInfo class has only static functions, which are called from a number of places to keep track of what is happening in the compositor. If debugging is disabled these are just inline stubs, so we don't need #ifdefs everywhere and don't get any overhead.
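The stub approach can be sketched as follows. The macro `COM_DEBUG_SKETCH`, the class name and its two functions are hypothetical stand-ins and not the real DebugInfo interface; the point is only that call sites stay free of preprocessor conditionals.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical switch; a real build would set this from the build system.
#ifndef COM_DEBUG_SKETCH
#  define COM_DEBUG_SKETCH 0
#endif

class DebugInfoSketch {
 public:
#if COM_DEBUG_SKETCH
  // Debug build: remember a readable name for each node.
  static void set_node_name(const void *node, const std::string &name)
  {
    names()[node] = name;
  }
  static std::string get_node_name(const void *node)
  {
    std::map<const void *, std::string>::const_iterator it = names().find(node);
    return (it != names().end()) ? it->second : std::string();
  }

 private:
  static std::map<const void *, std::string> &names()
  {
    static std::map<const void *, std::string> map;
    return map;
  }
#else
  // Release build: empty inline stubs that the compiler can drop entirely.
  static void set_node_name(const void *, const std::string &) {}
  static std::string get_node_name(const void *) { return std::string(); }
#endif
};
```

Callers simply write `DebugInfoSketch::set_node_name(op, "Blur")` in both configurations; only the class body knows whether anything is recorded.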
The graphviz output is much more useful now. DebugInfo keeps track of node names in a static string map for meaningful names. It uses a number of colors for various special operation classes.
ExecutionGroups are indicated in graphviz with clusters.
Currently the graphviz .dot files are stored in the BLI_temporary_dir() folder. A separate dot file is generated for each stage of the ExecutionGroup scheduling. This is intended to give some idea of the compositor progress, but could still be improved.
The chunk indices for scheduling chunks based on a given area were calculated incorrectly. This caused chunks at the very border of the render (pixels 256..257) to be omitted, leading to incorrect values in the Z buffer of the test file, which in turn caused wrong normalization range and the resulting almost-white image.
Also added a dedicated executePixel function for Z buffer to avoid any interpolation of Z values.
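As an illustration of the border case, the sketch below computes which chunks a pixel span touches along one axis. The half-open span convention and the names are assumptions of this example, not the formula used in the fix.

```cpp
#include <cassert>

// Inclusive range of chunk indices along one axis.
struct ChunkRange {
    int first, last;
};

// Chunks touched by the half-open pixel span [min, max), with 0 <= min < max.
// The last chunk is derived from the last covered pixel (max - 1), so a
// one-pixel-wide strip past a chunk boundary still selects its own chunk.
ChunkRange chunks_for_span(int min, int max, int chunk_size)
{
    ChunkRange range = {min / chunk_size, (max - 1) / chunk_size};
    return range;
}
```

With a chunk size of 256, a 257 pixel wide render needs chunks 0 and 1, even though the second chunk covers only a single column.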
- add missing headers from cmake (own omission)
- quiet rna_test.c unused define warnings.
- minor style edits
- spelling corrections and ignore all uppercase words with spell checking script.
It was caused by a copy-paste mistake, which replaced the check
"whether alpha channel is enabled" with "whether alpha channel
is not zero" (and alpha is always zero in the accumulator).
The compositor always works with RGBA, so there is no need to do any special
checks here.
TODO: Maybe the MapUV node should ignore the alpha channel?
This commit includes all the changes made for plane tracker
in tomato branch.
Movie clip editor changes:
- Artist might create a plane track out of multiple point
tracks which belong to the same track (minimum amount of
point tracks is 4, maximum is not actually limited).
When a new plane track is added, it gets "tracked"
across all point tracks, which makes it stick to the same
plane the point tracks belong to.
- After a plane track has been added, it needs to be manually adjusted
so that it covers the feature one might want to mask/replace.
General transform tools (G, R, S) or sliding corners with
the mouse can be used for this. The plane corner which
corresponds to the left bottom image corner has X/Y axes
drawn on it (red is for the X axis, green for Y).
- Re-adjusting plane corners makes the plane get "re-tracked"
for the frame sequence between the current frame and the next
and previous keyframes.
- Keyframes might be removed from the plane using the Shift-X
(Marker Delete) operator. However, currently a manual
re-adjustment or "re-track" trigger is needed.
Compositor changes:
- Added new node called Plane Track Deform.
- User selects which plane track to use (for this they need
to select the movie clip datablock, object and track names).
- Node gets an image input, which needs to be warped into
the plane.
- Node outputs:
* Input image warped into the plane.
* Plane, rasterized to a mask.
Masking changes:
- Mask points might be parented to a plane track, which
makes the point deform as if it belonged
to the tracked plane.
Some video tutorials are available:
- Coder video: http://www.youtube.com/watch?v=vISEwqNHqe4
- Artist video: https://vimeo.com/71727578
This is mine and Keir's holiday code project :)
- fix thumbnail preview (previously it showed only one input)
- make SplitViewer node update even if the second input is not connected
- now it works when the first socket is connected to a zero-sized node tree (e.g. Color Input node)
- SplitViewer node is now based on 2 operations: SplitOperation and ViewerOperation.
- ViewerBaseOperation was removed as a redundant one. Any future viewer style node can use the same principle and prepare the output before passing to an actual ViewerOperation.
Thanks Lukas Toenne for reviewing this patch and giving me a few pieces of advice.