Realtime Compositor: Implement Anisotropic Kuwahara #110786
No reviewers
Labels
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset Browser
Interest
Asset Browser Project Overview
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
7 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: blender/blender#110786
Loading…
Reference in New Issue
No description provided.
Delete Branch "OmarEmaraDev/blender:gpu-anisotropic-kuwahara"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This patch implements the Anisotropic Kuwahara filter for the Realtime
compositor. The implementation is based on three papers on Anisotropic Kuwahara
filtering, presented and detailed in the code. The implementation is different
from the existing CPU implementation, but is a higher quality one that is also
an order of magnitude faster and conforms to the methods described in the paper.
The new implementation exposes two extra parameters that control the sharpness
and directionality of the output, giving more artistic freedom.
A comparison between an original image, the existing CPU algorithm, and the new GPU algorithm:
A comparison between the differenet sharpness levels:
A comparison between the different eccentricity levels:
Another comparison between an original image, the existing CPU algorithm, and the new GPU algorithm:
Iterative application of filter:
Temporal stability:
WIP: Realtime Compositor: Implement Anisotropic Kuwaharato Realtime Compositor: Implement Anisotropic Kuwahara@blender-bot package
Package build started. Download here when ready.
@ -996,6 +996,8 @@ typedef struct NodeKuwaharaData {
short size;
short variation;
int smoothing;
I think sharpness and smoothing are the same thing from a user's perspective. Ideally we shouldn't duplicate it, even if GPU and CPU implementations are different.
They are not really the same thing though, smoothing controls the homogeneity of the directions of neighbouring kuwahara sectors, while sharpness controls the sharpness of the sectors themselves. And to clarify, the GPU implementation uses both.
However, I do agree that we need to clarify the names, smoothing can probably be renamed to something that reflects its function more.
@ -24,0 +31,4 @@
GPU_SHADER_CREATE_INFO(compositor_kuwahara_anisotropic)
.local_group_size(16, 16)
.push_constant(Type::INT, "radius")
radius
is actually calledsize
in UI, right? I think we should unify naming here.radius
is also a fitting name for the classic variation so I don't mind renaming the UI parameter to radius.Size is used throughout filter nodes to denote filter window radius, so it is probably okay to leave it as Size for consistency.
Is it possible to rename radius to size in code then? I found it a bit hard to navigate code if variables get renamed from UI to implementation
Radius makes the code clearer and is what the paper uses, so it is a necessary evil I think.
@ -24,0 +32,4 @@
GPU_SHADER_CREATE_INFO(compositor_kuwahara_anisotropic)
.local_group_size(16, 16)
.push_constant(Type::INT, "radius")
.push_constant(Type::FLOAT, "eccentricity")
For me it's a bit hard to guess the effect of the parameter without reading documentation. Is
preserve edge
a better name maybe?You mean eccentricity? Preserve Edge is a bit misleading in that case, because as its documentation says, it defines how directional the filter is. Googling eccentricity gives good visualizations, so I think it is fine personally.
Yes I meant eccentricity sorry. I think I understand what it means, I was just thinking how hard it would be from a user's perspective. I'm ok with the naming if most users can understand the effect of the parameter from reading the tip tool.
Yes I meant eccentricity sorry. I think I understand what it means, I was just thinking how hard it would be from a user's perspective. I'm ok with the naming if most users can understand the effect of the parameter from reading the tip tool.
@ -0,0 +150,4 @@
vec2 rotated_disk_point = M_SQRT1_2 *
vec2(disk_point.x - disk_point.y, disk_point.x + disk_point.y);
/* Finally, we compute the other every other 4 weights starting from the 45 degreed rotated
typo?
I will reword that.
@blender-bot package
Package build started. Download here when ready.
@OmarEmaraDev Results actually look very good. Your solution also solves a few issues we have with CPU implementation. Nicely done! :)
I did a few tests and looked at the code and have a few comments:
radius
to 1, I see some black artefacts in the image. I had this issue with CPU implementation and currently it's solved by adding an offset to the user input. This way, the user can't input a wrong value.@blender-bot package
Package build started. Download here when ready.
Nice presentation and well documented code! I really like the results of the GPU implementation. Testing on M2 MacBook is waaay faster.
While the code and functionality of the GPU side all looks good to me, it would be really nice to align CPU implementation to the same algorithm. It is kind of unideal to leave parameters exposed which do not have affect outside of an experimental feature set.
@OmarEmaraDev Is it something you look into already, or can help with?
I asked about this on the Dev Forum and got the answer that the CPU implementation will be updated to match after this patch gets approved.
https://devtalk.blender.org/t/real-time-compositor-feedback-and-discussion/25018/427
@Sergey Yes, definitely. I was just waiting for this patch to get approved, and I will follow it with a patch for the CPU compositor just like we did for classic Kuwahara.
@OmarEmaraDev It is a bit of chicken and egg problem then :( If patch is approved then it means it can land. But landing code which exposes parameters which do not have affect in a default configuration is not something we do.
I don't think there are some major algorithmical changes which would need to be done to the GPU integration. There is tweak to be done for the radius of 1, but other than that I think all of us are happy with the results. Maybe you can work on a CPU implementation as part of this PR now? That would avoid situation when we violate UI/UX topics.
Just trying to find a way forward which keeps UI/UX principles we follow, but also keep all us happy :)
@Sergey Alright, I will update the pull request with the CPU implementation.
@blender-bot package
Package build started. Download here when ready.
@blender-bot package
Package build started. Download here when ready.
From testing and reading the code did not see anything to wrong.
P.S. Would be nice to somehow avoid duplication, but that's a separate known topic, not tor this PR.
Some observations from testing:
radius = 1
is solved, thanks :)Sharpness
andEccentricity
labels are cropped (see screenshot) after switching to anisotropic variation, so it looks like a typo.The overall patch still looks very good to me, so not sure my comments should be considered blocking...
@ -0,0 +1,40 @@
#pragma BLENDER_REQUIRE(gpu_shader_compositor_texture_utilities.glsl)
Looking at the CPU implementation made me wonder, isn't this a more general (approximation of) structure tensor which is not specific to Kuwahara filter? So is there a reason the implementation is not in
blender/compositor/realtime_compositor/algorithms
?Maybe, but that could be done when it is really needed.
@ -149,0 +237,4 @@
*
* Since the anisotropy is in the [0, 1] range, the factor tends to 1 as the eccentricity tends
* to infinity and tends to infinity when the eccentricity tends to zero. The stored eccentricity
* is in the range [0, 2], we map that to the range [infinity, 0.5] by taking the reciprocal,
Since the value gets mapped anyways, does it make sense to give the user the range
[0, 1]
and usePROP_FACTOR
for UI instead?We do already use
PROP_FACTOR
, but there is a good reason why I made the UI range[0, 2]
, because the maximum2
doubles the computed eccentricity, while the minimum0
zeros it. And 1 is an identity and leaves the computed eccentricity as is.True,
PROP_FACTOR
is already used. I doubt it's how users perceive it though, most artists probably would just see it as a slider with min and max value. But again, UI questions are always subjective, so it's fine for me if usage is clear to users :)Only some
const
correctness and avoidingpow(x, 2)
which caused some issues in the past.@ -101,0 +126,4 @@
/* Compute the overlap polynomial parameters for 8-sector ellipse based on the equations in
* section "3 Alternative Weighting Functions" of the polynomial weights paper. More on this
* later in the code. */
int number_of_sectors = 8;
Use
const
.@ -0,0 +30,4 @@
/* Compute the first and second eigenvalues of the structure tensor using the equations in
* section "3.1 Orientation and Anisotropy Estimation" of the paper. */
float eigenvalue_first_term = (dxdx + dydy) / 2.0;
float eigenvalue_square_root_term = sqrt(pow(dxdx - dydy, 2.0) + 4.0 * pow(dxdy, 2.0)) / 2.0;
Do not use
pow(x, 2)
. Usepow2f
orsquare_f
. You can introduce a version for vectors.@ -0,0 +87,4 @@
/* Compute the overlap polynomial parameters for 8-sector ellipse based on the equations in
* section "3 Alternative Weighting Functions" of the polynomial weights paper. More on this
* later in the code. */
int number_of_sectors = 8;
Use const.
@ -0,0 +14,4 @@
/* The weight kernels of the filter optimized for rotational symmetry described in section "3.2.1
* Gradient Calculation". */
float corner_weight = 0.182;
Use const. On both.
@ -0,0 +33,4 @@
/* We encode the structure tensor in a vec4 using a column major storage order. */
vec4 structure_tensor = vec4(dot(x_partial_derivative, x_partial_derivative),
dot(x_partial_derivative, y_partial_derivative),
Here you take twice the same dot product
dot(x_partial_derivative, y_partial_derivative)
. I guess it is just because the dot product is commutative. Still a bit confusing thought. Also that raises the question as to why store twice the same value. Maybe having a R16F + a RG16F texture is a better choice here. But if the textures are not used anywhere it might not be beneficial.That's correct. I just stored the extra value for clarity since that's what the matrix contains since I found it not worth it to use two R16F + a RG16F textures for that.
Fine by me. Maybe use
square_f
on CPU code too. I don't know about how wellpow
is optimized theses days.@blender-bot build
I'm seeing more than a 100x (!) speedup compared to CPU on high resolutions and high kernel sizes in particular. I've gone from having to process individual frames from the command line, maxing out a 24-core CPU for minutes at a time, to working interactively. Wild.
The fixed kernel size is a bit of a creative limitation. For example, the paper "Oil Painting Style Rendering Based on Kuwahara Filter" uses generated saliency image segmentation as input to vary the kernel size around areas of interest, which makes a huge difference. The saliency segmentation is a different beast, but feeding the existing Kuwahara node with a depth map, a vertex-painted detail map AOV, or even a custom roto matte or similar could be very powerful.
This might not be practical with the particular implementation being used, and it seems like this is wrapping up anyway. However, I didn't see any discussion of variable kernel size as a feature, so I thought I'd point it out. If it's as easy as flicking a switch, it might be worth looking into.
Just for reference, I simulated the effect here by manually blending between kernel sizes.
@DGruwier To clarify, you just want the radius to be exposed as an input, not the saliency map generation or the other methods in the paper, correct?
I haven't read the full paper yet, but that shouldn't be hard as far as can tell, except maybe for performance implications which we can avoid through specializations. I could make a test patch next week.
@OmarEmaraDev Right, exposing the radius as an input, nothing more.
I only mentioned the paper in reference to the idea of varying the kernel size across the image.