Compositor: CPU vs GPU Differences #118548

Open
opened 2024-02-21 09:55:35 +01:00 by Omar Emara · 0 comments
Member

Each of the following sections describes one difference between the CPU and GPU compositors: its
problem, its potential solution, the nodes it affects, and the failed regression tests where it
manifests.

Anisotropic Filtering

Problem

The GPU compositor uses the hardware anisotropic filtering capabilities of the GPU. Since each
vendor, GPU, and driver might have a different implementation, we can't really unify the results.

Solution

Reimplement our own anisotropic filter.

Affected Nodes

This affects the following nodes:

  • Corner Pin.
  • Displace.
  • Map UV.
  • Plane Track.

Failed Tests

  • node corner pin.
  • node displace.
  • node map u v.
  • node plane track image.
  • node plane track motion.

Jitter Anti-Aliasing (blender/blender!118853)

Problem

CPU uses an 8-sample jitter multi-sample anti-aliasing algorithm to anti-alias some masks. GPU
compositor uses bilinear interpolation with zero boundary to achieve anti-aliasing.
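
As a rough illustration of the two approaches, the sketch below computes the anti-aliased value of a pixel on a mask edge both ways. The jitter offsets, the half-plane mask, and all function names are hypothetical; the issue does not give the actual sample pattern or code used by either compositor.

```python
import math

# Hypothetical 8-sample jitter pattern: sub-pixel offsets in [0, 1).
# The pattern the CPU compositor actually uses is not stated in this issue.
JITTER_OFFSETS = [
    (0.0625, 0.4375), (0.1875, 0.9375), (0.3125, 0.1875), (0.4375, 0.6875),
    (0.5625, 0.0625), (0.6875, 0.5625), (0.8125, 0.3125), (0.9375, 0.8125),
]


def inside_half_plane(x, y, edge_x=2.5):
    """Hypothetical mask shape: everything left of the line x = edge_x."""
    return x < edge_x


def jittered_coverage(px, py, inside=inside_half_plane):
    """Average 8 binary inside/outside tests at jittered sub-pixel positions."""
    hits = sum(1 for ox, oy in JITTER_OFFSETS if inside(px + ox, py + oy))
    return hits / len(JITTER_OFFSETS)


def texel(mask, ix, iy):
    """Read a texel from a 2D list; anything outside the image is zero."""
    if 0 <= iy < len(mask) and 0 <= ix < len(mask[0]):
        return mask[iy][ix]
    return 0.0


def bilinear_zero_boundary(mask, x, y):
    """Bilinear sample with texel centers at integer + 0.5 and a zero boundary."""
    fx, fy = x - 0.5, y - 0.5
    x0, y0 = math.floor(fx), math.floor(fy)
    tx, ty = fx - x0, fy - y0
    top = texel(mask, x0, y0) * (1 - tx) + texel(mask, x0 + 1, y0) * tx
    bottom = texel(mask, x0, y0 + 1) * (1 - tx) + texel(mask, x0 + 1, y0 + 1) * tx
    return top * (1 - ty) + bottom * ty
```

The jittered version quantizes coverage to multiples of 1/8, while the bilinear version produces a continuous ramp that depends on where the sample lands relative to the image border, so the two edges generally do not match pixel for pixel.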

Solution

Use SMAA in all compositor operations that require anti-aliasing.

Affected Nodes

  • Corner Pin.
  • Plane Track.

Failed Tests

  • node corner pin plane.
  • node plane track motion plane.
  • node plane track plane.

SMAA (blender/blender!119414)

Problem

The SMAA operation produces different results between CPU and GPU. GPU uses the original SMAA
library, while CPU uses a C++ port of the library.

Solution

No solution known yet. Needs investigation to figure out where the difference originates.

Affected Nodes

  • Anti-Alias.
  • ID Mask.
  • Z Combine.
  • Dilate.

Failed Tests

  • node dilate threshold.

Fast Gaussian

Problem

GPU does not implement the Fast Gaussian blur mode and falls back to the slow convolution algorithm.
However, the CPU implementation seems broken and could better match the standard Gaussian for low
dynamic range images. Further, CPU sometimes uses Fast Gaussian as a utility in other operations,
like the Glare node, where a normal blur should be used instead since the blur radius is very small.

Solution

Implement Fast Gaussian for GPU, port the same implementation to CPU.

Affected Nodes

  • Blur.

Failed Tests

  • node blur fast gaussian.
  • compositor-nodes-desintegrate-wipe-01.

Curve Maps (Indirectly fixed by blender/blender!118624)

Problem

CPU and GPU evaluate curve maps slightly differently, which can produce a tiny difference on the
order of 0.001. Further, curve maps are stored as half floats on the GPU, so we lose some
precision, making the difference worse.

Solution

Don't use the hardware sampler and interpolate the curve maps manually. Use full float for curve
maps.
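
To give a feel for the precision involved, here is a minimal sketch that interpolates a curve table manually and compares full-float storage against half-float storage. The table size, the example curve, and the function names are assumptions made for illustration, not the compositor's actual data layout.

```python
import struct


def to_half(value):
    """Round-trip a float through IEEE 754 half precision (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def sample_curve(table, t):
    """Linearly interpolate a curve table by hand; t is clamped to [0, 1]."""
    position = min(max(t, 0.0), 1.0) * (len(table) - 1)
    index = min(int(position), len(table) - 2)
    fraction = position - index
    return table[index] * (1.0 - fraction) + table[index + 1] * fraction


# Hypothetical curve: a gamma-like mapping stored in 257 entries.
full_table = [(i / 256) ** 0.8 for i in range(257)]
half_table = [to_half(v) for v in full_table]

# Largest deviation caused purely by storing the table as half floats.
max_error = max(
    abs(sample_curve(full_table, i / 1000) - sample_curve(half_table, i / 1000))
    for i in range(1001)
)
print(f"max error from half-float storage: {max_error:.6f}")
```

Half floats carry an 11-bit significand, so values in [0.5, 1) are spaced about 0.0005 apart. Storage alone can therefore account for an error of a few ten-thousandths before any difference in evaluation is considered.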

Affected Nodes

  • RGB Curves.
  • Vector Curves.
  • Color Ramp.

Failed Tests

  • node curve vec.

Clipped Transformations (blender/blender!119278)

Problem

The result of some transformations is clipped in CPU, but only in background rendering, so this is a
bug in the CPU implementation rather than a difference.

Solution

Unknown, still needs investigation.

Affected Nodes

  • Flip.
  • Scale.
  • Rotate.
  • Transform.
  • Translate.

Failed Tests

  • flip scalerotate.

Linear Space Color Conversion (blender/blender!118624)

Problem

CPU produces slightly different values after sRGB to Linear color transformations. For instance,
pixels of value 1.0 become 1.00002408. According to Sergey:

The sRGB transform uses ExponentWithLinearTransform, which internally does SSE SIMD for pow, and that might not be as precise as per-component powf from libc. #110895

Solution

Do the color space conversion on the CPU using the same OIIO method, or move to a more accurate method on the CPU.
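
For reference, the sketch below evaluates the standard piecewise sRGB decoding formula per component in double precision and measures how far the reported value 1.00002408 is from it. This is the textbook formula, not the `ExponentWithLinearTransform` code path mentioned above; the function name is hypothetical.

```python
def srgb_to_linear(value):
    """Standard piecewise sRGB decoding for a single component."""
    if value <= 0.04045:
        return value / 12.92
    return ((value + 0.055) / 1.055) ** 2.4


reference = srgb_to_linear(1.0)
reported = 1.00002408  # value quoted in this issue for the CPU compositor

# Express the deviation in units of float32 spacing just above 1.0 (2**-23).
deviation_in_ulps = (reported - reference) / 2.0 ** -23
print(f"reference: {reference!r}, deviation: {deviation_in_ulps:.0f} float32 ulps")
```

A deviation of this size is far larger than single-precision rounding, which is consistent with the explanation that an approximate SIMD `pow` is involved rather than ordinary rounding noise.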

Affected Nodes

  • Image.
  • Movie Clip.

Failed Tests

  • node keying matte.
  • node chroma matte.
  • distorted bw key.

Legacy Cryptomatte (blender/blender!118570)

Problem

GPU doesn't have an implementation.

Solution

It is unclear what this node represents or whether it will be removed. Implement it for GPU regardless.

Affected Nodes

  • Legacy Cryptomatte.

Failed Tests

  • node cryptomatte legacy.

Fog Glare

Problem

CPU convolves the image with an arbitrary point spread function. GPU uses a Bloom algorithm.

Solution

Move Bloom to a separate option and port its implementation to CPU. Add an FFT implementation for
GPU and use that to do the convolution, getting rid of the arbitrary PSF for both CPU and GPU.
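
The idea of doing the convolution through an FFT can be sketched in one dimension as follows. This is a minimal pure-Python radix-2 FFT written for illustration only; the function names are hypothetical, and a real implementation would work on 2D images and use an optimized FFT.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def fft_convolve(signal, kernel):
    """Circular convolution: multiply the spectra, then transform back."""
    n = len(signal)
    product = [a * b for a, b in zip(fft(signal), fft(kernel))]
    return [value.real / n for value in fft(product, inverse=True)]


def direct_convolve(signal, kernel):
    """Reference circular convolution computed directly in O(n^2)."""
    n = len(signal)
    return [
        sum(signal[j] * kernel[(i - j) % n] for j in range(n)) for i in range(n)
    ]


# A single bright pixel spread by a small symmetric kernel (wraps around).
signal = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]
blurred = fft_convolve(signal, kernel)
```

Both functions compute the same circular convolution, but the FFT route costs O(n log n) instead of O(n^2), which is what makes convolving with a large kernel practical.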

Affected Nodes

  • Glare.

Failed Tests

  • node glare fog glow.
  • Fire2.

Denoise (blender/blender!118553)

Problem

CPU always adds the normal and albedo passes even if they are not connected, and inflates their
values to full buffers.

Solution

Only add the normal and albedo passes if they are actually connected and are full buffers.

Affected Nodes

  • Denoise.

Failed Tests

  • denoise.

Interpolation

Problem

CPU and GPU interpolate images differently, but it is likely that the GPU is more correct.

Solution

Unknown, needs more investigation.

Affected Nodes

  • Corner Pin.
  • Displace.
  • Lens Distortion.
  • Map UV.
  • Movie Distortion.
  • Plane Track.
  • Rotate.
  • Translate.
  • Scale.
  • Transform.
  • Stabilize 2D.
  • Directional Blur.
  • Sun Beams.

Failed Tests

  • node corner pin.
  • node displace.
  • node lens distortion negative.
  • node lens distortion positive.
  • node map u v.
  • node movie distortion distort.
  • node movie distortion undistort.
  • node plane track image.
  • node plane track motion image.
  • node rotate.
  • node scale.
  • node stabilize 2d.
  • node stabilize 2d invert.
  • node transform.
  • node translate.
  • node d blur.
  • node sun beams.
  • Fire2.
  • compositor-nodes-desintegrate-wipe-01.
  • distorted bw key.
  • flip scalerotate.

Vector Blur (blender/blender#120135)

Problem

GPU uses an implementation similar to EEVEE, while CPU uses a more compute-intensive operation.

Solution

Formalize GPU implementation and port it to CPU.

Affected Nodes

  • Vector Blur.

Failed Tests

  • node vector blur.

Variable Transformations (blender/blender#120314)

Problem

CPU allows the inputs of transform nodes to be variable, GPU does not.

Solution

Allow variable transformations for GPU.

Affected Nodes

  • Rotate.
  • Translate.
  • Scale.
  • Transform.

Failed Tests

  • Fire2.
Omar Emara added the Interest: Compositing, Module: VFX & Video, and Type: To Do labels 2024-02-21 09:55:35 +01:00
Sergey Sharybin added this to the Compositing project 2024-03-05 15:14:06 +01:00
Reference: blender/blender#118548