VSE: make Glow effect 6x-10x faster #115818

Merged
Aras Pranckevicius merged 5 commits from aras_p/blender:vse-glow-opt into main 2023-12-06 19:39:51 +01:00

Glow effect was doing the correct thing algorithmically (separable gaussian blur), but it was 1) completely single-threaded, and 2) did operations in several passes over the source images, instead of doing them in one go. This PR:

  • Adds multi-threading to Glow effect.
  • Combines some operations, e.g. instead of IMB_buffer_float_from_byte followed by IMB_buffer_float_premultiply, do IMB_colormanagement_transform_from_byte_threaded which achieves the same, but more efficiently.
  • Simplifies the code: removing separate loops around image boundaries is both less code and slightly faster; use float4 vector type for more compact code; use Array classes instead of manual memory allocation, etc.
  • Removes IMB_buffer_float_unpremultiply and IMB_buffer_float_premultiply since they are no longer used by anything whatsoever.
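As a rough illustration of the "combine operations" point, here is a minimal sketch that is not Blender's code. It assumes plain 8-bit RGBA tuples and ignores color management entirely, and the function names are hypothetical. It compares a two-pass byte-to-float conversion followed by premultiplication with a single fused pass:

```python
def byte_to_float(pixels):
    """Pass 1 of the two-pass variant: 8-bit RGBA -> float RGBA in [0, 1]."""
    return [tuple(c / 255.0 for c in px) for px in pixels]


def premultiply(pixels):
    """Pass 2 of the two-pass variant: multiply RGB by alpha."""
    return [(r * a, g * a, b * a, a) for r, g, b, a in pixels]


def byte_to_float_premultiplied(pixels):
    """Fused variant: convert and premultiply in one traversal of the image."""
    out = []
    for r, g, b, a in pixels:
        alpha = a / 255.0
        out.append((r / 255.0 * alpha, g / 255.0 * alpha, b / 255.0 * alpha, alpha))
    return out


src = [(255, 128, 0, 128), (10, 20, 30, 255), (0, 0, 0, 0)]
two_pass = premultiply(byte_to_float(src))
fused = byte_to_float_premultiplied(src)
```

Both variants produce the same values; the fused one reads and writes each pixel once instead of twice, which is where the memory-traffic saving would come from on a large image.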

Applying Glow to 4K UHD sequencer output, on Windows Ryzen 5950X:

  • Blur distance 4: 935ms -> 109ms (8.5x faster)
  • Blur distance 20: 3526ms -> 336ms (10.5x faster)

Same on Mac M1 Max:

  • Blur distance 4: 732ms -> 126ms (5.8x faster)
  • Blur distance 20: 3047ms -> 528ms (5.7x faster)

While doing all this, I noticed that Glow could also get fixes/improvements perhaps, but these are best discussed as a separate PR (this one is supposed to not change anything, just speed it up):

  • The gaussian kernel application seems to be off by one pixel, i.e. it's not actually centered.
  • The final glow result is clamped to max 1.0, even when applying the effect in float/HDR mode.
  • The "luminosity detection" for which parts should glow is a simple "R+G+B", instead of anything resembling actual luminosity calculations.
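To make the last point concrete, here is a small sketch comparing a plain channel sum with a weighted luminance. The Rec. 709 weights are an assumption for illustration (the PR does not say which formula a fix would use), and the inputs are assumed to be linear RGB:

```python
# Rec. 709 luminance weights for linear RGB (an assumed choice, not from the PR).
REC709_WEIGHTS = (0.2126, 0.7152, 0.0722)


def channel_sum(r, g, b):
    """The simple "R+G+B" measure described above."""
    return r + g + b


def rec709_luminance(r, g, b):
    """Weighted luminance: green contributes most, blue least."""
    wr, wg, wb = REC709_WEIGHTS
    return wr * r + wg * g + wb * b


green = (0.0, 1.0, 0.0)
blue = (0.0, 0.0, 1.0)
same_by_sum = channel_sum(*green) == channel_sum(*blue)
green_lum = rec709_luminance(*green)
blue_lum = rec709_luminance(*blue)
```

With the channel sum, pure green and pure blue are treated as equally bright; with weighted luminance, green is roughly ten times brighter than blue, so a threshold would select different pixels.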
Aras Pranckevicius added 3 commits 2023-12-05 21:22:33 +01:00
Applying glow at 4K UHD resolution, on Windows Ryzen 5950X:
- distance 4: 935ms -> 136ms
- distance 20: 3524ms -> 365ms
Instead of doing preparation/finishing operations in separate passes
over the image, do a combined operation in one go. This also makes
IMB_buffer_float_unpremultiply and IMB_buffer_float_premultiply not
be used by anything, so remove.

Applying glow at 4K UHD resolution, on Windows Ryzen 5950X:
- distance 4: 136ms -> 122ms
- distance 20: 365ms -> 346ms
No performance difference observed
Aras Pranckevicius added 1 commit 2023-12-06 08:50:01 +01:00
VSE: simplify and speedup Glow some more
b695329bb9
Instead of applying blur kernel to "left + right side, followed by
middle", do much simpler thing and just apply it normally, taking care
of boundary conditions where kernel would step outside the image.

Also instead of doing "add glow to original image" in a separate pass
over the whole image, just add source when writing the final pixel.

Less code, and faster.

Applying glow at 4K UHD resolution, on Windows Ryzen 5950X:
- distance 4: 122ms -> 109ms
- distance 20: 346ms -> 336ms
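The two simplifications in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the actual Blender code: it works on one grayscale row instead of `float4` pixels, it handles the boundary by skipping taps that fall outside the image (one possible treatment), and the function name is hypothetical.

```python
def blur_row_add_source(blur_input, source, kernel):
    """Apply a 1D kernel to one row and add the source pixel in the same write.

    Taps that would fall outside the row are skipped, so no separate
    left/right/middle loops are needed.
    """
    half = len(kernel) // 2
    width = len(blur_input)
    out = []
    for x in range(width):
        lo = max(x - half, 0)          # first in-bounds tap
        hi = min(x + half, width - 1)  # last in-bounds tap
        acc = 0.0
        for nx in range(lo, hi + 1):
            acc += kernel[nx - x + half] * blur_input[nx]
        # "Add glow to original image" happens here, not in a separate pass.
        out.append(acc + source[x])
    return out


row = blur_row_add_source([0.0, 4.0, 0.0], [1.0, 1.0, 1.0], [0.25, 0.5, 0.25])
```

A single loop with clamped tap ranges replaces the three special-cased loops, and folding the final addition into the write saves one more traversal of the image.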
Aras Pranckevicius changed title from WIP: VSE: speedup Glow effect to WIP: VSE: make Glow effect 10x faster 2023-12-06 08:53:39 +01:00
Author
Member

@blender-bot build
Aras Pranckevicius added this to the Video Sequencer project 2023-12-06 09:01:04 +01:00
Aras Pranckevicius changed title from WIP: VSE: make Glow effect 10x faster to VSE: make Glow effect 10x faster 2023-12-06 09:33:32 +01:00
Aras Pranckevicius requested review from Richard Antalik 2023-12-06 09:33:49 +01:00
Aras Pranckevicius added 1 commit 2023-12-06 11:05:32 +01:00
Aras Pranckevicius changed title from VSE: make Glow effect 10x faster to VSE: make Glow effect 6x-10x faster 2023-12-06 11:10:24 +01:00


This code looks a lot like the Blur Attribute node (just with a specific neighbor-set-providing function).
Author
Member

> This code looks a lot like the Blur Attribute node (just with a specific neighbor-set-providing function).

Hmm, not sure I see an immediate similarity. From what I can tell, the Blur Attribute node works by repeatedly smoothing the data, which, given a high enough iteration count, does indeed approach a gaussian distribution.
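A small sketch of why repeated smoothing tends toward a gaussian. This is not the Blur Attribute node's code; it is a 1D stand-in on a regular grid, with an assumed `[1/4, 1/2, 1/4]` smoothing step. Applying the step n times is equivalent to one convolution with a binomial kernel, and binomial weights approach a gaussian shape as n grows:

```python
def convolve(a, b):
    """Full discrete convolution of two 1D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def repeated_smoothing_kernel(iterations):
    """Effective kernel after applying the smoothing step `iterations` times."""
    step = [0.25, 0.5, 0.25]
    kernel = [1.0]
    for _ in range(iterations):
        kernel = convolve(kernel, step)
    return kernel


two_iterations = repeated_smoothing_kernel(2)
```

After two iterations the effective kernel is `[1, 4, 6, 4, 1] / 16`, a row of Pascal's triangle; more iterations give wider, increasingly bell-shaped kernels.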

This code (and several other places inside Blender that apply gaussian kernel filtering to images) implements gaussian blur more directly, as two 1D convolution passes (this works because the gaussian kernel is separable).
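To illustrate separability, here is a minimal sketch (not the Blender implementation; it assumes a grayscale image stored as nested lists and treats pixels outside the image as zero). A horizontal 1D pass followed by a vertical 1D pass gives the same result as a direct 2D convolution with the outer product of the kernel with itself:

```python
def blur_rows(image, kernel):
    """Convolve every row with a 1D kernel; out-of-image taps count as zero."""
    half = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for k, w in enumerate(kernel):
                nx = x + k - half
                if 0 <= nx < width:
                    out[y][x] += w * image[y][nx]
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def blur_separable(image, kernel):
    """Two 1D passes: rows first, then columns (via transpose)."""
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


def blur_direct_2d(image, kernel):
    """Reference: one 2D pass using the outer product kernel[ky] * kernel[kx]."""
    half = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for ky, wy in enumerate(kernel):
                for kx, wx in enumerate(kernel):
                    ny, nx = y + ky - half, x + kx - half
                    if 0 <= ny < height and 0 <= nx < width:
                        out[y][x] += wy * wx * image[ny][nx]
    return out


impulse = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
separable_result = blur_separable(impulse, [0.25, 0.5, 0.25])
```

For a kernel with k taps, the separable form needs about 2k multiply-adds per pixel, while the direct 2D form needs k squared, which is why the two-pass approach is the usual choice.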

I could have tried to share more code between this and e.g. the Gaussian Blur VSE effect, which is even in the same file, but as mentioned in the PR description, that would have been a (slight) behavior change, which I wanted to avoid.

Richard Antalik approved these changes 2023-12-06 18:16:47 +01:00
Richard Antalik left a comment
Member

Just a style nitpick: comments should end with a full stop.

As for the other improvements mentioned, I would welcome them. I am not sure how a proper luminosity calculation would affect the output. If the change is large, there could be a concern about existing files: people use Glow for text outlines, for example. If the change is minor, the question is whether it is worth fixing...
Aras Pranckevicius merged commit fc64f48682 into main 2023-12-06 19:39:51 +01:00
Aras Pranckevicius deleted branch vse-glow-opt 2023-12-06 19:39:53 +01:00
Reference: blender/blender#115818