IMB: Speedups, fixes and cleanups to various image scaling functions #126390

Merged
Aras Pranckevicius merged 22 commits from aras_p/blender:imb_scaling_cleanup into main 2024-08-19 16:50:50 +02:00

Prompted by comments at #126234

Problem

There are multiple ways to scale images in Blender codebase:

1. IMB_transform is very general (can also do rotations, not only scales), and since 4.1 (#116628) it probably works "correctly". Supports Nearest, Bilinear, Box, Cubic Mitchell and Cubic BSpline filters. Internally multi-threaded, and uses some SIMD. However, destination image must have 4 channels.

2. IMB_scalefastImBuf does Nearest filtering.

  • Single threaded.
  • When enlarging, loses half of outermost pixels. When a 12x6 pixel image is scaled up, it produces this:

    instead of this:

3. IMB_scaleImBuf does Box (scaling down) or Bilinear (scaling up) filtering.

  • Single threaded.
  • When enlarging, loses half of outermost pixels. Similar to above point, it produces this:

    instead of this:
  • Produces garbage for float images with non-4 channels (possibly does out-of-bounds source image read too).

4. IMB_scaleImBuf_threaded does Bilinear filtering.

  • When enlarging, shifts image by half a pixel.
  • When scaling down exactly by 2x2, does no filtering at all. This image:

    when scaled down by 2x2, produces this:

    instead of this:

None of these behavior differences of IMB_scale* functions are documented, of course :)

This PR

  • API: merged IMB_scalefastImBuf, IMB_scaleImBuf, IMB_scaleImBuf_threaded into one function IMB_scale with enum IMBScaleFilter { Nearest, Bilinear, Box } and bool threaded.
  • Performance:
    • Box filtering (née IMB_scaleImBuf) can be multi-threaded now
    • Nearest filtering (née IMB_scalefastImBuf) can be multi-threaded now. Also fixes performance regression
      on float images caused by fix in #126234
    • Bilinear filtering (née IMB_scaleImBuf_threaded) is several times faster now
  • Correctness:
    • Nearest and Box filtering: no longer loses half of edge pixels when scaling up
    • Box: fixed garbage results (and possible out of bounds reads) for non-4 channel float images
    • Bilinear: no longer shifts image when scaling up
    • Bilinear: properly filters when scaling down by 2x2
  • Test coverage:
    • Add gtest coverage for various IMB_scale modes
    • Add a IMB_performance_test performance test, ran manually (similar to existing BLI_map_performance_test)

All of this is with shorter code too! scaling.cc is 890 lines shorter than before. But, I have added 440 lines of new test coverage code.

Performance

Performance test checks this sequence of steps (the dimensions used are quite arbitrary, but picked to not be "neat" on purpose):

  1. Input image is 5123x4091
  2. Scale that up to 6096x8713
  3. Scale that down back to 5123x4091
  4. Scale that down to 1075x2740
  5. Scale that back up to 6096x8713

Performance results on Mac M1 Max (Xcode 15, RelWithDebInfo config), times in ms (smaller is better):

Test IMB_transform ref Before this PR This PR
uchar4 Nearest 108 89
uchar4 Nearest Threaded 83 22
uchar4 Bilinear 923
uchar4 Bilinear Threaded 157 1017 154
uchar4 Box 1445 1431
uchar4 Box Threaded 206 254
float4 Nearest 474 258
float4 Nearest Threaded 133 99
float4 Bilinear 829
float4 Bilinear Threaded 165 943 157
float4 Box 1403 1477
float4 Box Threaded 176 415

Current usages of IMB scale within Blender codebase

Would be nice to review whether all current usages of IMB scaling are using the correct filtering modes and/or threading flags. Here's a list:

  • All of those are single threaded now, should they enable threaded scaling?
    • imbuf_py_api.cc py_imbuf_resize
    • render_preview.cc icon_copy_rect
    • proxy.cc seq_proxy_build_frame
    • image_ops.cc image_scale_exec
    • movieclip.cc get_undistorted_ibuf
    • tracking_util.cc accessor_get_ibuf
    • thumbs.cc thumb_create_ex
    • util_gpu.cc imb_gpu_get_data
    • freestyle Canvas.cpp loadMap
    • interface_region_tooltip.cc ui_tooltip_from_clip
    • wm_splash_screen.cc wm_block_splash_image
    • wm_files.cc blend_file_thumb_from_screenshot (first usage), blend_file_thumb_from_camera (both usages)
    • image.cc BKE_image_scale -- note this is only exposed as RNA function, and the only other place is USDZ export
    • image.cc BKE_image_preview -- the only usage is from within ui_tooltip_from_image
  • Questions on whether current filtering mode is "correct":
    • interface_region_tooltip.cc ui_tooltip_from_clip: uses fill Box filter, perhaps overkill
    • wm_files.cc blend_file_thumb_from_screenshot (second usage): this is always 2x2 downsample, using Box is overkill
    • imbuf_py_api.cc py_imbuf_resize Python argument is named "BILINEAR" but what is actually does is Box
    • movieclip.cc get_undistorted_ibuf: should it actually use Box?

Would be nice to do someday:

  • Do Box filtering in one pass
  • Provide IMB_scale function that does not modify source, but scales into provided destination image. Today quite some usages of the existing function explicitly duplicate some source and call IMB_scale on that. Would save both hassle and memory allocation/copy.
  • Split up whole ImBuf module into smaller parts. Today IMB_performance_test has to link a lot (e.g. it links in USD?!) because ImBuf library itself depends on many many many things. Perhaps a separate library for just image I/O stuff would make sense.
Prompted by comments at #126234 ## Problem There are multiple ways to scale images in Blender codebase: **1.** `IMB_transform` is very general (can also do rotations, not only scales), and since 4.1 (#116628) it _probably_ works "correctly". Supports Nearest, Bilinear, Box, Cubic Mitchell and Cubic BSpline filters. Internally multi-threaded, and uses some SIMD. However, destination image **must have 4 channels**. **2.** `IMB_scalefastImBuf` does Nearest filtering. - Single threaded. - When enlarging, loses half of outermost pixels. When a 12x6 pixel image is scaled up, it produces this: ![](/attachments/77d1076e-8b82-4723-a9cc-a45f1070be73) instead of this: ![](/attachments/324a90a1-8a09-40e0-87a3-e6ccbe3de83c) **3.** `IMB_scaleImBuf` does Box (scaling down) or Bilinear (scaling up) filtering. - Single threaded. - When enlarging, loses half of outermost pixels. Similar to above point, it produces this: ![](/attachments/45ae608b-109f-4960-bbcb-b380adb0c43b) instead of this: ![](/attachments/023ef1eb-6bc7-43ff-b27d-ceaf58d4437f) - Produces garbage for float images with non-4 channels (**possibly does out-of-bounds source image read too**). ![](/attachments/40d70bca-63b3-4d3b-ae7a-103de682e601) **4.** `IMB_scaleImBuf_threaded` does Bilinear filtering. - When enlarging, shifts image by half a pixel. ![](/attachments/995028d3-e5c2-4c7b-b40c-37805692386d) - When scaling down exactly by 2x2, does no filtering at all. This image: ![](/attachments/5258e89f-2f29-4a4f-9a02-9f828ffd8786) when scaled down by 2x2, produces this: ![](/attachments/cea9259d-c495-4d97-a9f2-509a245e0e98) instead of this: ![](/attachments/091a9d48-7e6c-432c-953c-4611bfc1e51e) None of these behavior differences of `IMB_scale*` functions are documented, of course :) ## This PR - API: merged `IMB_scalefastImBuf`, `IMB_scaleImBuf`, `IMB_scaleImBuf_threaded` into one function `IMB_scale` with enum `IMBScaleFilter { Nearest, Bilinear, Box }` and bool `threaded`. - Performance: - Box filtering (née `IMB_scaleImBuf`) can be multi-threaded now - Nearest filtering (née `IMB_scalefastImBuf`) can be multi-threaded now. Also fixes performance regression on float images caused by fix in #126234 - Bilinear filtering (née `IMB_scaleImBuf_threaded`) is several times faster now - Correctness: - Nearest and Box filtering: no longer loses half of edge pixels when scaling up - Box: fixed garbage results (and possible out of bounds reads) for non-4 channel float images - Bilinear: no longer shifts image when scaling up - Bilinear: properly filters when scaling down by 2x2 - Test coverage: - Add gtest coverage for various `IMB_scale` modes - Add a `IMB_performance_test` performance test, ran manually (similar to existing `BLI_map_performance_test`) All of this is with shorter code too! `scaling.cc` is 890 lines shorter than before. But, I have added 440 lines of new test coverage code. ### Performance Performance test checks this sequence of steps (the dimensions used are quite arbitrary, but picked to not be "neat" on purpose): 1. Input image is 5123x4091 2. Scale that up to 6096x8713 3. Scale that down back to 5123x4091 4. Scale that down to 1075x2740 5. Scale that back up to 6096x8713 Performance results on Mac M1 Max (Xcode 15, RelWithDebInfo config), times in ms (smaller is better): | Test | IMB_transform ref | Before this PR | This PR | |-----|-----:|-----:|----:| |uchar4 Nearest | | 108 | 89 | |uchar4 Nearest Threaded | 83 | | 22 | |uchar4 Bilinear | | | 923 | |uchar4 Bilinear Threaded | 157 | 1017 | 154 | |uchar4 Box | | 1445 | 1431 | |uchar4 Box Threaded | 206 | | 254 | |float4 Nearest | | 474 | 258 | |float4 Nearest Threaded | 133 | | 99 | |float4 Bilinear | | | 829 | |float4 Bilinear Threaded | 165 | 943 | 157 | |float4 Box | | 1403 | 1477 | |float4 Box Threaded | 176 | | 415 | ### Current usages of IMB scale within Blender codebase Would be nice to review whether all current usages of IMB scaling are using the correct filtering modes and/or threading flags. Here's a list: - All of those are single threaded now, should they enable threaded scaling? - imbuf_py_api.cc py_imbuf_resize - render_preview.cc icon_copy_rect - proxy.cc seq_proxy_build_frame - image_ops.cc image_scale_exec - movieclip.cc get_undistorted_ibuf - tracking_util.cc accessor_get_ibuf - thumbs.cc thumb_create_ex - util_gpu.cc imb_gpu_get_data - freestyle Canvas.cpp loadMap - interface_region_tooltip.cc ui_tooltip_from_clip - wm_splash_screen.cc wm_block_splash_image - wm_files.cc blend_file_thumb_from_screenshot (first usage), blend_file_thumb_from_camera (both usages) - image.cc BKE_image_scale -- note this is only exposed as RNA function, and the only other place is USDZ export - image.cc BKE_image_preview -- the only usage is from within ui_tooltip_from_image - Questions on whether current filtering mode is "correct": - interface_region_tooltip.cc ui_tooltip_from_clip: uses fill Box filter, perhaps overkill - wm_files.cc blend_file_thumb_from_screenshot (second usage): this is always 2x2 downsample, using Box is overkill - imbuf_py_api.cc py_imbuf_resize Python argument is named "BILINEAR" but what is actually does is Box - movieclip.cc get_undistorted_ibuf: should it actually use Box? ### Would be nice to do someday: - Do Box filtering in one pass - Provide `IMB_scale` function that does not modify source, but scales into provided destination image. Today quite some usages of the existing function explicitly duplicate some source and call `IMB_scale` on that. Would save both hassle and memory allocation/copy. - Split up whole ImBuf module into smaller parts. Today `IMB_performance_test` has to link _a lot_ (e.g. it links in USD?!) because ImBuf library itself depends on _many many many things_. Perhaps a separate library for just image I/O stuff would make sense.
Aras Pranckevicius changed title from WIP: IMB: cleanups and fixes to scaling to WIP: IMB: cleanups and fixes to image scaling 2024-08-18 12:07:44 +02:00
Aras Pranckevicius force-pushed imb_scaling_cleanup from f654cd5575 to 0459c834cf 2024-08-19 11:49:27 +02:00 Compare
Aras Pranckevicius force-pushed imb_scaling_cleanup from 84bc0c9188 to ab255a3c1e 2024-08-19 14:10:21 +02:00 Compare
Author
Member

@blender-bot build

@blender-bot build
Aras Pranckevicius added 1 commit 2024-08-19 14:23:33 +02:00
Aras Pranckevicius changed title from WIP: IMB: cleanups and fixes to image scaling to IMB: Speedups, fixes and cleanups to various image scaling functions 2024-08-19 14:53:17 +02:00
Sergey Sharybin was assigned by Aras Pranckevicius 2024-08-19 14:56:37 +02:00
Sergey Sharybin approved these changes 2024-08-19 15:45:58 +02:00
Aras Pranckevicius merged commit 6d93bf6b44 into main 2024-08-19 16:50:50 +02:00
Aras Pranckevicius deleted branch imb_scaling_cleanup 2024-08-19 16:50:54 +02:00
Sign in to join this conversation.
No reviewers
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset System
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Code Documentation
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Viewport & EEVEE
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Asset Browser Project
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Module
Viewport & EEVEE
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Severity
High
Severity
Low
Severity
Normal
Severity
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: blender/blender#126390
No description provided.