VSE: Speedup Subsampled 3x3 image filter #117125

Merged

Aras Pranckevicius merged 1 commit from aras_p/blender:vse-subsampling into main

2024-01-17 10:27:01 +01:00


Author: Aras Pranckevicius

SHA1: 724f0c2c9e

ImBuf: speed up Subsampled3x3 image filter


Conceptually, the Subsampling filter is a box filter: it sums up N
source image pixels, computes their average and outputs the result.
Critically, this must be done in premultiplied alpha space so that
colors from fully or mostly transparent regions do not "override"
opaque colors.
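
A minimal sketch of why the averaging space matters, not the code from
this commit. The `Color` type and both function names are hypothetical,
and alpha is assumed to be stored as straight alpha in [0, 1]:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical RGBA color with straight (non-premultiplied) alpha in [0, 1].
struct Color {
  float r, g, b, a;
};

// Naive box filter: averages every channel independently, so the color of a
// fully transparent pixel still contributes to the result.
Color average_straight(const std::vector<Color> &px)
{
  assert(!px.empty());
  Color sum{0.0f, 0.0f, 0.0f, 0.0f};
  for (const Color &c : px) {
    sum.r += c.r;
    sum.g += c.g;
    sum.b += c.b;
    sum.a += c.a;
  }
  const float inv = 1.0f / float(px.size());
  return {sum.r * inv, sum.g * inv, sum.b * inv, sum.a * inv};
}

// Box filter in premultiplied space: each color is weighted by its alpha
// before summing, then the result is converted back to straight alpha.
Color average_premultiplied(const std::vector<Color> &px)
{
  assert(!px.empty());
  Color sum{0.0f, 0.0f, 0.0f, 0.0f};
  for (const Color &c : px) {
    sum.r += c.r * c.a;
    sum.g += c.g * c.a;
    sum.b += c.b * c.a;
    sum.a += c.a;
  }
  if (sum.a <= 0.0f) {
    return {0.0f, 0.0f, 0.0f, 0.0f};
  }
  // Dividing by the summed alpha un-premultiplies the averaged color.
  return {sum.r / sum.a, sum.g / sum.a, sum.b / sum.a, sum.a / float(px.size())};
}
```

Averaging an opaque red pixel with a fully transparent green one gives
a half-transparent yellowish color with the naive version, while the
premultiplied version keeps pure red at half alpha.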

Previously, especially when operating on byte images, the code
achieved this by always working on byte values: it did "progressively
smaller" lerps into a byte color result, handling premultiplication
and storing the "straight" alpha result again for each sample
processed. This meant 3 divisions per sample! It also led to some
precision loss, since for all 9 samples the intermediate results were
stored only at byte precision.
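
The precision loss can be illustrated with a simplified single-channel
sketch. This is not the previous Blender code: it ignores alpha
entirely, and both function names are hypothetical. It only shows how
rounding a running average to a byte after every sample can drift away
from the exact average:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Running average with "progressively smaller" lerp weights (1/2, 1/3, ...),
// rounding the intermediate result to a byte after every sample.
uint8_t average_byte_steps(const std::vector<uint8_t> &samples)
{
  assert(!samples.empty());
  uint8_t result = samples[0];
  for (size_t i = 1; i < samples.size(); i++) {
    const float t = 1.0f / float(i + 1);
    const float mixed = float(result) + (float(samples[i]) - float(result)) * t;
    result = uint8_t(std::lround(mixed));
  }
  return result;
}

// Accumulate in float and round to a byte only once, at the very end.
uint8_t average_float_sum(const std::vector<uint8_t> &samples)
{
  assert(!samples.empty());
  float sum = 0.0f;
  for (const uint8_t s : samples) {
    sum += float(s);
  }
  return uint8_t(std::lround(sum / float(samples.size())));
}
```

For the 9 samples {10, 0, 0, 0, 0, 0, 0, 0, 0} the exact average is
10/9, which rounds to 1; the stepwise version ends at 2 because every
intermediate rounding step discards information.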

Reformulate that by simply accumulating the premultiplied color
as a float color. This gets rid of all divisions except in the last
step, when the float result is written back into a byte color.
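
A rough sketch of that reformulation for byte pixels, under stated
assumptions: `ByteColor` and `subsample_average` are hypothetical names,
input and output use straight alpha, and the real ImBuf code may be
structured quite differently:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit RGBA pixel with straight alpha.
struct ByteColor {
  uint8_t r, g, b, a;
};

// Average the samples by accumulating premultiplied color as floats.
// No division happens inside the loop.
ByteColor subsample_average(const std::vector<ByteColor> &samples)
{
  assert(!samples.empty());
  float sum_r = 0.0f, sum_g = 0.0f, sum_b = 0.0f, sum_a = 0.0f;
  for (const ByteColor &c : samples) {
    const float a = float(c.a);
    // Color times alpha: the premultiplied contribution of this sample.
    sum_r += float(c.r) * a;
    sum_g += float(c.g) * a;
    sum_b += float(c.b) * a;
    sum_a += a;
  }
  if (sum_a <= 0.0f) {
    return {0, 0, 0, 0};
  }
  // Clamp to the byte range, then round to the nearest integer.
  auto to_byte = [](float v) -> uint8_t {
    return uint8_t(std::lround(std::fmin(std::fmax(v, 0.0f), 255.0f)));
  };
  // Final step only: un-premultiply the color and average the alpha.
  const float inv_a = 1.0f / sum_a;
  return {to_byte(sum_r * inv_a),
          to_byte(sum_g * inv_a),
          to_byte(sum_b * inv_a),
          to_byte(sum_a / float(samples.size()))};
}
```

With 3 opaque red samples and 6 fully transparent green ones, this
returns pure red with alpha 85 (one third of 255); the transparent
green never leaks into the color.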

Processing a 4K UHD destination image with the Subsampling 3x3
filter:
- Windows/VS2022/Ryzen5950X: 52.7ms -> 28.3ms
- Mac/clang15/M1Max: 54.4ms -> 43.7ms

The unit test results differ very slightly, because the new results
are more accurate (as noted above, the previous code lost some
precision).

Date: 2024-01-16 22:18:22 +02:00
