VSE: Speed up Subsampled 3x3 image filter #117125

Merged
Aras Pranckevicius merged 1 commits from aras_p/blender:vse-subsampling into main 2024-01-17 10:27:01 +01:00

1 Commits

Author SHA1 Message Date
Aras Pranckevicius 724f0c2c9e ImBuf: speed up Subsampled3x3 image filter
Conceptually, the Subsampling filter is a box filter: it sums N source
image pixels, computes their average, and outputs the result.
Critically, this must be done in premultiplied alpha space so that
colors from fully or mostly transparent regions do not "override"
opaque colors.
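
To illustrate why premultiplied space matters, here is a minimal
sketch that averages two pixels both ways. The `Rgba` struct and both
function names are hypothetical, made up for this example; they are
not Blender's API.

```cpp
// Hypothetical helper type for illustration only; not Blender's actual API.
struct Rgba {
  float r, g, b, a;  // Straight (non-premultiplied) color, components in [0, 1].
};

// Naive box filter: averages every channel independently in straight-alpha space.
Rgba average_straight(const Rgba &p, const Rgba &q)
{
  return {(p.r + q.r) * 0.5f, (p.g + q.g) * 0.5f, (p.b + q.b) * 0.5f, (p.a + q.a) * 0.5f};
}

// Box filter in premultiplied space: weight each color by its alpha,
// average, then convert the result back to straight alpha.
Rgba average_premultiplied(const Rgba &p, const Rgba &q)
{
  const float a = (p.a + q.a) * 0.5f;
  if (a == 0.0f) {
    return {0.0f, 0.0f, 0.0f, 0.0f};  // Fully transparent: color is irrelevant.
  }
  const float r = (p.r * p.a + q.r * q.a) * 0.5f;
  const float g = (p.g * p.a + q.g * q.a) * 0.5f;
  const float b = (p.b * p.a + q.b * q.a) * 0.5f;
  return {r / a, g / a, b / a, a};
}
```

Averaging opaque red with fully transparent green in straight space
gives a half-transparent mix of red and green, while the premultiplied
version stays pure red at half opacity, which is the expected result.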

Previously, especially when operating on byte images, the code
achieved this by always working on byte values: it did "progressively
smaller" lerps into the byte color result, premultiplying each sample
and then storing the result back as "straight" alpha. This meant that
each sample involved 3 divisions! It also led to some precision loss,
since across all 9 samples every intermediate result was stored only
at byte precision.

Reformulate that by simply accumulating the premultiplied color
as a float color. This gets rid of all divisions, except the last
step when said float needs to be written back into a byte color.
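
As a rough sketch of the idea (not the code from this commit), the
accumulation for byte images could look like the following. The
function name, the flat `samples` layout, and the rounding are
assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical function; a sketch of the technique, not Blender's code.
// `samples` holds `count` (> 0) straight-alpha RGBA byte pixels, 4 bytes each.
void average_samples_to_byte(const uint8_t *samples, int count, uint8_t r_out[4])
{
  // Accumulate premultiplied color (color * alpha) and alpha as floats.
  // No division happens inside the loop.
  float accum[4] = {0.0f, 0.0f, 0.0f, 0.0f};
  for (int i = 0; i < count; i++) {
    const uint8_t *px = samples + i * 4;
    const float alpha = float(px[3]);
    accum[0] += float(px[0]) * alpha;
    accum[1] += float(px[1]) * alpha;
    accum[2] += float(px[2]) * alpha;
    accum[3] += alpha;
  }

  if (accum[3] <= 0.0f) {
    // All samples fully transparent: output transparent black.
    r_out[0] = r_out[1] = r_out[2] = r_out[3] = 0;
    return;
  }

  // Final step only: convert back to a straight-alpha byte color.
  // sum(color * alpha) / sum(alpha) is already in the 0..255 range.
  const float inv_alpha = 1.0f / accum[3];
  for (int c = 0; c < 3; c++) {
    r_out[c] = uint8_t(std::min(accum[c] * inv_alpha + 0.5f, 255.0f));
  }
  r_out[3] = uint8_t(std::min(accum[3] / float(count) + 0.5f, 255.0f));
}
```

With 9 byte samples the largest accumulated value is 9 * 255 * 255,
which a 32-bit float represents exactly, so the only rounding happens
when the result is written back to bytes.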

Processing a 4K UHD destination image with the Subsampling 3x3
filter:
- Windows/VS2022/Ryzen5950X: 52.7ms -> 28.3ms
- Mac/clang15/M1Max: 54.4ms -> 43.7ms

The unit test results differ slightly, because the output is now more
accurate (as noted above, the previous code lost some precision).
2024-01-16 22:18:22 +02:00