ImBuf: optimize IMB_transform #115653

Merged
Aras Pranckevicius merged 15 commits from aras_p/blender:imb_transform_opt into main 2023-12-14 15:10:41 +01:00

15 Commits

Author SHA1 Message Date
Aras Pranckevicius faa6e268f1 Fixing Mac x64 build
2023-12-02 14:41:09 +02:00
Aras Pranckevicius 0bc8f85648 Fixing Linux build
2023-12-02 13:07:01 +02:00
Aras Pranckevicius fbd9716206 No need to do floor() hard ways on NEON (or SSE4, which is not on yet but someday might be)
2023-12-02 13:05:12 +02:00
Aras Pranckevicius 8ffbfa061c Cleanup
2023-12-02 12:46:56 +02:00
Aras Pranckevicius 8c8b4b30b9 Merge remote-tracking branch 'origin/main' into imb_transform_opt
2023-12-02 12:39:56 +02:00
Aras Pranckevicius 6fc295d97f ImBuf: make BLI_bilinear_interpolation_char fully SSE and branchless
VSE, 4K resolution, two transformed image strips with bilinear filter,
Windows Ryzen 5950X: IMB_transform 13.4ms -> 11.2ms
2023-12-02 12:39:31 +02:00
Aras Pranckevicius 608fdcf337 Trying to fix Linux build
2023-12-01 18:48:35 +02:00
Aras Pranckevicius ce9860df3a Merge branch 'main' into imb_transform_opt
2023-12-01 17:50:26 +02:00
Aras Pranckevicius 183f585f08 ImBuf: add unit tests for BLI_bilinear_interpolation_char, fix rounding, do not require SSE4
2023-12-01 17:43:43 +02:00
Aras Pranckevicius fe9db4d1e4 Format code
2023-12-01 13:18:06 +02:00
Aras Pranckevicius 695da0c0f6 ImBuf: use SSE in bilinear_interpolation functions
VSE, 4K resolution, two transformed image strips with bilinear filter,
Windows Ryzen 5950X: IMB_transform 16.9ms -> 13.4ms
2023-12-01 12:17:43 +02:00
Aras Pranckevicius 8e46f1b6b2 ImBuf: don't use virtual calls to do UV wrapping
Inner loop of IMB_transform was using virtual functions to do UV
wrapping. Simplify all of that from 3 classes to one bool.

VSE, 4K resolution, two transformed image strips with bilinear filter,
Windows Ryzen 5950X: IMB_transform 17.3ms -> 16.9ms
2023-12-01 11:28:17 +02:00
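
The idea in the commit above (replacing a virtual wrapping interface with one flag) can be sketched roughly as follows. This is an illustration only and not the code from the commit: `wrap_coord`, `resolve_coord`, and the `wrap_repeat` parameter are hypothetical names, and the exact wrapping rule used by IMB_transform is assumed here to be a plain repeat.

```cpp
#include <cassert>
#include <cmath>

/* Hypothetical helper: wrap a coordinate into [0, size) for repeat-style sampling. */
static inline float wrap_coord(float u, float size)
{
  assert(size > 0.0f);
  return u - std::floor(u / size) * size;
}

/* A single bool stands in for a small class hierarchy with a virtual wrap method.
 * The flag does not change inside the pixel loop, so the branch is cheap and the
 * helper can be inlined, which an indirect virtual call generally cannot. */
static inline float resolve_coord(float u, float size, bool wrap_repeat)
{
  return wrap_repeat ? wrap_coord(u, size) : u;
}
```

With a 4-pixel-wide source, `resolve_coord(5.0f, 4.0f, true)` wraps to `1.0f`, while passing `false` returns the coordinate unchanged.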
Aras Pranckevicius b0ea53dcff ImBuf: speedup bilinear_interpolation functions
Can do with 2 floor calls instead of 4 floor + 2 ceil calls.

VSE, 4K resolution, two transformed image strips with bilinear filter,
Windows Ryzen 5950X: IMB_transform 24.7ms -> 17.3ms
2023-12-01 11:07:33 +02:00
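
A minimal sketch of the "2 floor calls" idea from the commit above, assuming a single-channel float image and clamp-to-edge addressing. `sample_bilinear` is a hypothetical function written for illustration; it is not Blender's BLI_bilinear_interpolation code, which works on multi-channel pixels and has its own border handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

/* Hypothetical single-channel bilinear sampler with clamp-to-edge addressing. */
static float sample_bilinear(const float *pixels, int width, int height, float u, float v)
{
  assert(width > 0 && height > 0);

  /* The only two floor calls: one per axis. */
  const float uf = std::floor(u);
  const float vf = std::floor(v);

  /* Fractional weights reuse the floored values instead of calling floor again. */
  const float a = u - uf;
  const float b = v - vf;

  /* The second sample is floor + 1 rather than ceil. When u is an exact integer
   * this differs from ceil(u), but the weight `a` is then 0, so the result is the same. */
  const int x1 = std::clamp(int(uf), 0, width - 1);
  const int y1 = std::clamp(int(vf), 0, height - 1);
  const int x2 = std::clamp(int(uf) + 1, 0, width - 1);
  const int y2 = std::clamp(int(vf) + 1, 0, height - 1);

  const float top = pixels[y1 * width + x1] * (1.0f - a) + pixels[y1 * width + x2] * a;
  const float bottom = pixels[y2 * width + x1] * (1.0f - a) + pixels[y2 * width + x2] * a;
  return top * (1.0f - b) + bottom * b;
}
```

For a 2x2 image holding `{0, 1, 2, 3}` in row-major order, sampling at `(0.5, 0.5)` averages all four pixels and yields `1.5`.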
Aras Pranckevicius c3f696e726 ImBuf: simplify bilinear/bicubic interpolation functions to what is actually used
- BLI_bilinear_interpolation_wrap_char is not used at all,
- BLI_bicubic_interpolation_char / BLI_bilinear_interpolation_char
  are always called with 4 components
2023-12-01 10:54:47 +02:00
Aras Pranckevicius 1253411a91 ImBuf: don't use virtual calls in inner IMB_transform loop
The CropSource & NoDiscard functors were virtual classes for no good
reason really.

VSE, 4K resolution, two transformed image strips with bilinear filter,
Windows Ryzen 5950X: IMB_transform 26.3ms -> 24.7ms
2023-12-01 10:14:15 +02:00