Speed up the "apply zebra stripes" image loop by multi-threading it.
For non-float images, avoid an extra image copy that was not doing
anything useful.
Timings at 4K UHD resolution on Windows, Ryzen 5950X:
- LDR: whole `sequencer_get_scope` 16.4ms -> 5.3ms, just `draw_zebra`
part: 7.5ms -> 3.3ms
- Float image: whole `sequencer_get_scope` 126.6ms -> 114.1ms, just
`draw_zebra` part: 22.4ms -> 7.4ms. The whole scope is still expensive
because of the color management work it does.
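
For illustration only, here is a minimal standalone sketch of the kind of row-parallel zebra loop described above. It is not the code from this change: the names `zebra_rows` and `apply_zebra_threaded`, the stripe pattern test `(x + y) & 8`, the "invert the color" overlay, and the use of plain `std::thread` are all assumptions made for the example. It handles only 8-bit RGBA pixels and modifies them in place, so no extra image copy is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: apply stripes to rows [y_begin, y_end) of an
// 8-bit RGBA image, in place. Each thread gets a disjoint row range,
// so no synchronization is needed between workers.
static void zebra_rows(std::uint8_t *pixels, int width, int y_begin, int y_end,
                       std::uint8_t limit)
{
  for (int y = y_begin; y < y_end; y++) {
    std::uint8_t *p = pixels + std::size_t(y) * std::size_t(width) * 4;
    for (int x = 0; x < width; x++, p += 4) {
      // A pixel is "over-exposed" if any color channel reaches the limit.
      const bool over = p[0] >= limit || p[1] >= limit || p[2] >= limit;
      // Assumed stripe pattern: diagonal bands 8 pixels wide.
      if (over && ((x + y) & 8) != 0) {
        p[0] = std::uint8_t(255 - p[0]);
        p[1] = std::uint8_t(255 - p[1]);
        p[2] = std::uint8_t(255 - p[2]);
        // Alpha (p[3]) is left untouched.
      }
    }
  }
}

// Hypothetical entry point: split the image into contiguous row chunks
// and process each chunk on its own thread.
void apply_zebra_threaded(std::vector<std::uint8_t> &pixels, int width, int height,
                          std::uint8_t limit)
{
  assert(pixels.size() == std::size_t(width) * std::size_t(height) * 4);
  // hardware_concurrency() may return 0 when the count is unknown.
  const int hw = int(std::thread::hardware_concurrency());
  const int num_threads = std::max(1, std::min(hw > 0 ? hw : 1, height));
  const int rows_per_thread = (height + num_threads - 1) / num_threads;

  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; t++) {
    const int y_begin = t * rows_per_thread;
    const int y_end = std::min(height, y_begin + rows_per_thread);
    if (y_begin >= y_end) {
      break;
    }
    workers.emplace_back(zebra_rows, pixels.data(), width, y_begin, y_end, limit);
  }
  for (std::thread &worker : workers) {
    worker.join();
  }
}
```

Splitting by rows keeps each worker on a contiguous block of memory. A real implementation would more likely use an existing task scheduler or thread pool than spawn threads per call; on Linux this sketch needs to be linked with `-pthread`.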