VSE: speedup timeline drawing, and improve waveform display #115311
No reviewers
Labels
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset System
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Viewport & EEVEE
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Asset Browser Project
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Module
Viewport & EEVEE
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Severity
High
Severity
Low
Severity
Normal
Severity
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
4 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: blender/blender#115311
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "aras_p/blender:vse-draw-opt"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
TL;DR: sequencer timeline UI repainting is 3x-4x faster now, for complex timelines.
VSE timeline rendering performance overview
For large timelines (used Sprite Fright Edit for testing, 2702 strips visible), most of time is spent on drawing quads, submitting draw calls to the GPU, two triangles at a time. That is because the whole code is built in a "draw one strip completely, move to drawing next strip" fashion, and it internally has to change between drawing triangles vs drawing lines vs drawing some striped shader, etc. Additionally, blender's "imm" code does not even allow to efficiently draw more than one quad in one go, due to the fact that each quad is a triangle fan.
There were other bottlenecks too -- drawing audio waveforms, drawing retiming key shapes one key at a time, and drawing thumbnails where GPU textures are being created and destroyed on every draw (this last one is not addressed in this PR).
Approach taken in this PR to improve performance
Performance numbers teaser first! Sprite Fright Edit data set, with whole timeline visible (2702 strips), repainting the timeline UI with all overlay options (waveforms, offsets, thumbnails etc.):
90% of what the timeline strip drawing is, are either quads or quad outlines (4 lines). So build a small utility class
SeqQuadsBatch
where you can keep on doingadd_quad
oradd_wire_quad
, and calldraw
in the end. Internally it batches up to 1024 quads to draw; if you keep on adding more it just internally draws them and proceeds to add from start again.Having that batching utility class, change most of strip drawing code to just use that directly. It has to be broken up in the middle in two places: once for thumbnails (since it's no longer single colored quads, need to change the shader), and once more for locked strips (likewise, different shader to do the striped pattern). So that's overall how the code is structured now, pseudocode:
For retiming keys, change code so that instead of drawing one key at a time (i.e. telling GPU "draw this one point"), it draws many of them in one go ("draw all these points"). It creates imm batches of up to 1024 keys, when they are full it does immEnd and starts a new batch.
For waveform overlay, the changes both speed it up, but also change how it looks like, following design proposal in #115274. Previous code had complicated logic that switched between single-line drawing and triangle-strip drawing, which is not great for trying to submit larger batches to the GPU. The line part often looked weird when zoomed in, and the triangle part was sometimes displaying wrong data magnitude, due to sample interpolation wrongly using -0.5..+0.5 range instead of 0..1 (as a result, it was sometimes showing "clipped" red areas where in fact there aren't any!). Changing all of that to draw squares/quads for waveform samples is both much simpler code wise, and allows batching for the GPU, and displays waveform peaks better, when zoomed out.
What is not done in this PR
Potential future optimizations, since hey maybe let's not try to address everything in one go:
Open Questions
I was surprised that there's no efficient and convenient way to draw many quads at once via
imm
API, actually. A good question is, should something like that be built as part of imm, instead of addingSeqQuadsBatch
inside sequencer? While the helper class is very simple, it is actually not sequencer specific at all.Things like waveform drawing, by rendering each pixel column as a quad, are not "terribly efficient". While way faster than the previous code, an even faster approach would be to put all waveform preview data into a large GPU buffer (SSBO), and draw one quad for each preview waveform overlay, with a special shader that would read from said buffer and render the preview. A similar thing could be done for f-curve previews, again put data into a GPU buffer and have a shader sample it. However, this leads to a somewhat more complicated code (strip preview is no longer in just C++ code, but also GLSL). So not sure if it's worth doing. The current approach "one quad for each pixel column" sounds like a waste, but as long as GPU submission is in large enough batches, it is fast enough.
11b3cf02af
tob485388b57
Thank you for working on this!
The open questions is probably best answered by the viewport module.
@Jeroen-Bakker ^
Seems like expected changes :-) See #114265 for a related task, but then for the outliner. There I also think of drawing layers which collect the information and then send in a few batches to the GPU.
imm is very basic for drawing multiple quads you could check the widget drawing (GPU_SHADER_2D_WIDGET_BASE_INST) , which can do more than only quads. The widget shader is already there, if that isn't enough, we could check for a dedicated shader.
Video previews I am not familiar with But as they are relatively tiny I would expect to do most parts on the CPU for now. Perhaps just cache the result and draw it as a texture. As I am not familiar with the in and outs, you're free to come with your own ideas here as well.
For waveforms I expect an ssbo (one value per pixel in the current zoom level) and a special shader to draw the lines.
Oh, cool! Looking for volunteers to tackle that one? :)
This sounds a bit like overkill. Most of timeline strips are just simple flat-colored quads, but depending on {a lot of factors}, there can be many quads over one strip, all placed slightly differently. And it also might have some single-line outlines, or single-line lines atop/below some parts.
I actually started to experiment with doing things similar to how "a widget" shader is doing, thinking perhaps even going as far as making a complete "timeline strip shader" that would do all the logic in there. But then upon looking further, realized that most of it is just flat-colored rectangles, but with complicated/hairy logic driving when and where they are placed. And so adding a way to "just batch some flat rectangles, and some lines plz" code sounded like the easiest approach. Does that make sense?
The results are cached, but they are cached into CPU-side only images. Which are then, for some reason, turned into GPU textures each and every frame, drawn, and said textures destroyed immediately. But I'd rather keep this part for a separate PR.
I thought doing that initially, but then tried doing a "ehh just draw waveform as a series of quads placed next to each other" (which are already batched due to above), and turns out that made all that code completely disappear from profiling results. So this sounded easier :)
WIP: VSE: speedup timeline drawing, and improve waveform displayto VSE: speedup timeline drawing, and improve waveform displayMakes sense bordered rects with colors batch
Yes a separate PR would be appreciated.
👍
Thanks for patch, with your helper class this is even simpler than I expected :)
So far, I can see some "Z fighting" in selection:
The sound waveform line drawing was implemented by @ZedDB, probably useful for debugging purposes only. Seems, that is is still working, but at certain zoom levels the lines are lost:
I have uploaded wav file for reference.
Other than that, I think that functionally the patch is fine.
Ah, right. Besides making is possible to visualize some more synthectic waveforms like this:
It also solved the issue of indicating when there were no audio data left in the strip (however this is not an issue anymore now that the strip color is different for "out of bounds" data.
I were planning to use that file to do a automated test to check for any audio sync issues. However I didn't have the time to finish that.
I've attached the file in a .zip.
I do think being able to visualize data like that is probably quite useful as it can help the end user notice issues with the audio visially that might be lost if the waveform drawing method can't handle drawing it.
While opening the test file, I noticed a bug in the current master. Perhaps it has already been solved with this PR?
If I drag the start frame of the sound strip the drawn waveform gets offset in the wrong direction. But the audio is not offset either and stays in place. I guess this is deeper than just UI code though because the audio playback doesn't change.
(See attached video)
Good catch, will investigate.
Ah ok. I think I have an idea how to keep the "a line that's not at zero" working from the original code, without alternating between quads and lines that ends up being "not terribly great" for performance :)
In my quick test, that seemed to work fine in 3.6, but is broken in both 4.0 and 4.1, so sounds like an unrelated regression. I can try fixing that in this PR too, but maybe it would be better as a separate commit/PR by someone, for easier backporting to 4.0 if wanted.
Probably better to fix it as a separate commit, yes.
@ -0,0 +101,4 @@
draw();
}
if (quads_num == 0) {
/* Don't actually need this, but it's the only way to clear GPU_VERTBUF_DATA_UPLOADED flag. */
Do you need to clear the flag, because you are reusing the buffer? I would expect that there is some function to handle this.
Here it seems to work when I remove call to
GPU_vertbuf_attr_get_raw_data()
. Would have to look how exactly this flag is used.@ -0,0 +105,4 @@
GPUVertBufRaw attr_data;
GPU_vertbuf_attr_get_raw_data(vbo_quads, 0, &attr_data);
verts_quads = (ColorVertex *)GPU_vertbuf_get_data(vbo_quads);
Use
static_cast
for casting types, see https://wiki.blender.org/wiki/Style_Guide/C_Cpp#C.2B.2B_Type_Cast@ -313,6 +303,12 @@ static void retime_key_draw(const bContext *C, const Sequence *seq, const SeqRet
CLAMP(key_position, left_pos_min, right_pos_max);
const float alpha = SEQ_retiming_data_is_editable(seq) ? 1.0f : 0.3f;
if (*r_point_counter >= MAX_KEYS_IN_BATCH) {
Bit nitpicky, but to me this feels bit out of place here. This could be done in loop in
retime_keys_draw()
. That way you don't have to passpoint_counter
around. You can assume, thatfake_keys_draw()
draws 2 vertices, andretime_key_draw()
draws 1.@ -414,2 +415,4 @@
}
/* Get the selection here once, for faster "is key selected?" lookups later.
This is similar to SEQ_retiming_selection_get, except we want only a set. */
Can't you just pass map returnd by
SEQ_retiming_selection_get()
and usecontains()
method? This seems like unnecessary code duplication.That's what I tried initially, but SEQ_retiming_selection_get returns non-const Sequence pointers as map keys, and my C++-fu is not strong enough to find the magic incantation, how to use
contains()
when I have a const-Sequence pointer.contains_as()
with const pointer does not compile, at least on visual studio.I am not great with C++ either, but you can use non-const
SeqRetimingKey
for iteration.There is also
const_cast
which seems to work, but not sure if it is good idea to use it in this case.@ -637,3 +518,1 @@
float y_mid = (strip_ctx->bottom + strip_ctx->strip_content_top) / 2.0f;
/* The length from the middle of the strip to the top/bottom. */
float y_scale = (strip_ctx->strip_content_top - strip_ctx->bottom) / 2.0f;
/* Do not draw the bars below 2px height. */
I did not try to change this, but I have noticed, that in current main I was able to see faint noise, which can be useful to user.
Keep in mind, that assuming 32px total height, a waveform with "1px amplitude" is only -30dB, which is pretty loud. At 64px total height it would be -36dB, still quite loud. Math can be a bit off, but close enough...
I've re-added "line segments" drawing based on your and Sebastian comments wrt mostly synthetic or low frequency waveforms. Now this code, when below 2px draws a line segment instead, conceptually similar to main.
Maybe I should have mentioned, that the reason why in
draw_seq_strips()
you need ordered set of strips is, priority of selection overlay drawing, so I would look for suspect there. But from looking at code that doesn't seem to be an issue.It was, just pushed a fix (while at it, fixed text overlays being the opposite order - was mentioned on blender chat thread by someone).
I 've got 2 more minor points, otherwise this seems fine for me.
@ -333,0 +315,4 @@
static void draw_continuity(const bContext *C,
const Sequence *seq,
const SeqRetimingKey *key,
const blender::Set<const SeqRetimingKey *> *selection,
Sorry, I have noticed this only now, you should avoid passing pointers to objects, pass reference instead.
So argument would be
const blender::Set<const SeqRetimingKey *> &selection
@ -432,0 +460,4 @@
}
/* If batch is full, start a new one (fake keys add 2 points max). */
if (point_counter + 2 > MAX_KEYS_IN_BATCH) {
I could have specified this a bit more, you can have just 1 condition
if (point_counter + 3 > MAX_KEYS_IN_BATCH) {
afterretime_key_draw()
This way batch should never be overfilled, and from what I read, with
immBeginAtMost()
you can always draw less vertices.@Jeroen-Bakker Do you want to review the code as well? I think, this could be merged after next update, but I can wait for your review.
Thanks for updates, seems fine now.
I looked into the code, but seems all to be isolated in the sequencer code.
I don't see any obvious issue with the code to land. So fine by me to land.
@blender-bot build