VSE: Add anim manager #118670

Open
Richard Antalik wants to merge 52 commits from iss/blender:anim-sharing-2 into main


When playing back a composition in the VSE consisting of multiple movie
strips, there is a noticeable delay when the first frame of a movie strip
is displayed. This happens because the anim (ImBufAnim) is being loaded.
Anims consume quite a bit of memory, about 100MB per strip, so they are
freed as soon as they are not needed for drawing the preview.

This patch implements a cache for anims, so that multiple strips can reuse
this resource, and anim prefetching, which loads anims in the background.
Delays during playback are mainly reduced by prefetching. The cache is used
for resource sharing and for minimizing memory usage (about 5x in a test
file based on the Gold edit).

There are 2 classes that implement this functionality:
AnimManager interfaces with VSE code. It provides access to strip
anims through strip_anims_acquire(), which locks the anims, so they
have to be unlocked by strip_anims_release(). It implements prefetching
and freeing of "unused" anims in manage_anims(). Finally, it owns the
anim cache.

ShareableAnim is a wrapper of ImBufAnim. It implements loading and
freeing of anims at a lower level. To facilitate sharing, it does user
counting. It supports multiview setups by storing a vector of anims.
These are stored in the order in which they are assigned to the strip,
so one of the strips that use the same input file can be set up as
multiview while another is single view.
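
A minimal sketch of the user counting described above. The names `SharedResource` and `StripId` are hypothetical and this is not the patch's actual `ShareableAnim` code. It only illustrates the idea that the resource is loaded when the first strip acquires it and freed when the last strip releases it.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

/* Hypothetical stand-in for a strip identifier; the patch uses `Sequence *`. */
using StripId = int;

/* Illustrative user-counted resource; `loaded_` stands in for the real anims. */
class SharedResource {
 public:
  explicit SharedResource(std::string filepath) : filepath_(std::move(filepath)) {}

  /* Register a user; load the resource on first use. */
  void acquire(const StripId strip)
  {
    if (users_.empty()) {
      loaded_ = true; /* Real code would open the movie file here. */
    }
    users_.insert(strip);
    assert(loaded_); /* Invariant: a resource with users is always loaded. */
  }

  /* Unregister a user; free the resource once nobody uses it. */
  void release(const StripId strip)
  {
    users_.erase(strip);
    if (users_.empty()) {
      loaded_ = false; /* Real code would free the anims here. */
    }
  }

  bool is_loaded() const { return loaded_; }
  int user_count() const { return int(users_.size()); }

 private:
  std::string filepath_;
  std::set<StripId> users_;
  bool loaded_ = false;
};
```

Two strips that reference the same file would share one such object, so the file is opened once no matter how many strips use it.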

DNA Sequence::anims field is marked as deprecated.

On the user level, prefetching loads anims of all strips that are closer
than 512 frames to the current frame. Anims of all other strips are freed.
Prefetching and freeing do not happen while the user is scrubbing.
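
The 512-frame rule can be sketched as a simple distance test. This is an illustration only: `StripRange`, `PREFETCH_DISTANCE` and the helper functions are hypothetical names, and measuring the distance in both directions from the current frame is an assumption, not something the description states.

```cpp
#include <cassert>
#include <vector>

/* Hypothetical, simplified strip: only the timeline range matters here. */
struct StripRange {
  int start_frame;
  int end_frame; /* Exclusive. */
};

/* Assumed constant, taken from the description above. */
constexpr int PREFETCH_DISTANCE = 512;

/* Distance in frames from `frame` to the strip; 0 if the frame is inside it. */
inline int distance_to_strip(const StripRange &strip, const int frame)
{
  assert(strip.start_frame < strip.end_frame);
  if (frame < strip.start_frame) {
    return strip.start_frame - frame;
  }
  if (frame >= strip.end_frame) {
    return frame - strip.end_frame + 1;
  }
  return 0;
}

/* True if the strip's anims should be kept loaded (or prefetched). */
inline bool should_prefetch(const StripRange &strip, const int frame)
{
  return distance_to_strip(strip, frame) < PREFETCH_DISTANCE;
}

/* Indices of strips to prefetch; all others are candidates for freeing. */
inline std::vector<int> strips_to_prefetch(const std::vector<StripRange> &strips, const int frame)
{
  std::vector<int> result;
  for (int i = 0; i < int(strips.size()); i++) {
    if (should_prefetch(strips[i], frame)) {
      result.push_back(i);
    }
  }
  return result;
}
```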

Richard Antalik added 4 commits 2024-02-23 16:37:37 +01:00
So far, refactoring has been done.
Ultimately this was due to missing nullptr check before trying to
process the image, but this should have been caught when loading
ImBufAnims. If any is missing, cancel multiview loading and load
movie as if it was single view only.
Richard Antalik added 1 commit 2024-02-23 16:40:44 +01:00
Richard Antalik added 2 commits 2024-02-23 21:49:17 +01:00
Richard Antalik added 1 commit 2024-02-23 22:08:36 +01:00
Richard Antalik added 1 commit 2024-02-24 19:29:20 +01:00
Richard Antalik added 1 commit 2024-02-24 20:07:39 +01:00
Richard Antalik added 1 commit 2024-02-24 20:44:19 +01:00
Richard Antalik changed title from WIP: VSE: share ImBufAnim resource between strips to VSE: share ImBufAnim resource between strips 2024-02-24 20:55:46 +01:00
Richard Antalik added 1 commit 2024-03-12 09:55:56 +01:00
Details are TODO
Richard Antalik added 1 commit 2024-03-21 20:55:22 +01:00
Richard Antalik changed title from VSE: share ImBufAnim resource between strips to [WIP] VSE: share ImBufAnim resource between strips 2024-03-21 20:59:14 +01:00
Richard Antalik added 1 commit 2024-03-21 21:02:23 +01:00
Author
Member

Note to self: The manager must only prefetch anims. Strips need to acquire the anim in the main thread, because anims are freed by the depsgraph.

Richard Antalik added 4 commits 2024-05-21 17:32:19 +02:00
Richard Antalik added 1 commit 2024-05-28 22:14:24 +02:00
Richard Antalik added 1 commit 2024-05-28 22:17:24 +02:00
Richard Antalik added 2 commits 2024-05-29 00:04:32 +02:00
Richard Antalik added 1 commit 2024-05-29 01:10:59 +02:00
Richard Antalik added 2 commits 2024-05-30 00:49:24 +02:00
Richard Antalik added 2 commits 2024-06-11 15:30:07 +02:00
Hans Goudey changed title from [WIP] VSE: share ImBufAnim resource between strips to WIP: VSE: share ImBufAnim resource between strips 2024-06-12 14:54:42 +02:00
Richard Antalik added 1 commit 2024-06-12 15:47:46 +02:00
Richard Antalik added 4 commits 2024-07-14 09:26:05 +02:00
Richard Antalik added 5 commits 2024-07-18 09:47:53 +02:00
Richard Antalik added 2 commits 2024-07-18 10:16:02 +02:00
Richard Antalik changed title from WIP: VSE: share ImBufAnim resource between strips to VSE: share ImBufAnim resource between strips 2024-07-18 10:32:22 +02:00
Richard Antalik requested review from Sergey Sharybin 2024-07-18 10:47:03 +02:00
Richard Antalik requested review from Aras Pranckevicius 2024-07-18 10:47:04 +02:00
Richard Antalik added 1 commit 2024-07-18 11:10:04 +02:00
Richard Antalik added 1 commit 2024-07-18 11:49:14 +02:00
Richard Antalik changed title from VSE: share ImBufAnim resource between strips to VSE: Add anim manager 2024-07-18 11:50:47 +02:00
Richard Antalik added 1 commit 2024-07-18 12:37:46 +02:00
Richard Antalik added 1 commit 2024-07-18 12:42:47 +02:00
fix warns
All checks were successful
buildbot/vexp-code-patch-lint Build done.
buildbot/vexp-code-patch-darwin-x86_64 Build done.
buildbot/vexp-code-patch-darwin-arm64 Build done.
buildbot/vexp-code-patch-linux-x86_64 Build done.
buildbot/vexp-code-patch-windows-amd64 Build done.
buildbot/vexp-code-patch-coordinator Build done.
6b8f66a566

Does all of this effectively resolve issue #118155?

Aras Pranckevicius reviewed 2024-07-19 09:57:15 +02:00
@ -31,10 +31,13 @@ struct bSound;
#ifdef __cplusplus
namespace blender::seq {
struct MediaPresence;
struct AnimManager;

Visual Studio emits a bunch of warnings about this: it is declared here as struct AnimManager, but in the actual file it is class AnimManager.

Author
Member

Changed this to class. I am not sure if there is any difference in semantic meaning and which one would be more appropriate.

iss marked this conversation as resolved
Aras Pranckevicius reviewed 2024-07-19 10:12:17 +02:00
@ -0,0 +308,4 @@
for (int i : range) {
Sequence *seq = strips[i];
ShareableAnim &sh_anim = this->cache_entry_get(scene, seq);
sh_anim.mutex->lock();

While trying to test this out, I get a hang here. My test file contained two video tracks on top of each other, both referencing the same file. The hang happens when starting preview playback.

I'm on Windows. What seems to happen is that the ShareableAnim mutex being locked here is actually held by some thread that is already gone, i.e. some thread was apparently spawned, locked the mutex, and finished without unlocking it.

Author
Member

remove_duplicates_for_parallel_load() is supposed to prevent this. Previously this was handled by try_lock method, but that is not great.

I am not able to reproduce this issue. Can you share .blend file?


For me this happens in "scene3" of this file, does not happen in "scene2" or "scene1" of the same file. What is different in scene3, is that the same source video file is used multiple times at once (on two different channels). Not sure if related, but maybe something like that triggers the deadlock. I only tested on windows so far though.


My guess for what happens is:

  1. parallel_load_anims is called with two strips that are visible at once, but they share the same anim object
  2. they get put onto two different worker threads by threading::parallel_for inside of parallel_load_anims
  3. one of them manages to do sh_anim.mutex.lock()
  4. the other can not, since it is locked by the other worker thread!
  5. so it sits there indefinitely, waiting for the mutex

My theory could just as well be wrong of course.
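
For illustration, the filtering that the thread says should prevent steps 1 and 2 (`remove_duplicates_for_parallel_load()`) could look roughly like the sketch below. The real implementation is not shown in this thread. The `Strip` struct and the use of the file path as the cache key are assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

/* Hypothetical, simplified strip: the file path stands in for the cache key. */
struct Strip {
  int id;
  std::string filepath;
};

/* Keep only the first strip for each distinct file path, preserving order. */
inline std::vector<Strip> remove_duplicates_by_path(const std::vector<Strip> &strips)
{
  std::unordered_set<std::string> seen;
  std::vector<Strip> unique;
  for (const Strip &strip : strips) {
    /* `insert` reports whether the key was newly added. */
    if (seen.insert(strip.filepath).second) {
      unique.push_back(strip);
    }
  }
  return unique;
}
```

With such a filter in front of the parallel loop, no two worker threads would be handed strips that map to the same shared anim.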

Author
Member

I can not reproduce this unfortunately. Stepping through the code, the strips are filtered properly, so they are unique. So step 2 should not happen.

I can double check on Windows, but platform should not matter.

Author
Member

Good news: I was able to repro on Windows. I will update my build environment and then debug this.

Author
Member

This was actually a case of double unlocking, which should be fixed now. But it still somehow crashed with MSVC on unlocking, while the mutex was locked. I will try to test a release build and make a fresh debug build just in case, as it does not make sense to me. I will keep this thread open in the meantime.

I would expect a hang/error on Linux as well, but maybe it's not that strict.

Author
Member

OK, so I did a bit of investigation. This happens because a thread initializes and locks the data, with the intention that the lock is held until the main thread does rendering (preview). The main thread would then release the lock. However, according to the std::mutex reference, this may result in undefined behavior.
To resolve this issue, AnimManager::freeing_mutex was added, which gets locked by the main thread when prefetching or freeing data. While this lock is held, ShareableAnim::mutex is locked by the main thread.

Technically, this should not be necessary, since prefetch always runs before any data is freed. But I would like to have a guarantee that data won't be freed because of some IO delays or whatnot.
It's weird that this use case is not supported in a nicer way...
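
For context: `std::mutex::unlock()` requires that the calling thread owns the mutex, so locking in a worker and unlocking in the main thread is undefined behavior. The sketch below shows one generic way to get a lock without thread ownership, using a flag guarded by a condition variable. `HandoffLock` is a hypothetical name, and this is not what the patch does (the patch adds `AnimManager::freeing_mutex` instead).

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

/* Illustrative lock that, unlike `std::mutex`, has no owning thread. */
class HandoffLock {
 public:
  /* Block until the lock is free, then take it. */
  void lock()
  {
    std::unique_lock<std::mutex> guard(mutex_);
    released_.wait(guard, [this] { return !locked_; });
    locked_ = true;
  }

  /* Release the lock; may be called from any thread. */
  void unlock()
  {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      assert(locked_); /* Releasing a free lock is a usage error. */
      locked_ = false;
    }
    released_.notify_one();
  }

  bool is_locked()
  {
    std::lock_guard<std::mutex> guard(mutex_);
    return locked_;
  }

 private:
  std::mutex mutex_; /* Only ever locked and unlocked by the same thread. */
  std::condition_variable released_;
  bool locked_ = false;
};

/* A worker takes the lock and exits; the calling thread releases it later. */
inline bool lock_in_worker_unlock_in_caller(HandoffLock &lock)
{
  std::thread worker([&lock] { lock.lock(); });
  worker.join();
  const bool was_locked = lock.is_locked();
  lock.unlock();
  return was_locked && !lock.is_locked();
}
```

The internal `std::mutex` is only held for the duration of each call, so its ownership rule is never violated even though the logical lock crosses threads.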

iss marked this conversation as resolved
Aras Pranckevicius reviewed 2024-07-19 10:18:02 +02:00
@ -0,0 +1,74 @@
/* SPDX-FileCopyrightText: 2024 Blender Authors

Not related to this particular place, just wanted to raise it as an issue: I consistently get a crash when trying to drag and drop any video file into the VSE timeline, in a default empty video project.

What seems to happen is some sort of memory corruption when cleaning up some thread (not sure which one, as it already finished executing any user code and is about to wrap up at CRT/OS level). Another Blender related thread is doing sequencer_drag_drop.cc prefetch_data_fn, this bit:

    g_drop_coords.strip_len = IMB_anim_get_duration(anim, IMB_TC_NONE);
    short frs_sec;
    float frs_sec_base;
    if (IMB_anim_get_fps(anim, true, &frs_sec, &frs_sec_base)) {
      g_drop_coords.playback_rate = float(frs_sec) / frs_sec_base;
    }
    else {
      g_drop_coords.playback_rate = 0;
    }
    IMB_free_anim(anim); // <--- here
Author
Member

Looking at the code, this should be unrelated to this PR. Unfortunately, I am not able to reproduce this issue.

Technically this could or maybe should have been handled by anim manager as well.


> Looking at the code, this should be unrelated to this PR

Well I dunno. It consistently happens for me in this PR, and consistently does not happen without this PR. I'm testing on windows btw, did not test on some other OS yet.


Ah wait no, this also happens on main, but only with some specific file that I have. I was stupidly testing main with some different file, but this branch with that problematic file. Sorry! Move on, nothing to see here! (I'll try to investigate why this happens on main)

aras_p marked this conversation as resolved
Aras Pranckevicius reviewed 2024-07-19 12:03:23 +02:00
@ -0,0 +21,4 @@
public:
blender::Vector<ImBufAnim *> anims; /* Ordered by view_id. */
blender::Set<Sequence *> users;
std::unique_ptr<std::mutex> mutex = std::make_unique<std::mutex>();

Wondering why this mutex is std::unique_ptr<std::mutex> and not just a mutex member, i.e. std::mutex?

Author
Member

Probably I have tried to store this in some container which required class to be copyable / movable

iss marked this conversation as resolved
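
A small sketch of the constraint mentioned in this conversation: `std::mutex` is neither copyable nor movable, so a class with a direct mutex member cannot be stored in containers that relocate their elements, while a `std::unique_ptr<std::mutex>` member keeps the class movable. The struct names are hypothetical and `std::vector` stands in for whichever container was used in the patch.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <type_traits>
#include <vector>

/* Entry with a direct mutex member: neither copyable nor movable. */
struct EntryWithMutex {
  std::mutex mutex;
};

/* Entry holding the mutex behind a pointer: movable, still not copyable. */
struct EntryWithMutexPtr {
  std::unique_ptr<std::mutex> mutex = std::make_unique<std::mutex>();
};

static_assert(!std::is_move_constructible<EntryWithMutex>::value, "not movable");
static_assert(!std::is_copy_constructible<EntryWithMutex>::value, "not copyable");
static_assert(std::is_move_constructible<EntryWithMutexPtr>::value, "movable");
static_assert(!std::is_copy_constructible<EntryWithMutexPtr>::value, "not copyable");

/* Growing a vector relocates its elements, which requires movable entries. */
inline std::vector<EntryWithMutexPtr> make_entries(const int count)
{
  std::vector<EntryWithMutexPtr> entries;
  for (int i = 0; i < count; i++) {
    entries.emplace_back();
  }
  return entries;
}
```

The trade-off is one extra heap allocation and pointer indirection per entry. Containers that never relocate elements, such as node-based maps, would accept a plain `std::mutex` member.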
Author
Member

> Does all of this effectively resolve issue #118155?

As long as you don't overwhelm the prefetching thread, then yes. A 100% solution would be to load all data in advance and never free it. So technically this is a compromise that should cover reasonable scenarios.

Thanks for the review. I see there are still issues; I will have to look at these after next week, as I will have time off next week.

Sergey Sharybin requested review from Francesco Siddi 2024-07-19 14:28:42 +02:00

@blender-bot package

Member


Package build started. [Download here](https://builder.blender.org/download/patch/PR118670) when ready.

I don't see a macOS build. Should it build again?


@blender-bot package

Member


Package build started. [Download here](https://builder.blender.org/download/patch/PR118670) when ready.
Iliya Katushenock reviewed 2024-07-22 19:57:57 +02:00
@ -0,0 +24,4 @@
std::unique_ptr<std::mutex> mutex = std::make_unique<std::mutex>();
void release_from_strip(Sequence *seq);
void release_from_all_strips(void);

release_from_all_strips(void) -> release_from_all_strips()

iss marked this conversation as resolved
@ -0,0 +42,4 @@
/**
* Load anims used by strips and lock them so they won't be freed.
*/
void strip_anims_acquire(const Scene *scene, blender::Vector<Sequence *> strips);

blender::Vector<Sequence *> &r_strips

Author
Member

strip_anims_acquire() does need to modify strips, but it must do so locally. That is why the argument is passed by value here, and not by reference.

mod_moder marked this conversation as resolved
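
The by-value choice discussed in this conversation can be shown with a generic example (the function `count_unique` is hypothetical and unrelated to the patch). The parameter is a private copy, so the function can reorder and shrink it freely while the caller's vector stays unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

/* `values` is a copy: sorting and deduplicating it does not affect the caller. */
inline int count_unique(std::vector<int> values)
{
  std::sort(values.begin(), values.end());
  values.erase(std::unique(values.begin(), values.end()), values.end());
  return int(values.size());
}
```

A non-const reference parameter would let the same modifications leak back to the caller, which is what the author wants to avoid.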
@ -0,0 +52,4 @@
/**
* Get anims used by `seq`.
*/
blender::Vector<ImBufAnim *> &strip_anims_get(const Scene *scene, const Sequence *seq);

Result should be by value.

iss marked this conversation as resolved
@ -0,0 +173,4 @@
}
if (is_multiview(scene, seq)) {
blender::Vector<ImBufAnim *> new_anims = multiview_anims_get(

Should be const

iss marked this conversation as resolved
@ -0,0 +187,4 @@
}
}
for (int i = 0; i < this->anims.size(); i++) {

for (const int i : this->anims.index_range())

iss marked this conversation as resolved
@ -0,0 +222,4 @@
return false;
});
blender::Vector<Sequence *> strips_sorted = strips.as_span();

strips_sorted = strips;

Author
Member

Can't "cast" VectorSet to Vector.

mod_moder marked this conversation as resolved
@ -0,0 +305,4 @@
strips = remove_duplicates_for_parallel_load(scene, strips);
threading::parallel_for(strips.index_range(), 1, [&](const IndexRange range) {
for (int i : range) {

const int i : range

iss marked this conversation as resolved
@ -0,0 +310,4 @@
ShareableAnim &sh_anim = this->cache_entry_get(scene, seq);
sh_anim.mutex->lock();
sh_anim.acquire_anims(scene, seq);

Just for the record, it's much better to use EnumerableThreadSpecific to accumulate elements from different threads and, at the end, move all of them into class fields, instead of having a mutex just for that.

Author
Member

I wasn't familiar with this concept. But if I understand it correctly (I would use EnumerableThreadSpecific<Sequence *>), it would not eliminate the need for a mutex. This is because ShareableAnim is shared between "instances" of Sequence *.
Also, ShareableAnim::release_from_strip() may be called at any time.

@ -372,0 +378,4 @@
AnimManager *manager = seq_anim_manager_ensure(SEQ_editing_get(scene));
manager->strip_anims_acquire(scene, seq);
blender::Vector<ImBufAnim *> anims = manager->strip_anims_get(scene, seq);
int count = anims.size();

const int count

iss marked this conversation as resolved
@ -1239,1 +1248,3 @@
BLI_listbase_count_at_most(&seq->anims, totfiles + 1) == totfiles;
std::min(anim_count, totfiles + 1) == totfiles;
if (anims.size() == 0) {

is_empty()

iss marked this conversation as resolved
Richard Antalik added 2 commits 2024-08-03 06:38:20 +02:00
Address inlines
Some checks failed
buildbot/vexp-code-patch-lint Build done.
buildbot/vexp-code-patch-darwin-x86_64 Build done.
buildbot/vexp-code-patch-darwin-arm64 Build done.
buildbot/vexp-code-patch-linux-x86_64 Build done.
buildbot/vexp-code-patch-windows-amd64 Build done.
buildbot/vexp-code-patch-coordinator Build done.
d8f70df88e

@blender-bot package

Member


Package build started. [Download here](https://builder.blender.org/download/patch/PR118670) when ready.

Not sure if I'm missing something, but when adding video strips to the timeline they do not display in the viewer.

Author
Member

> Not sure if I'm missing something, but when adding video strips to the timeline they do not display in the viewer.

Hmm I guess I did not test my changes :( But I have a clue which one caused this.
Edit: nope, turns out it was me being stupid and writing if (anims.is_empty() == 0)
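
To spell out why that condition is wrong: comparing an emptiness check with `0` inverts it. The sketch uses `std::vector::empty()` as a stand-in for `blender::Vector::is_empty()`, and the function names are hypothetical. Presumably the bug came from replacing `size() == 0` with `is_empty()` while leaving the `== 0` in place.

```cpp
#include <cassert>
#include <vector>

/* Buggy: `empty() == 0` is true when the container is NOT empty. */
inline bool has_no_anims_buggy(const std::vector<int> &anims)
{
  return anims.empty() == 0;
}

/* Intended check. */
inline bool has_no_anims(const std::vector<int> &anims)
{
  return anims.empty();
}
```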

Richard Antalik added 2 commits 2024-08-05 06:59:18 +02:00
Fix incorrect condition
All checks were successful
buildbot/vexp-code-patch-lint Build done.
buildbot/vexp-code-patch-darwin-x86_64 Build done.
buildbot/vexp-code-patch-darwin-arm64 Build done.
buildbot/vexp-code-patch-linux-x86_64 Build done.
buildbot/vexp-code-patch-windows-amd64 Build done.
buildbot/vexp-code-patch-coordinator Build done.
4a2aa06665
Author
Member

@blender-bot package

Member


Package build started. [Download here](https://builder.blender.org/download/patch/PR118670) when ready.

While testing I got a couple of crashes, which I think were unrelated (and I can't repro). Overall it seems to work better than before.

Aras Pranckevicius reviewed 2024-08-06 15:08:52 +02:00
@ -0,0 +334,4 @@
this->prefetch_thread.join();
}
else {
this->prefetch_thread = std::thread(&AnimManager::free_unused_and_prefetch_anims, this, scene);

This ends up creating and shutting down a whole OS thread, basically for each sequencer frame that is being processed. Maybe that is not a big problem, but feels a bit like a waste. Wouldn't it be better to have one "prefetch thread" that does all this "free unused and prefetch anims", instead of spawning a new one many times per second?

Author
Member

Hmmmm I guess I could make a persistent thread that would be owned by class. Not sure how this is done in cpp, but I hope I don't need to setup signalling for this.
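
One common way to build such a persistent, class-owned thread in standard C++ is sketched below. It does rely on signalling, in the form of a condition variable. `PersistentWorker` and its methods are hypothetical names and the code is not taken from the patch.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

/* Illustrative worker that owns one long-lived thread and runs submitted jobs. */
class PersistentWorker {
 public:
  PersistentWorker() : thread_([this] { this->run(); }) {}

  ~PersistentWorker()
  {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      stop_ = true;
    }
    wake_.notify_all();
    thread_.join();
  }

  /* Replace any pending job with `job`; only the latest request is kept. */
  void request(std::function<void()> job)
  {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      pending_ = std::move(job);
    }
    wake_.notify_all();
  }

  /* Block until no job is pending or running. */
  void wait_idle()
  {
    std::unique_lock<std::mutex> guard(mutex_);
    idle_.wait(guard, [this] { return !pending_ && !busy_; });
  }

 private:
  void run()
  {
    std::unique_lock<std::mutex> guard(mutex_);
    while (true) {
      wake_.wait(guard, [this] { return stop_ || pending_; });
      if (stop_) {
        return;
      }
      std::function<void()> job = std::move(pending_);
      pending_ = nullptr;
      assert(job); /* A wake-up without `stop_` means a job was pending. */
      busy_ = true;
      guard.unlock(); /* Run the job without holding the lock. */
      job();
      guard.lock();
      busy_ = false;
      idle_.notify_all();
    }
  }

  std::mutex mutex_;
  std::condition_variable wake_;
  std::condition_variable idle_;
  std::function<void()> pending_;
  bool busy_ = false;
  bool stop_ = false;
  std::thread thread_; /* Declared last so the other members exist before it starts. */
};
```

The per-frame call would then only hand a closure to `request()` instead of constructing and joining a `std::thread` each time. Keeping a single pending slot means a slow job causes later requests to overwrite each other rather than queue up.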

Richard Antalik added 1 commit 2024-08-08 18:59:58 +02:00
Richard Antalik added 1 commit 2024-08-12 19:17:14 +02:00
Aras Pranckevicius reviewed 2024-08-26 19:38:53 +02:00
@ -0,0 +298,4 @@
for (const int i : range) {
Sequence *seq = strips[i];
ShareableAnim &sh_anim = this->cache_entry_get(scene, seq);
sh_anim.mutex.lock();

With the latest commit on this PR, I am getting a C++ exception "resource deadlock would occur" (seemingly the mutex is already locked by the same thread? but not 100% sure). This is on Windows, on Gold (gold-edit-v804; I forget where I got it from, Francesco or Sergey gave it to me), when entering frame 2875.

What happens on that frame is: there are two strips, each referring to the same movie file, and each with a different Speed effect. Then there's another movie strip starting on top of all of them.

[image]
Author
Member

Well at least I should have these files, so I should be able to simulate this case. Will have to copy files to SSD to check on Windows. I am too lazy to setup file sharing server.


While I was not able to test the performance improvement on my Gold edit (see comment about the C++ exception above), on another file that I have with several video tracks each cut into many pieces (referring to the same file), I see a performance degradation compared to main.

Playback preview on main (see cache visualization; gaps are where the playback "cannot keep up"):
[image]

whereas in this PR, while the framerate is much more stable, it is also much lower:
[image]

Render time has similarly regressed. Rendering the same sequence on main: 42sec, on this PR: 71sec. I have not investigated exactly why.

@iss do you have some performance numbers from your side?

Author
Member

> While I was not able to test performance improvement on my Gold edit (see comment about C++ exception above), on another file that I have with several video tracks each cut into many pieces (referring to the same file), I see performance degradation compared to main.
>
> Playback preview on main (see cache visualization - gaps is where the playback "cannot keep up"):
> [image]
>
> whereas in this PR, while the framerate is much more stable, it is also much lower:
> [image]
>
> Render time has similarly regressed. Rendering the same sequence on main: 42sec, on this PR: 71sec. I have not exactly investigated why.
>
> @iss do you have some performance numbers from your side?

I can reproduce a very, very slightly worse playback rate with this PR, but either your gold-edit-v804 is different from mine or your machine is worse than mine (AMD 5950x CPU), because I can play that portion of the timeline with nearly 0 frame drops.

This PR does not improve performance. It only resolves the hiccup when decoding the first frame of the next strip. This is best tested on movies that play well normally. For example, here is the same movie with frequent cuts:

[video: 2024-08-27 00-32-14.mp4]

Or a more normal scenario with different files:

[video: 2024-08-27 00-42-05.mp4]

I have noticed that there are still some hiccups; I am not sure where these are coming from. They are much less noticeable with this PR. I normally test performance with the cache disabled, since then I don't need to worry about clearing it. Will look into that.


> I can reproduce very very slightly worse playback rate with this PR, but either your gold-edit-v804 is different than
> mine or your machine is worse than mine (AMD 5950x CPU), because I can play that portion of timeline with nearly 0 frame drops.

My screenshots of playback perf were *not* from Gold -- on Gold I get a crash/deadlock/exception; the screenshots were from another file. Still trying to make it small enough to be shared so that I can show the performance issue.

~~However, I am able to reproduce the crash/exception similar to what I get on Gold, with a smaller test case (again, this is Windows, VS2022). In this attached file, switch to `scC-speed-fx` scene, press play. For me it crashes as soon as the strips start (frame 23). There are two strips, each with Speed effect, and each refer to the same source file. And then there's another strip on top, using a different file. "resource deadlock would occur" C++ exception is thrown from `anim_manager.cc` line 301.~~

**Update**: found the cause! See comments added in `anim_filepath_get` and `strip_anims_release`.

However, with the crash/exception locally fixed, on `scC-speed-fx` scene in this attached file I'm getting way more playback hiccups compared to main branch. Main branch (see how many frames got rendered - there are some gaps, so it is not ideal):

![](/attachments/db3ce408-9b98-4536-8d42-1d1668c8bca7)

However in this PR even fewer frames are rendered:

![](/attachments/0b5bff64-4b5d-4b29-b633-ad450dffc16d)
Aras Pranckevicius reviewed 2024-08-27 19:35:31 +02:00
@ -0,0 +43,4 @@
char *r_filepath)
{
if (seq->strip == nullptr || seq->strip->stripdata == nullptr) {
return;

This will leave result filepath uninitialized (containing garbage or _previous_ data) in this case, which was part of the reason why I was getting deadlock/exception in my tests. Suggest adding `r_filepath[0] = 0;` before return.
iss marked this conversation as resolved
Aras Pranckevicius reviewed 2024-08-27 19:37:10 +02:00
@ -0,0 +363,4 @@
void AnimManager::strip_anims_release(const Scene *scene, blender::Vector<Sequence *> strips)
{
strips = remove_duplicates_for_parallel_load(scene, strips);

This uses different logic compared to `strip_anims_acquire`, which was part of the reason why I was getting deadlocks/exceptions in my tests: acquire filters for "movies only" and then for duplicates, whereas this one only filters for duplicates.

Which does mean that e.g. when you have movies with Speed effects on top, they are not detected as needing acquire/release. Not sure if that is a problem or not.
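One way to keep the two code paths in agreement is to route both through a single filter. The sketch below is only an illustration of that idea: `StripType`, the simplified `Sequence` struct, the `locks` counter (standing in for the real mutex) and all function names are hypothetical, and `std::vector` is used in place of `blender::Vector` to keep the example self-contained.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified strip description.
enum class StripType { Movie, Speed, Image };
struct Sequence {
  StripType type;
  std::string filepath;
  int locks;  // Stand-in for the real lock state.
};

// Single filter shared by acquire and release: movies only, one strip per file.
static std::vector<Sequence *> movie_strips_unique_by_file(const std::vector<Sequence *> &strips)
{
  std::vector<Sequence *> result;
  std::unordered_set<std::string> seen_files;
  for (Sequence *seq : strips) {
    if (seq->type != StripType::Movie) {
      continue;
    }
    if (seen_files.insert(seq->filepath).second) {
      result.push_back(seq);
    }
  }
  return result;
}

void strip_anims_acquire_sketch(const std::vector<Sequence *> &strips)
{
  for (Sequence *seq : movie_strips_unique_by_file(strips)) {
    seq->locks++;
  }
}

void strip_anims_release_sketch(const std::vector<Sequence *> &strips)
{
  // Same filter as acquire, so exactly the strips that were locked get unlocked.
  for (Sequence *seq : movie_strips_unique_by_file(strips)) {
    seq->locks--;
  }
}
```

Whether effect strips such as Speed should be resolved to their input movies before filtering is left open here, as in the comment above.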
Aras Pranckevicius reviewed 2024-08-27 19:38:36 +02:00
@ -342,3 +345,4 @@
manager->strip_anims_release(scene, seq);
return float(frs_sec) / frs_sec_base;
}
break;

Isn't this code path (right before break) in `SEQ_time_sequence_get_fps` potentially missing a `strip_anims_release` call?
Author
Member

Yep :(
This is so bad indeed...
iss marked this conversation as resolved
Aras Pranckevicius reviewed 2024-08-27 19:40:27 +02:00
@ -613,3 +523,1 @@
sanim = static_cast<StripAnim *>(seq->anims.first);
if ((!sanim) || (!sanim->anim)) {
if (anims.size() == 0) {

If the anims size is zero, will the return here in `SEQ_add_reload_new_file` leave the anims locked? I.e. it is missing a `strip_anims_release` call.
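Both this early return and the one in `SEQ_time_sequence_get_fps` are the kind of leak a scope guard would rule out. The following is a rough sketch of that pattern, not code from the PR: `AnimManagerSketch`, `ScopedStripAnims`, `reload_sketch` and the `locks` counter are all hypothetical stand-ins, and the acquire/release signatures are simplified.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in; `locks` replaces the real lock state.
struct Sequence {
  int locks;
};

// Minimal stand-in for the manager, with simplified signatures.
class AnimManagerSketch {
 public:
  void strip_anims_acquire(const std::vector<Sequence *> &strips)
  {
    for (Sequence *seq : strips) {
      seq->locks++;
    }
  }
  void strip_anims_release(const std::vector<Sequence *> &strips)
  {
    for (Sequence *seq : strips) {
      seq->locks--;
    }
  }
};

// RAII guard: acquires in the constructor, releases in the destructor.
class ScopedStripAnims {
  AnimManagerSketch &manager_;
  std::vector<Sequence *> strips_;

 public:
  ScopedStripAnims(AnimManagerSketch &manager, std::vector<Sequence *> strips)
      : manager_(manager), strips_(std::move(strips))
  {
    manager_.strip_anims_acquire(strips_);
  }
  ~ScopedStripAnims()
  {
    manager_.strip_anims_release(strips_);
  }
  ScopedStripAnims(const ScopedStripAnims &) = delete;
  ScopedStripAnims &operator=(const ScopedStripAnims &) = delete;
};

// Every return path releases the anims, including the early one.
bool reload_sketch(AnimManagerSketch &manager, Sequence *seq, bool has_anims)
{
  ScopedStripAnims guard(manager, {seq});
  if (!has_anims) {
    return false;  // Guard destructor runs here and releases.
  }
  return true;
}
```

The trade-off is one more small class, in exchange for not having to audit each `return` and `break` by hand.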
Author
Member

It will indeed.
iss marked this conversation as resolved
Aras Pranckevicius reviewed 2024-08-27 19:47:33 +02:00
@ -0,0 +20,4 @@
class ShareableAnim {
public:
blender::Vector<ImBufAnim *> anims; /* Ordered by view_id. */
blender::Set<Sequence *> users;

Overall it "feels like" usage of the anim manager needs a lot of (cumbersome, error prone) acquiring and releasing.

I'm wondering if all of that is strictly needed, and like wouldn't it be better to remove all the "users" tracking, and instead replace with "last needed timestamp". Something along the lines:

  • Every "tick/update" of the anim manager, increment a 64 bit integer, which is "logical time".
  • In ShareableAnim, have 64 bit integer for "last used time".
  • Whenever a ShareableAnim is created, used or updated in any way, set its last used time to the current time from anim manager.
  • free_unused_anims (more like "free old anims") then looks at anims where their last used time is older than N ticks/updates.

I think this would make it so that all of these acquire/release calls (and mutex in ShareableAnim) would not be actually needed? Or is there some other reason for having them?
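To make the idea concrete, here is a rough single-threaded sketch of the "last used time" scheme from the list above. All names (`AnimCacheSketch`, `ShareableAnimSketch`, `use`, `free_old_anims`) are made up for illustration, the anim payload is omitted, and the sketch deliberately says nothing about threading, so it does not by itself show whether the mutex can go away.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical cache entry: only the timestamp, the anim payload is omitted.
struct ShareableAnimSketch {
  std::uint64_t last_used_time = 0;
};

class AnimCacheSketch {
  std::uint64_t time_ = 0;  // "Logical time", incremented once per manager update.
  std::unordered_map<std::string, ShareableAnimSketch> anims_;

 public:
  void tick()
  {
    time_++;
  }

  // Creating or using an anim stamps it with the current logical time.
  ShareableAnimSketch &use(const std::string &filepath)
  {
    ShareableAnimSketch &anim = anims_[filepath];
    anim.last_used_time = time_;
    return anim;
  }

  // Frees anims that were not used during the last `max_age` ticks.
  void free_old_anims(std::uint64_t max_age)
  {
    for (auto it = anims_.begin(); it != anims_.end();) {
      if (time_ - it->second.last_used_time > max_age) {
        it = anims_.erase(it);
      }
      else {
        ++it;
      }
    }
  }

  bool contains(const std::string &filepath) const
  {
    return anims_.find(filepath) != anims_.end();
  }

  std::size_t size() const
  {
    return anims_.size();
  }
};
```

In this form no per-strip user set is needed: an anim stays alive only because something keeps touching it.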
Author
Member

You still need a mutex, because freeing happens in a thread. There is no guarantee that `AnimManager::prefetch_thread` will be done before you start rendering.

The scheme is that `SEQ_render_give_ibuf()` requests the anims to be loaded before rendering, locks them so nothing else can touch them, renders the whole frame, and finally releases the anims so they can be freed.

If freeing could be done in the main thread, it would be much better. I did this in a thread because I think you pointed out that it takes quite a bit of time. I did not measure it myself, though. I think I will measure it and consider whether it's worth having it in the main thread.

Technically, you could use a "free oldest anim" scheme. Currently it frees all anims that are not to be prefetched. The function is 17 lines long, so I don't feel much urge to change it.

As for user counting, some areas of code still use `SEQ_relations_sequence_free_anim()` for whatever reason. While this is used, I need to keep track of the strips that use an anim. In theory, you can just load an anim and forget about it, but I intentionally did not try to change this. If a load-and-forget scheme was used, user counting could be dropped, and my intention is to do that.
But this isn't an obvious source of potential errors and doesn't add too much complexity, so I did not want to do it in this PR.
Aras Pranckevicius requested changes 2024-08-27 19:48:31 +02:00
Aras Pranckevicius left a comment
Member

Added comments - some about overall design, some suggestions how to fix the actual deadlock/crash/exception that I'm seeing in my tests on Windows
Richard Antalik added 1 commit 2024-09-02 09:00:45 +02:00
Richard Antalik added 1 commit 2024-09-02 09:51:53 +02:00
Richard Antalik added 1 commit 2024-09-02 10:18:32 +02:00
Author
Member

> > I can reproduce very very slightly worse playback rate with this PR, but either your gold-edit-v804 is different than
> > mine or your machine is worse than mine (AMD 5950x CPU), because I can play that portion of timeline with nearly 0 frame drops.

I did a "quick" investigation. It turns out that when I comment out `anim_manager->manage_anims()`, the performance is good. However, this function takes about 5 us to execute, and `AnimManager::free_unused_and_prefetch_anims()` takes "only" 0.3 ms. There is thread creation in between these functions, but that takes "only" 0.4 ms. This is quite bizarre. I think I need a flame graph.

Edit: I was so blind that I did not notice that when I comment out `anim_manager->manage_anims()`, the movie for strips with the speed effect was not loaded. So disregard this comment.
Author
Member

Ok, I have found the issue: both strips with the speed effect share 1 anim, but they need to seek to different positions. Because of this, ffmpeg cannot do sequential decoding. I did not foresee this issue, and it is quite limiting.

The easy solution would be to drop the shareable aspect in this PR. There are potential memory savings, but in practice these wouldn't be huge anyway.
The hard and likely ugly solution would be to detect this situation and allocate multiple anims for concurrent strips.

Assuming I take the easy solution, the question is whether it is worth keeping anims and strips decoupled at runtime. It did clean up the code where loading was done, by abstracting multiview handling away. That can be abstracted while anims are owned by the strip as well. In fact this PR did that at first, but it was simpler not to do it.
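For reference, the "hard" option could look roughly like the sketch below: strips that use the same file and overlap in time get different anim slots, while non-overlapping strips keep sharing one. This is an assumption-laden illustration, not the PR's design: `StripRange`, `anim_slot` and `assign_anim_slots` are hypothetical names, frame ranges are treated as half-open, and effects that change which source frames a strip reads (such as Speed) are ignored.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical strip description: frame range is [start, end).
struct StripRange {
  std::string filepath;
  int start;
  int end;
  int anim_slot;  // Output: which anim instance of `filepath` this strip uses.
};

// Greedy interval partitioning per file: a strip reuses the first slot whose
// previous strip has already ended, otherwise a new slot (anim) is allocated.
void assign_anim_slots(std::vector<StripRange> &strips)
{
  std::vector<StripRange *> by_start;
  for (StripRange &strip : strips) {
    by_start.push_back(&strip);
  }
  std::sort(by_start.begin(), by_start.end(), [](const StripRange *a, const StripRange *b) {
    return a->start < b->start;
  });

  // Per file: end frame of the last strip assigned to each slot.
  std::map<std::string, std::vector<int>> slot_ends;
  for (StripRange *strip : by_start) {
    std::vector<int> &ends = slot_ends[strip->filepath];
    std::size_t slot = 0;
    while (slot < ends.size() && ends[slot] > strip->start) {
      slot++;
    }
    if (slot == ends.size()) {
      ends.push_back(strip->end);
    }
    else {
      ends[slot] = strip->end;
    }
    strip->anim_slot = int(slot);
  }
}
```

The number of slots per file then equals the maximum number of strips of that file that are on screen at the same time, so the memory saving from sharing is kept wherever strips do not overlap.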
This pull request has changes conflicting with the target branch.
  • source/blender/editors/space_sequencer/sequencer_thumbnails.cc
  • source/blender/makesdna/DNA_sequence_types.h
  • source/blender/sequencer/intern/render.cc
  • source/blender/sequencer/intern/render.hh
  • source/blender/sequencer/intern/sequencer.cc
  • source/blender/sequencer/intern/utils.hh
