After looking into task isolation issues with Sergey, we couldn't find the
reason behind the deadlocks that we are getting in T87938 and a Sprite Fright
file involving motion blur renders.
There is no apparent place where we adding or waiting on tasks in a task group
from different isolation regions, which is what is known to cause problems. Yet
it still hangs. Either we do not understand some limitation of TBB isolation,
or there is a bug in TBB, but we could not figure it out.
Instead the idea is to use isolation only where we know we need it: when
holding a mutex lock and then doing some multithreaded operation within that
locked region. Three places where we do this now:
* Generated images
* Cached BVH tree building
* OpenVDB lazy grid loading
Compared to the more automatic approach previously used, there is the downside
that it is easy to miss places where we need isolation. Yet doing it more
automatically is also causing unexpected issue and bugs that we found no
solution for, so this seems better.
Patch implemented by Sergey and me.
Differential Revision: https://developer.blender.org/D11603
Under some circumstances using task isolation can cause deadlocks.
Previously, our task pool implementation would run all tasks in an
isolated region. Now using task isolation is optional and can be
turned on/off for individual task pools.
Task pools that spawn new tasks recursively should never enable
task isolation. There is a new check that finds these cases at runtime.
Right now this check is disabled, so that this commit is a pure refactor.
It will be enabled in an upcoming commit.
This fixes T88598.
Differential Revision: https://developer.blender.org/D11415
This patch adds the base code needed to make the full-frame system work for both current tiled/per-pixel implementation of operations and full-frame.
Two execution models:
- Tiled: Current implementation. Renders execution groups in tiles from outputs to input. Not all operations are buffered. Runs the tiled/per-pixel implementation.
- FullFrame: All operations are buffered. Fully renders operations from inputs to outputs. Runs full-frame implementation of operations if available otherwise the current tiled/per-pixel. Creates output buffers on first read and free them as soon as all its readers have finished, reducing peak memory usage of complex/long trees. Operations are multi-threaded but do not run in parallel as Tiled (will be done in another patch).
This should allow us to convert operations to full-frame in small steps with the system already working and solve the problem of high memory usage.
FullFrame breaking changes respect Tiled system, mainly:
- Translate, Rotate, Scale, and Transform take effect immediately instead of next buffered operation.
- Any sampling is always done over inputs instead of last buffered operation.
Reviewed By: jbakker
Differential Revision: https://developer.blender.org/D11113
Initial report was mentioning the Classroom demo scene, but this is
probably because the scene was pre-configured to be used with OpenCL.
Would expect any OpenCL compositing to be failing prior to this fix.
The reason why crash was happening is due to OpenCL queue being
released from OpenCLDevice destructor. Is not that obvious, but
when Vector (including std::vector) is holding elements by value
a destructor will be called on "old" memory when vector capacitance
changes.
Solved by making forbidding copy semantic for compositor devices and
forcing move semantic to be used.
Also use emplace semantic in the devices vector initialization.
WorkScheduler task model deletes work packages after executing them. The other models don't do so. All models should handle packages the same way.
Reviewed By: #compositing, jbakker
Differential Revision: https://developer.blender.org/D11102
WorkPackages struct was created when scheduled. This patch keeps the
WorkPackages around and stores additional data with the workpackages.
The speedup is to small to notice, but it is needed as preparation
to introduce a faster scheduling method.
- Use constexpr for better readability.
- Split in functions per backend.
- Split work scheduler global struct in smaller structs.
- Replaced std::vector with blender::Vector.
- Removed threading defines in COM_defines.h