Commit Graph

17 Commits

Author SHA1 Message Date
Germano Cavalcante
1775c39986 Fix 'BLI_task_parallel_mempool' keeping 'user_chunk' unchanged
When `BLI_task_parallel_mempool` does not use threading, the
`userdata_chunk` is allocated locally simulating a TLS.

However `func_reduce` is not called so the original chunk is ignored.

`task_parallel_iterator_no_threads` is another function that doesn't call
`func_reduce`. It also ignores `userdata_chunk_local` in the main iterator.

The solution in these cases is not to create a `userdata_chunk_local`.

This fixes T90131

Differential Revision: https://developer.blender.org/D12067
2021-07-29 11:37:10 -03:00
f61f4c89bb Cleanup: reduce variable scope in task_iterator.c
Would have prevented the error in
15cdcb4e90.
2021-07-16 14:39:57 +10:00
e26887598f Fix error using uninitialized state in BLI_task_parallel_mempool
Single threaded operation used the state before it had variables
written into it.

Error in 15cdcb4e90.
2021-07-16 12:07:03 +10:00
15cdcb4e90 BLI_task: add a callback to initialize TLS
Useful when TLS requires it's own allocated structures.
2021-07-15 14:45:46 +10:00
9b89de2571 Cleanup: consistent use of tags: NOTE/TODO/FIXME/XXX
Also use doxy style function reference `#` prefix chars when
referencing identifiers.
2021-07-04 00:43:40 +10:00
fcc844f8fb BLI: use explicit task isolation, no longer part of parallel operations
After looking into task isolation issues with Sergey, we couldn't find the
reason behind the deadlocks that we are getting in T87938 and a Sprite Fright
file involving motion blur renders.

There is no apparent place where we adding or waiting on tasks in a task group
from different isolation regions, which is what is known to cause problems. Yet
it still hangs. Either we do not understand some limitation of TBB isolation,
or there is a bug in TBB, but we could not figure it out.

Instead the idea is to use isolation only where we know we need it: when
holding a mutex lock and then doing some multithreaded operation within that
locked region. Three places where we do this now:
* Generated images
* Cached BVH tree building
* OpenVDB lazy grid loading

Compared to the more automatic approach previously used, there is the downside
that it is easy to miss places where we need isolation. Yet doing it more
automatically is also causing unexpected issue and bugs that we found no
solution for, so this seems better.

Patch implemented by Sergey and me.

Differential Revision: https://developer.blender.org/D11603
2021-06-15 17:28:44 +02:00
bcefce33f2 BLI_mempool: split thread-safe iteration into the private API
Splitting out thread safe iteration logic means regular iteration
isn't checking for the thread-safe pointer each step.

This gives a small but measurable overall performance gain of 2-3%
when redrawing a high-poly mesh.

Ref D11564

Reviewed By: mont29
2021-06-11 00:31:16 +10:00
965bd53e02 Cleanup: use doxy sections for task_iterator.c 2021-06-10 02:22:46 +10:00
14f3b2cdad BLI_task: add TLS support to BLI_task_parallel_mempool
Support thread local storage for BLI_task_parallel_mempool,
as well as support for the reduce and free callbacks.

mempool_iter_threadsafe_* functions have been moved into a private
header thats only shared between task_iterator.c and BLI_mempool.c
so the TLS can be made part of the iterator array without having to
rely on passing in struct offsets.

Add test task.MempoolIterTLS that ensures reduce and free
are working as expected.

Reviewed By: mont29

Ref D11548
2021-06-10 00:55:04 +10:00
dd98f6b55c Fix missing free calls for task iterator
A single threaded task with thread data over 8192 bytes would leak.
While this didn't happen in practice, it could cause issues in the
future.

The free call for `task_parallel_iterator_do` wasn't running
if callbacks weren't set, also not an issue in practice but avoids
potential problems in the future too.
2021-06-09 20:33:40 +10:00
ed1fc9d96b BLI: support disabling task isolation in task pool
Under some circumstances using task isolation can cause deadlocks.
Previously, our task pool implementation would run all tasks in an
isolated region. Now using task isolation is optional and can be
turned on/off for individual task pools.

Task pools that spawn new tasks recursively should never enable
task isolation. There is a new check that finds these cases at runtime.
Right now this check is disabled, so that this commit is a pure refactor.
It will be enabled in an upcoming commit.

This fixes T88598.

Differential Revision: https://developer.blender.org/D11415
2021-06-08 10:39:33 +02:00
14b2a35c8b Cleanup: use parenthesis for if statements in macros 2020-09-19 16:28:17 +10:00
d8a3f3595a Task: Use TBB as Task Scheduler
This patch enables TBB as the default task scheduler. TBB stands for Threading Building Blocks and is developed by Intel. The library contains several threading patters. This patch maps blenders BLI_task_* function to their counterpart. After this patch we can add more patterns. A promising one is TBB:graph that can be used for depsgraph, draw manager and compositor.

Performance changes depends on the actual hardware. It was tested on different hardwares from laptops to workstations and we didn't detected any downgrade of the performance.
* Linux Xeon E5-2699 v4 got FPS boost from 12 to 17 using Spring's 04_010_A.anim.blend.
* AMD Ryzen Threadripper 2990WX 32-Core Animation playback goes from 9.5-10.5 FPS to 13.0-14.0 FPS on Agent 327 , 10_03_B.anim.blend.

Reviewed By: brecht, sergey

Differential Revision: https://developer.blender.org/D7475
2020-04-30 08:09:21 +02:00
3b47f335c6 BLI: remove TaskParallelRangePool
This is not currently used and will take some work to support with TBB, so
remove it until we have a new implementation based on TBB.

Fixes T76005, parallel range pool tests failing.

Ref D7475
2020-04-23 15:39:34 +02:00
2d6ad88466 CleanUp: Renamed BLI_task_pool_userdata to BLI_task_pool_user_data
In preparation for {D7475}
2020-04-21 15:37:36 +02:00
d923fb784f Task: Separate Finalize into Reduce And Free
In preparation of TBB we need to split the finalize function into reduce
and free. Reduce is used to combine results and free for freeing any
allocated memory.

The reduce function is called to join user data chunk into another, to reduce the
result to the original userdata_chunk memory. These functions should have no side
effects so that they can be run on any thread.
The free functions should free data created during execution (TaskParallelRangeFunc).

Original patch by Brecht van Lommel
{rB61f49db843cf5095203112226ae386f301be1e1a}.

Reviewed By: Brecht van Lommel, Bastien Montagne

Differential Revision: https://developer.blender.org/D7394
2020-04-17 16:06:54 +02:00
78f56d5582 TaskScheduler: Minor Preparations for TBB
Tasks: move priority from task to task pool {rBf7c18df4f599fe39ffc914e645e504fcdbee8636}
Tasks: split task.c into task_pool.cc and task_iterator.c {rB4ada1d267749931ca934a74b14a82479bcaa92e0}

Differential Revision: https://developer.blender.org/D7385
2020-04-09 19:18:14 +02:00