Commit Graph

96 Commits

Author SHA1 Message Date
9ba948a485 Cleanup: style, use braces for blenlib 2019-03-27 13:17:30 +11:00
eb8e656b2b Cleanup: spelling 2019-03-08 17:48:49 +11:00
de13d0a80c doxygen: add newline after \file
While \file doesn't need an argument, it can't have another doxy
command after it.
2019-02-18 08:22:12 +11:00
eef4077f18 Cleanup: remove redundant doxygen \file argument
Move \ingroup onto same line to be more compact and
make it clear the file is in the group.
2019-02-06 15:45:22 +11:00
65ec7ec524 Cleanup: remove redundant, invalid info from headers
BF-admins agree to remove header information that isn't useful,
to reduce noise.

- BEGIN/END license blocks

  Developers should add non license comments as separate comment blocks.
  No need for separator text.

- Contributors

  This is often invalid, outdated or misleading
  especially when splitting files.

  It's more useful to git-blame to find out who has developed the code.

See P901 for script to perform these edits.
2019-02-02 01:36:28 +11:00
4226ee0b71 Cleanup: comment line length (blenlib)
Prevents clang-format wrapping text before comments.
2019-01-15 23:30:31 +11:00
49490e5cfb Merge branch 'master' into blender2.8 2018-12-12 13:02:09 +11:00
e757c4a3be Cleanup: use colon separator after parameter
Helps separate variable names from descriptive text.
Was already used in some parts of the code,
double space and dashes were used elsewhere.
2018-12-12 12:50:58 +11:00
01581d4a1e BLI_task: fix queue in work_and_wait, and support resetting.
To make the pool more usable for running multiple stages of tasks,
fix local queue handling in BLI_task_pool_work_and_wait.

Specifically, after the wait loop the local queue should be empty,
or the wait part of the function contract isn't fulfilled. Instead,
check and run any tasks in queue before the wait loop.

Also, add a new function that resets the suspended state of the pool.
2018-12-04 14:08:50 +03:00
df2635099b Merge branch 'master' into blender2.8 2018-12-04 11:45:22 +01:00
3f31ec8398 Cleanup: Spelling 2018-12-04 11:43:53 +01:00
5c632ced53 Merge branch 'master' into blender2.8 2018-11-20 15:02:13 +01:00
01e8e7dc6d Task scheduler: Optimize parallel loop over lists
The goal is to address performance regression when going from
few threads to 10s of threads. On a systems with more than 32
CPU threads the benefit of threaded loop was actually harmful.

There are following tweaks now:

- The chunk size is adaptive for the number of threads, which
  minimizes scheduling overhead.

- The number of tasks is adaptive to the list size and chunk
  size.

Here comes performance comparison on the production shot:

 Number of threads        DEG time before        DEG time after
       44                     0.09                   0.02
       32                     0.055                  0.025
       16                     0.025                  0.025
       8                      0.035                  0.033
2018-11-20 14:58:17 +01:00
3cf724209f Cleanup, spelling 2018-11-08 15:00:19 +01:00
0ddf3e110e Cleanup: comment blocks 2018-09-02 18:51:31 +10:00
ae57383648 Cleanup: comment blocks 2018-09-02 18:28:27 +10:00
bf8f5f5142 Cleanup: doxygen comments 2018-03-14 02:08:07 +11:00
2aef87bfae Cleanup: rename BLI_thread.h API
- Use BLI_threadpool_ prefix for (deprecated)
  thread/listbase API.
- Use BLI_thread as prefix for other functions.

See P614 to apply instead of manually resolving conflicts.
2018-02-16 01:13:46 +11:00
ccdacf1c9b Cleanup: use '_len' instead of '_size' w/ BLI API
- When returning the number of items in a collection use BLI_*_len()
- Keep _size() for size in bytes.
- Keep _count() for data structures that don't store length
  (hint this isn't a simple getter).

See P611 to apply instead of manually resolving conflicts.
2018-02-15 23:39:08 +11:00
c253fe5e87 Cleanup typo in comment. 2018-01-11 17:55:58 +01:00
518c65460e Task scheduler: Use more const qualifiers 2018-01-10 12:27:43 +01:00
5fe87a0a8c Task scheduler: Use single thread branch when range fits into single chunk 2018-01-09 18:10:47 +01:00
4a3b303bb0 Task scheduler: Fix wrong tasks calculation when chunk size is too big 2018-01-09 18:07:34 +01:00
932d448ae0 Task scheduler: Use const qualifiers in parallel range 2018-01-09 16:09:33 +01:00
8cffb0a141 Task scheduler: Avoid over-allocation of tasks for parallel ranges
This seems to only cause extra rthreading overhead on systems with 10s of
threads, without actually solving anything.
2018-01-09 16:09:33 +01:00
c4e42d70a4 Task scheduler: Add minimum number of iterations per thread in parallel range
The idea is to support following: allow doing parallel for on a small range,
each iteration of which takes lots of compute power, but limit such range to
a subset of threads.

For example, on a machine with 44 threads we can occupy 4 threads to handle
range of 64 elements, 16 elements per thread, where each block of 16 elements
is very complex to compute.

The idea should be to use this setting instead of global use_threading flag,
which is only based on size of array. Proper use of the new flag will improve
threadability.

This commit only contains internal task scheduler changes, this setting is not
used yet by any areas.
2018-01-09 16:09:33 +01:00
3144f0573a Task scheduler: Simplify parallel range function
Basically, split it up and avoid extra abstraction level.
2018-01-09 16:09:33 +01:00
4c4a7e84c6 Task scheduler: Use single parallel range function with more flexible function
Now all the fine-tuning is happening using parallel range settings structure,
which avoid passing long lists of arguments, allows extend fine-tuning further,
avoid having lots of various functions which basically does the same thing.
2018-01-09 16:09:33 +01:00
d2708b0f73 Task scheduler: Get rid of extended version of parallel range callback
Wrap all arguments into TLS type of argument. Avoids some branching and also
makes it easier to extend things in the future.
2018-01-09 16:09:33 +01:00
6efd58dd3e Task scheduler: Clarify why do we need an atomic add of 0 2017-12-22 16:37:25 +01:00
50f1c9a8af Task scheduler: Start with suspended pool to avoid threading overhead on push
The idea is to avoid any threading overhead when we start pushing tasks in a
loop. Similarly to how we do it from the new dependency graph. Gives couple of
percent of speedup here, but also improves scalability.
2017-12-22 12:25:11 +01:00
efb86b712d Add a new parallel looper for MemPool items to BLI_task.
It merely uses the new thread-safe iterators system of mempool, quite
straight forward.

Note that to avoid possible confusion with two void pointers as
parameters of the callback, a dummy opaque struct pointer is used
instead for the second parameter (pointer generated by iteration over
mempool), callback functions must explicitely convert it to expected
real type.

Also added a basic gtest for this new feature.
2017-11-23 21:14:43 +01:00
497e2b3dfa Cleanup: use signed atomic ops when needed. 2017-11-23 16:24:34 +01:00
00c4f49a6d Cleanup: indentation, long lines 2017-06-12 13:38:21 +10:00
a481908232 Task scheduler: Optimize subsequent pushing bunch of tasks
The idea is to accumulate all new tasks in a thread local queue
first without doing any thread synchronization (aka, locks and
conditional variables) and move those tasks to a scheduler queue
once they are all ready. This way we avoid per-task-pool lock
and only have one lock per bunch of tasks.

This is particularly handy when scheduling new dependency graph
node children. Brings FPS of cached simulation from the linked
below file from ~30 to ~50.

See documentation for BLI_task_pool_delayed_push_{begin, end}
and for TaskThreadLocalStorage::do_delayed_push.

Fixes T50027: Rigidbody playback and simulation performance regression with new depsgraph

Thanks Bastien for the review!
2017-05-31 15:44:08 +02:00
2ae6973936 Cleanup: Easier to read constant name 2017-05-31 14:52:45 +02:00
575d6415fb Task scheduler: Fix typo in TLS for pools created from non-main thread
Did a mistake which started to use same TLS for all threads for such pools.

Also added some extra asserts to help catching the bugs.
2017-04-13 13:34:07 +02:00
ed5c3121f5 Task scheduler: Prevent race condition for the pools created from non-main thread
We can not re-use anything for such pools, because we will know nothing about whether
the main thread is sleeping or not. So we identify such threads as 0, but we don't
use main thread's TLS.

This fixes dead-locks and crashes reported by Luca when doing playblasts.
2017-04-12 18:20:17 +02:00
ca5ccf5cd4 Task: Remove non-atomic pool suspended flag assignment
This was done some lines above by atomic fetch and and.
2017-04-04 12:32:15 +02:00
bcc8c04db4 Cleanup: code style & cmake 2017-03-12 02:47:53 +11:00
a095611eb8 Fix T50886: Blender crashes on render
Was a mistake in one of the previous TLS commits.

See comment in the pool_create to see some details why it was crashing.
2017-03-08 09:41:38 +01:00
9e566b06e3 Task scheduler: Add concept of suspended pools
Suspended pools allows to push huge amount of initial tasks
without any threading synchronization and hence overhead.

This gives ~50% speedup of cached rigid body with file from
T50027 and seems to have no negative affect in other scenes
here.
2017-03-07 17:32:01 +01:00
55c2cd85f0 Task scheduler: Initial implementation of local tasks queues
The idea is to allow some amount of tasks to be pushed from working
thread to it's local queue, so we can acquire some work without doing
whole mutex lock.

This should allow us to remove some hacks from depsgraph which was
added there to keep threads alive.
2017-03-07 17:32:01 +01:00
2f722f1a49 Task scheduler: Use real pthread's TLS to access active thread's data
This allows us to avoid TLS stored in pool which gives us advantage of
using pre-allocated tasks pool for the pools created from non-main thread.

Even on systems with slow pthread TLS it should not be a problem because
we access it once at a pool construction time. If we want to use this more
often (for example, to get rid of push_from_thread) we'll have to do much
more accurate benchmark.
2017-03-07 17:32:01 +01:00
a07ad02156 Task scheduler: Refactor the way we store thread-spedific data
Basically move all thread-specific data (currently it's only task
memory pool) from a dedicated array of taskScheduler to TaskThread.
This way we can add more thread-specific data in the future with
less of a hassle.
2017-03-07 17:32:01 +01:00
9522f8acf0 Task scheduler: Remove per-pool threads limit
This feature was adding extra complexity to task scheduling
which required yet extra variables to be worried about to be
modified in atomic manner, which resulted in following issues:

- More complex code to maintain, which increases risks of
  something going wrong when we modify the code.

- Extra barriers and/or locks during task scheduling, which
  causes extra threading overhead.

- Unable to use some other implementation (such as TBB) even for
  the comparison tests.

Notes about other changes.

There are two places where we really had to use that limit.

One of them is the single threaded dependency graph. This will
now construct a single-threaded scheduler at evaluation time.
This shouldn't be a problem because it only happens when using
debugging command line arguments and the code simply don't
run in regular Blender operation.

The code seems a bit duplicated here across old and new
depsgraph, but think it's OK since the old depsgraph is already
gone in 2.8 branch and i don't see where else we might want
to use such a single-threaded scheduler.

When/if we'll want to do so, we can move it to a centralized
single-threaded scheduler in threads.c.

OpenGL render was a bit more tricky to port, but basically we
are using conditional variables to wait background thread to
do all the job.
2017-03-07 17:32:01 +01:00
b498db06eb Task scheduler: Cleanup, use BLI_assert() instead of assert() 2017-03-06 11:33:27 +01:00
2e8398c095 Get rid of BLI_task_pool_stop().
Comments said that function was supposed to 'stop worker threads', but
it absolutely did not do anything like that, was merely wiping out TODO
queue of tasks from given pool (kind of subset of what
`BLI_task_pool_cancel()` does).

Misleading, and currently useless, we can always add it back if we need
it some day, but for now we try to simplify that area.
2017-03-03 17:16:39 +01:00
18c2a44333 Fix ugly mistake in BLI_task - freeing while some tasks are still being processed.
Freeing pool was calling `BLI_task_pool_stop()`, which only clears
pool's tasks that are in TODO queue, whithout ensuring no more tasks
from that pool are being processed in worker threads.

This could lead to use-after-free random (and seldom) crashes.

Now use instead `BLI_task_pool_cancel()`, which does waits for all tasks
being processed to finish, before returning.
2017-03-03 17:12:03 +01:00
17cf423f30 Cleanup: Indentation 2017-03-03 15:53:55 +01:00