Commit Graph

136 Commits

Author SHA1 Message Date
Stefan Werner
9d282d7a8d macOS: Replaced OSSpinLock with os_unfair_lock.
OSSplinLock is a deprecated API, os_unfair_lock is its successor.
This reduces the number of warnings when building on macOS.
2019-09-14 20:23:29 +02:00
6529d20d79 Cleanup: spelling in comments 2019-06-12 09:43:49 +10:00
30d9366d17 Fix (unreported) Broken BLI_threadapi_exit().
Function would not clear the static scheduler pointer, which lead to
crash (mem use after free) when trying to re-init and use the task API
again. Should not happen in Blender itself, but could in other cases
(like some future gtests ;) ).
2019-06-04 23:51:03 +02:00
e12c08e8d1 ClangFormat: apply to source, most of intern
Apply clang format as proposed in T53211.

For details on usage and instructions for migrating branches
without conflicts, see:

https://wiki.blender.org/wiki/Tools/ClangFormat
2019-04-17 06:21:24 +02:00
b5d1e0ad1e Cleanup: spelling 2019-04-10 00:38:47 +10:00
9ba948a485 Cleanup: style, use braces for blenlib 2019-03-27 13:17:30 +11:00
dd61787b25 Cleanup: unused function warning 2019-03-06 16:21:24 +11:00
76608f5ec5 Fix T60585: threadripper CPU only using 16 threads for e.g. sculpting.
This reverts the changes from ce927e1 to put the main and job threads on
node 0. The problem is that all threads created as children from these
threads will inherit the NUMA node and so will end up on the same node.
This can be fixed case-by-case by assigning the NUMA node for every child
thread, however this is difficult for external libraries and OpenMP, and
out of our control for plugins like external renderers.
2019-03-05 12:46:05 +01:00
de13d0a80c doxygen: add newline after \file
While \file doesn't need an argument, it can't have another doxy
command after it.
2019-02-18 08:22:12 +11:00
ffd0fee97c Cleanup: comment indentation & spelling 2019-02-11 10:51:25 +11:00
eef4077f18 Cleanup: remove redundant doxygen \file argument
Move \ingroup onto same line to be more compact and
make it clear the file is in the group.
2019-02-06 15:45:22 +11:00
65ec7ec524 Cleanup: remove redundant, invalid info from headers
BF-admins agree to remove header information that isn't useful,
to reduce noise.

- BEGIN/END license blocks

  Developers should add non license comments as separate comment blocks.
  No need for separator text.

- Contributors

  This is often invalid, outdated or misleading
  especially when splitting files.

  It's more useful to git-blame to find out who has developed the code.

See P901 for script to perform these edits.
2019-02-02 01:36:28 +11:00
67d77f47f0 Fix leak in CPU brand check 2018-11-29 08:22:15 +11:00
ce927e15e0 Tweaks for threading schedule for Threadripper2 and EPYC
The idea is to make main thread and job threads to be scheduled
on CPU dies which has direct access to memory (those are NUMA
nodes 0 and 2).

We also do this for new EPYC CPUs since their NUMA nodes 1 and 3
do have access but only to a higher range DDR slots. By preferring
nodes 0 and 2 on EPYC we make it so users with partially filled
DDR slots has fast memory access.

One thing which is not really solved yet is localization of
memory allocation: we do not guarantee that memory is allocated
on the closest to the NUMA node DDR slot and hope that memory
manager of OS is acting in favor of us.
2018-11-28 14:41:22 +01:00
af36dd4664 Cleanup: trailing newlines 2018-06-29 08:02:49 +02:00
5513da65b2 Cleanup: trailing space for BLI 2018-06-17 16:32:54 +02:00
75fc1c3507 Cleanup: trailing whitespace (comment blocks)
Strip unindented comment blocks - mainly headers to avoid conflicts.
2018-06-01 18:19:39 +02:00
ef502854fe Threads: add spinlock hit for hyperthreading processors on Windows.
Suggested by Percy Ross Tiglao.
2018-05-26 22:35:30 +02:00
2aef87bfae Cleanup: rename BLI_thread.h API
- Use BLI_threadpool_ prefix for (deprecated)
  thread/listbase API.
- Use BLI_thread as prefix for other functions.

See P614 to apply instead of manually resolving conflicts.
2018-02-16 01:13:46 +11:00
ccdacf1c9b Cleanup: use '_len' instead of '_size' w/ BLI API
- When returning the number of items in a collection use BLI_*_len()
- Keep _size() for size in bytes.
- Keep _count() for data structures that don't store length
  (hint this isn't a simple getter).

See P611 to apply instead of manually resolving conflicts.
2018-02-15 23:39:08 +11:00
96415cb52a Code cleanup: fix harmless compiler warning. 2017-11-20 23:32:06 +01:00
55696b56d9 Fix T53068: AMD Threadripper not working well with Blender
The issue was caused by SpinLock implementation in old pthreads we ar eusing on
Windows. Using newer one (2.10-rc) demonstrates same exact behavior. But likely
using own atomics and memory barrier based implementation solves the issue.

A bit annoying that we need to change such a core part of Blender just to make
specific CPU happy, but it's better to have artists happy on all computers.

There is no expected downsides of this change, but it is so called "works for
me" category. Let's see how it all goes.
2017-11-14 12:21:15 +01:00
e61ba39e7c Ensure task scheduler exists before any threading starts in Blender 2017-04-26 16:00:02 +02:00
a83a68b9b6 Threads: Use atomics instead of spin when entering threaded malloc 2017-03-02 12:42:34 +01:00
d70ffd375f BLI_threads: add an helper to wait on a condition using a global mutex.
Also, factorized internal code to get global mutex from its ID.
2015-08-10 15:03:31 +02:00
aeda4dca77 Threads: Cache result of syscall when querying number of system threads
Number of system threads is quite difficult to change without need of blender
restart, so we can cache result of the systcalls (which are not really cheap)
in order to be able to call BLI_system_thread_count() without worrying of
performance issues in that function.

Reviewers: campbellbarton

Differential Revision: https://developer.blender.org/D1342
2015-06-20 22:10:30 +02:00
0f171d4a25 BLI_threads Queue: add BLI_thread_queue_is_empty().
Avoids counting the whole queue when we only want to check whether it is empty or not!
2015-06-19 12:31:26 +02:00
fd8b6021c4 Fix race condition
Exposed when checking on T44871
2015-06-03 11:01:44 +10:00
b88597c218 Make switching to threaded malloc safe to be called from threads
For a long time this function was only intended to be used from the main thread,
but since out implementation of parallel range (which is currently only used by
mesh deform modifier) we might want to switch to threaded alloc from object
update thread.

Now we're using spinlock around the check, which makes the code safe to be used
from all over the place.

We might consider using a bit of atomics operations magic there, but it's not so
much important for now, this code is not used in the performance critical code
path.
2015-05-18 16:40:51 +05:00
Dalai Felinto
d5f1b9c222 Multi-View and Stereo 3D
Official Documentation:
http://www.blender.org/manual/render/workflows/multiview.html

Implemented Features
====================
Builtin Stereo Camera
* Convergence Mode
* Interocular Distance
* Convergence Distance
* Pivot Mode

Viewport
* Cameras
* Plane
* Volume

Compositor
* View Switch Node
* Image Node Multi-View OpenEXR support

Sequencer
* Image/Movie Strips 'Use Multiview'

UV/Image Editor
* Option to see Multi-View images in Stereo-3D or its individual images
* Save/Open Multi-View (OpenEXR, Stereo3D, individual views) images

I/O
* Save/Open Multi-View (OpenEXR, Stereo3D, individual views) images

Scene Render Views
* Ability to have an arbitrary number of views in the scene

Missing Bits
============
First rule of Multi-View bug report: If something is not working as it should *when Views is off* this is a severe bug, do mention this in the report.

Second rule is, if something works *when Views is off* but doesn't (or crashes) when *Views is on*, this is a important bug. Do mention this in the report.

Everything else is likely small todos, and may wait until we are sure none of the above is happening.

Apart from that there are those known issues:
* Compositor Image Node poorly working for Multi-View OpenEXR
(this was working prefectly before the 'Use Multi-View' functionality)
* Selecting camera from Multi-View when looking from camera is problematic
* Animation Playback (ctrl+F11) doesn't support stereo formats
* Wrong filepath when trying to play back animated scene
* Viewport Rendering doesn't support Multi-View
* Overscan Rendering
* Fullscreen display modes need to warn the user
* Object copy should be aware of views suffix

Acknowledgments
===============
* Francesco Siddi for the help with the original feature specs and design
* Brecht Van Lommel for the original review of the code and design early on
* Blender Foundation for the Development Fund to support the project wrap up

Final patch reviewers:
* Antony Riakiotakis (psy-fi)
* Campbell Barton (ideasman42)
* Julian Eisel (Severin)
* Sergey Sharybin (nazgul)
* Thomas Dinged (dingto)

Code contributors of the original branch in github:
* Alexey Akishin
* Gabriel Caraballo
2015-04-06 10:40:12 -03:00
835765926f Final overlooked cleanup for last commit 2014-04-27 18:44:23 +02:00
f3798fa45e Revert the testing sculpt openmp thread control and limit for OSX to physical threads as in 2.70a tag 2014-04-27 18:39:03 +02:00
1b9db9911d Code cleanup: use bools
also rename BLI_omp_thread_count -> BLI_system_thread_count_omp
2014-04-17 16:04:28 +10:00
62dc18c717 Fix an unused function warning without openmp present, some typos 2014-04-01 11:23:28 +02:00
277fb1a31f Sculpt/dyntopo: Make the omp threads configurable to overcome performance issues
- autodetect optimal default, which typically avoids HT threads
- can store setting in .blend per scene
- this does not touch general omp max threads, due i found other areas where the calculations are fitting for huge corecount
- Intel notes, some of the older generation processors with HyperThreading would not provide significant performance boost for FPU intensive applications. On those systems you might want to set OMP_NUM_THREADS = total number of cores (not total number of hardware theads).
2014-03-31 13:51:49 +02:00
f01d19431d Fix T38873: Crashing on undo of ocean modifier.
Issue of this bug is that most part of fftw is not thread safe, only compute-intensive fftw_execute & co are.

Since smoke was affected by this issue as well, a global fftw mutex was added to BLI_threads.
Audaspace also uses fftw in one of its readers (AUD_BandPassReader.cpp),
but this is not an issue currently since this code is disabled in CMake/scons files.

There was another threading issue with smoke, we need to copy dm used by emit_from_derivedmesh(),
as it is modified by this func.

Reviewers: sergey, brecht

Reviewed By: brecht

CC: brecht

Differential Revision: https://developer.blender.org/D374
2014-03-01 21:05:50 +01:00
a84bcea070 OSX/scons: allow for compiling with clang-openmp-3.4
See: http://clang-omp.github.io
+ fix a longstanding bad include in darwin-config
2014-02-09 18:03:13 +01:00
b3afbcab8f ListBase API: add utility api funcs for clearing and checking empty 2014-02-08 06:24:05 +11:00
f9d5bccb06 code cleanup: spelling 2013-10-31 23:52:44 +00:00
48c1e0c0fc spelling: use American spelling for canceled 2013-10-26 01:06:19 +00:00
f0dcff9aa9 Task scheduler ported form CYcles to C
Replaces ThreadedWorker and is gonna to be used
for threaded object update in the future and
some more upcoming changes.

But in general, it's to be used for any task
based subsystem in Blender.

Originally written by Brecht, with some fixes
and tweaks by self.
2013-10-12 14:08:59 +00:00
c0f8e15295 Speedup for guarded allocator
- Re-arrange locks, so no actual memory allocation
  (which is relatively slow) happens from inside
  the lock. operation system will take care of locks
  which might be needed there on it's own.

- Use spin lock instead of mutex, since it's just
  list operations happens from inside lock, no need
  in mutex here.

- Use atomic operations for memory in use and total
  used blocks counters.

This makes guarded allocator almost the same speed
as non-guarded one in files from Tube project.

There're still MemHead/MemTail overhead which might
be bad for CPU cache utilization
2013-08-19 10:51:40 +00:00
3ce280e825 Fix #35960, #36044: blender internal viewport rendering crash while editing data.
Now the viewport rendering thread will lock the main thread while it is exporting
objects to render data. This is not ideal if you have big scenes that might block
the UI, but Cycles does the same, and it's fairly quick because the same evaluated
mesh can be used as for viewport drawing. It's the only way to get things stable
until the thread safe dependency graph is here.

This adds a mechanism to the job system for jobs to lock the main thread, using a
new 'ticket mutex lock' which is a mutex lock that gives priority to the first
thread that tries to lock the mutex.

Still to solve: undo/redo crashes.
2013-07-08 17:56:51 +00:00
7759b2743a code cleanup: replace PARALLEL define with _OPENMP 2013-05-20 16:15:16 +00:00
a07dcd67eb Fix #35240: command line -t number of threads option did not work for cycles.
Now it works for blender internal, cycles and other multithreading code in
Blender in both background and UI mode.
2013-05-08 13:23:17 +00:00
dbeec2be86 Fix #34783: smoke simulation crash when changing frame while preview rendering.
Added a mutex lock for smoke data access. The render was already working with a
copy of the volume data, so it's just a short lock to copy things and should not
block the UI much.
2013-04-24 17:31:09 +00:00
5d234f0a0b code cleanup: includes 2013-04-01 11:27:47 +00:00
4f3ca854e1 Fix various warnings with clang build, and adjust cmake clang warnings flags
to include a few more that gcc is using too.
2013-02-26 21:58:06 +00:00
5e739ddae2 Added some code which helps troubleshooting issues caused by
non-threadsafe usage of guarded allocator.

Also added small chunk of code to check consistency of begin/end
threaded malloc.

All this additional checks are commented and wouldn't affect on
builds, however found them helpful to troubleshoot issues so
decided to commit it to SVN.
2013-01-24 08:14:05 +00:00
5c6f6301b0 Image thread safe improvements
This commit makes BKE_image_acquire_ibuf referencing result, which means once
some area requested for image buffer, it'll be guaranteed this buffer wouldn't
be freed by image signal.

To de-reference buffer BKE_image_release_ibuf should now always be used.

To make referencing working correct we can not rely on result of
image_get_ibuf_threadsafe called outside from thread lock. This is so because
we need to guarantee getting image buffer from list of loaded buffers and it's
referencing happens atomic. Without lock here it is possible that between call
of image_get_ibuf_threadsafe and referencing the buffer IMA_SIGNAL_FREE would
be called. Image signal handling too is blocking now to prevent such a
situation.

Threads are locking by spinlock, which are faster than mutexes. There were some
slowdown reports in the past about render slowdown when using OSX on Xeon CPU.
It shouldn't happen with spin locks, but more tests on different hardware would
be really welcome. So far can not see speed regressions on own computers.

This commit also removes BKE_image_get_ibuf, because it was not so intuitive
when get_ibuf and acquire_ibuf should be used.

Thanks to Ton and Brecht for discussion/review :)
2012-11-15 15:59:58 +00:00