Commit Graph

3486 Commits

Author SHA1 Message Date
37fc4b575f Fix FPE exception happening when converting linear<->srgb using SIMD 2016-06-08 16:00:34 +02:00
6ca6d3c4fd Cleanup: typo 2016-06-08 22:31:35 +10:00
e02679f71e Cleanup: typos 2016-06-08 22:25:23 +10:00
0a029e3dd1 BLI_array_store: move helper functions into their own API 2016-06-08 19:12:23 +10:00
64663b1f73 Cleanup: warnings in previous commit 2016-06-02 19:48:45 +10:00
cc7b817099 Minor edits to last commit
Failed with chunk merging disabled
2016-06-02 18:42:09 +10:00
0ce98b1ffb BLI_array_store: Move writing many chunks into a function
Minor optimization, avoid some checks each iteration.
2016-06-02 18:13:13 +10:00
7980c7c10f BLI_array_store: store max size in BArrayInfo 2016-06-02 18:05:11 +10:00
0bd8d6d194 Add extra validation checks to array-store 2016-06-02 16:41:41 +10:00
57d5ddc251 Revert "BLI_ghash: Fix initial over-allocation of mempool chunks."
Useless change in fact, sorry for the noise.

This reverts commit b08473680e.
2016-06-01 17:38:50 +02:00
2c5dc66d5e Optimize mempool iteration
Around ~10% improvement in own tests.
2016-06-02 00:07:18 +10:00
8cf8679b53 Revert "Correct invalid pointer-pair compare check"
This reverts commit d5e0e681ce.

Tsk, these functions return false on a match.
2016-06-01 23:08:40 +10:00
b08473680e BLI_ghash: Fix initial over-allocation of mempool chunks.
Code intended to create only one pool by default here, but code in `mempool_maxchunks()` would make it two.
2016-06-01 12:58:59 +02:00
aedeca7d1c BLI_mempool: Use an 'odd' FREEWORD for big/little endian
This also changes freeword to an intptr_t to ensure
not only the first 4 bits of a pointer are tested on 64bit systems.
2016-06-01 02:54:47 +10:00
3df30c1a6e Cleanup: parenthesize defines 2016-06-01 00:19:01 +10:00
665cb1b291 Change the hash-table to be 3x total items to hash 2016-05-30 18:00:03 +10:00
bd6a64ced7 Remove accidental static var 2016-05-30 17:27:06 +10:00
53b60eed45 Add BLI_array_store copy-on-write API
This supported in-memory de-duplication,
useful to avoid in-efficient memory use when storing multiple, similar arrays.
2016-05-30 16:18:24 +10:00
d5e0e681ce Correct invalid pointer-pair compare check 2016-05-26 22:20:12 +10:00
a17cba339c BLI_math: Add function to calculate circular cubic curve tangents 2016-05-23 21:35:54 +10:00
22ff9c5568 Fix T48497: Stupid typo in recent own BLI_task forloop work that broke non-parallelized case. 2016-05-22 18:35:44 +02:00
2630207ada Fix GCC/Linux build error after finite/isfinite changes. 2016-05-17 23:40:25 +02:00
21fddf7d1c C99/C++11: replace deprecated finite() by isfinite(). 2016-05-17 21:39:16 +02:00
688858d3a8 BLI_task: Add new 'BLI_task_parallel_range_finalize()'.
Together with the extended loop callback and userdata_chunk, this allows to perform
cumulative tasks (like aggregation) in a lockfree way using local userdata_chunk to store temp data,
and once all workers have finished, to merge those userdata_chunks in the finalize callback
(from calling thread, so no need to lock here either).

Note that this changes how userdata_chunk is handled (now fully from 'main' thread,
which means a given worker thread will always get the same userdata_chunk, without
being re-initialized anymore to init value at start of each iter chunk).
2016-05-16 17:15:18 +02:00
5a7429c363 BLI_task: Add back lost 'push_from_thread' change to BLI_task_parallel_range() & co. 2016-05-16 17:00:15 +02:00
575d7a9666 BLI_task: make foreach loop index hleper lockfree, take II.
New code is actually much, much better than first version, using 'fetch_and_add' atomic op
here allows us to get rid of the loop etc.

The broken CAS issue remains on windows, to be investigated...
2016-05-16 15:57:19 +02:00
bb7da630ba Fix T48422: Revert "BLI_task: nano-optimizations to BLI_task_parallel_range feature."
There are some serious issues under windows, causing deadlocks somehow (not reproducible under linux so far).

Until further investigation over why this happens, better to revert to previous
spin-locked behavior.

This reverts commits a83bc4f597 and 98123ae916.
2016-05-15 21:14:40 +02:00
a83bc4f597 Fix an error in new lockfree parallel_range_next_iter_get() helper.
Reading the shared state->iter value after storing it in the 'reference' var could in theory
lead to a race condition setting state->iter value above state->stop, which would be 'deadly'.

This **may** be the cause of T48422, though I was not able to reproduce that issue so far.
2016-05-14 18:06:05 +02:00
868cfc5a4a BLI_task: add support for listbase parallelized for loops.
Code by @sergey, with small edits and doc by @mont29.
2016-05-13 12:06:15 +02:00
ba6519f0a7 BLI_math: add 'equals_m4m4' (and 'm3' variant) helpers. 2016-05-13 12:06:15 +02:00
5d93836a61 Cleanup: only use r_ prefix for return args 2016-05-12 04:36:16 +10:00
35531657e5 BLI_kdopbvh: Use distance for BLI_bvhtree_ray_cast_all
Pass distance argument so its possible to limit the range we get all hits from.

Other changes:

- Use boundbox test before calling callback, avoids redundant calls.
- Remove meaningless return value.
- Add doc string, explaining purpose of this function.
2016-05-11 15:01:27 +10:00
98123ae916 BLI_task: nano-optimizations to BLI_task_parallel_range feature.
This commit makes use of new taskpool feature (instead of allocating own tasks),
and removes the spinlock used to generate chunks (using atomic ops instead).

In best cases (dynamic scheduled loop with light processing func callback), we
get a few percents of speedup, in most cases there is no sensible enhancement.
2016-05-10 17:57:53 +02:00
335274192e Revert "Task scheduler: Avoid mutex lock in number manipulation functions"
Appears mutex was guarateeing number of tasks is not modified at moments
when it's not expected. Removing those mutexes resulted in some hard-to-catch
locks where worker thread were waiting for work by all the tasks were already
done.

This reverts commit a1d8fe052c.
2016-05-10 15:43:03 +02:00
a1d8fe052c Task scheduler: Avoid mutex lock in number manipulation functions
It seems using atomic operations here we can avoid having mute without
breaking anything.

Thanks Bastien for double-checking the changes!
2016-05-10 14:59:19 +02:00
fcc2175710 Fix own mistake in rBd617de965ea20e5d5 from late December 2015.
Brain melt here, intention was to reduce number of tasks in case we have not much chunks of data to loop over,
not to increase it!

Note that this only affected dynamic scheduling.
2016-05-10 13:10:21 +02:00
7efa34d078 Task scheduler: Add thread-aware task push routines
This commit implements new function BLI_task_pool_push_from_thread()
who's main goal is to have less parasitic load on the CPU bu avoiding
memory allocations as much as possible, making taks pushing cheaper.

This function expects thread ID, which must be 0 for the thread from
which pool is created from (and from which wait_work() is called) and
for other threads it mush be the ID which was sent to the thread working
function.

This reduces allocations quite a bit in the new dependency graph,
hopefully gaining some visible speedup on a fewzillion core machines
(on my own machine can only see benefit in profiler, which shows
significant reduce of time wasted in the memory allocation).
2016-05-10 10:01:24 +02:00
8b13555b24 Docs: comment polyfill2d functions 2016-05-09 23:47:57 +10:00
9ac35be63a Task scheduler: Don't calloc in performance-critical areas
Majority of the fields are being overwritten anyway, so calloc it
kinda waste of CPU ticks.
2016-05-09 14:54:24 +02:00
6d402610c1 Correct error in wrapped array-span-iteration 2016-05-07 23:48:53 +10:00
bc1a7d9283 Cleanup: warnings
Values set but not used
2016-05-06 16:49:25 +10:00
d12378da11 Cleanup: style 2016-05-06 06:34:25 +10:00
c5a26bef5d Cleanup: rename getepsilon -> get_epsilon 2016-05-06 06:14:36 +10:00
cc650c3d07 Add asserts to check bvhutils args are correct
Would have prevented previous error going unnoticed.
2016-05-06 06:14:36 +10:00
0b5a0d8412 Transform/Snap: EditMesh/BKE_bvhutils API improvements
Separate the creation of trees from EditMesh from the creation of trees from DerivedMesh.
This was meant to simplify the API, but didn't work out so well.

`bvhtree_from_mesh_*` actually is working as `bvhtree_from_derivedmesh_*`.
This is inconsistent with the trees created from EditMesh. Since for create them does not use the DerivedMesh.

In such cases the dm is being used only to cache the tree in the struct DerivedMesh. What is immediately released once
bvhtree is being used in functions that change(tag) the DM cleaning the cache.

- Use a filter function so users of SnapObjectContext can define how edit-mesh elements are handled.
- Remove em_evil.
- bvhtree of EditMesh is now really cached in the snap functions.
- Code becomes organized and easier to maintain.

This is an important patch for future improvements in snapping functions.
2016-05-06 05:01:51 +10:00
88b72925d0 Optimize linear<->sRGB conversion for SSE2 processors
Using SSE2 intrinsics when available for this kind of conversions.

It's not totally accurate, but accurate enough for the purposes where
we're using direct colorspace conversion by-passing OCIO.

Partially based on code from Cycles, partially based on other online
articles:

  https://stackoverflow.com/questions/6475373/optimizations-for-pow-with-const-non-integer-exponent

Makes projection painting on hi-res float textures smoother.

This commit also enables global SSE2 in Blender. It shouldn't
bring any regressions in supported hardware (we require SSE2 since
2.64 now), but should keep an eye on because compilers might have
some bugs with that (unlikely, but possible).
2016-05-05 19:46:06 +02:00
bb6fbc64ae Docs: scanfill.c purpose 2016-05-06 00:45:38 +10:00
3dcc05c591 Cleanup: no need to cast for pointer comparison 2016-05-03 18:20:33 +10:00
48d3a8b54b Math Lib: inline project_plane_v3_v3v3 2016-05-03 13:48:00 +10:00
915e9eeff1 BLI_array_utils: helper for stepping over contiguous ranges 2016-05-02 18:49:22 +10:00