This is useful for gcc which does not define sqrtf/powf/... functions with preprocessor and therefore always used sqrt/pow/...
Float functions are generally 20-50% faster than their equivalents for double type.
* MEM_CacheLimitier - Size type to int conversion, should be safe for now (doing my best Bill Gates 640k impression)
* OpenNL CMakeLists.txt - MSVC and GCC have slightly different ways to remove definitions (DEBUG) without the compiler complaining
* BLI_math inlines - The include guard name and inline option macro name should be different. Suppressed warning about not exporting any symbols from inline math library
* BLI string / utf8 - Fixed some inconsistencies between declarations and definitions
* nodes - node_composite_util is apparently not used unless you enable the legacy compositor, so it should not be compiled in that case.
Leaving out changes to BLI_fileops for now, need to do more testing.
Its unlikely you want to do short -> int, int -> float etc, conversion during swapping (if its needed we could have a non type checking macro).
Double that the optimized assembler outbut using SWAP() remains unchanged from before.
This exposed quite a few places where redundant type conversion was going on.
Also remove curve.c's swapdata() and replace its use with swap_v3_v3()
* define for missing hypotf on msvc.
* svd_m4 and pseudoinverse_m4_m4 functions.
* small tweak to perlin noise, use static function instead of macro.
* BLI_linklist_find and BLI_linklist_insert_after functions.
* MALWAYS_INLINE define to force inlining.
* inline some more functions, from math_base and math_vector
* also made some changes to the way inline is done so it can
work for more than one file
* reflect_v3_v3v3 requires input vectors to be normalized now.
* added rgb_to_grayscale
* added zero_v4, copy_v4_v4, swap_v4_v4, is_one_v3
* added box_clip_bounds_m4 to clip a bounding box against a
projection matrix