Cycles: test code for SSE 4.1 kernel and alignment for some vector types.

This is mostly work towards enabling the __KERNEL_SSE__ option, so that vector math operations can start using SIMD instructions. The SSE 4.1 kernel performs about 8% faster with that option, but overall is still slower than without the option. WITH_CYCLES_OPTIMIZED_KERNEL_SSE41 is the CMake flag for testing this kernel. Aligning int3, int4, float3 and float4 to 16 bytes seems to give a slight 1-2% speedup on the tested systems even with the current kernel, so that alignment is enabled now.
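
As a rough illustration of the 16-byte alignment mentioned above, here is a minimal sketch using standard C++11 alignas. The type names float3_sketch and float4_sketch and the helper is_aligned16 are hypothetical stand-ins, not the Cycles definitions, and the actual commit may use compiler-specific alignment attributes instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not the Cycles types.
// alignas(16) makes each object start on a 16-byte boundary, which is
// what aligned SSE loads and stores require.
struct alignas(16) float4_sketch {
    float x, y, z, w;
};

// Three floats occupy 12 bytes; the 16-byte alignment pads the struct
// to 16 bytes so that arrays of it keep every element aligned.
struct alignas(16) float3_sketch {
    float x, y, z;
};

static_assert(alignof(float4_sketch) == 16, "float4_sketch must be 16-byte aligned");
static_assert(sizeof(float3_sketch) == 16, "float3_sketch is padded to 16 bytes");

// Returns true when the pointer lies on a 16-byte boundary.
inline bool is_aligned16(const void *p)
{
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

With types declared this way, a check such as is_aligned16(&v) holds for any local or static float4_sketch v. The trade-off is that a three-component type grows from 12 to 16 bytes.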
2013-11-22 14:16:47 +01:00
parent 5feb0d2bfe
commit e3a79258d1
11 changed files with 187 additions and 6 deletions
--- a/intern/cycles/util/util_system.h
+++ b/intern/cycles/util/util_system.h
@@ -26,6 +26,7 @@ string system_cpu_brand_string();
 int system_cpu_bits();
 bool system_cpu_support_sse2();
 bool system_cpu_support_sse3();
+bool system_cpu_support_sse41();

 CCL_NAMESPACE_END