A single diagonal axis was used for sorting coordinates, so the algorithm relied on users not having axis-aligned vertices. Use BLI_kdtree to remove doubles instead. Overall speed varies, but it's more predictable than the previous method. Some typical tests gave a speedup of ~1.4x - 1.7x.
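
For reference, a rough sketch of the kind of diagonal-axis sort described above, to show why its cost depends on how the vertices project onto the diagonal. This is not the removed Blender code: the names `SortVert` and `remove_doubles_diagonal`, the flat xyz input layout and the sqrt(3) search bound are assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical types and names, for illustration only (not Blender's API). */
typedef struct SortVert {
  float co[3];
  float key; /* x + y + z: position along the (1, 1, 1) diagonal. */
  int index; /* Index of the vertex in the input array. */
} SortVert;

static int sortvert_cmp(const void *a, const void *b)
{
  const float ka = ((const SortVert *)a)->key;
  const float kb = ((const SortVert *)b)->key;
  return (ka > kb) - (ka < kb);
}

/* `co` holds `n` xyz triples. On return, r_target[i] is the index of the
 * vertex that vertex `i` merges into, or -1 if it is kept.
 * Returns the number of merged vertices, or -1 on allocation failure. */
int remove_doubles_diagonal(const float *co, int n, float dist, int *r_target)
{
  /* If two points are within `dist`, their keys differ by at most
   * sqrt(3) * dist, so the forward scan can stop beyond that range. */
  const float key_range = 1.7320508f * dist;
  const float dist_sq = dist * dist;
  int merged = 0;

  if (n <= 0) {
    return 0;
  }
  SortVert *sv = malloc(sizeof(*sv) * (size_t)n);
  if (sv == NULL) {
    return -1;
  }
  for (int i = 0; i < n; i++) {
    sv[i].co[0] = co[i * 3 + 0];
    sv[i].co[1] = co[i * 3 + 1];
    sv[i].co[2] = co[i * 3 + 2];
    sv[i].key = sv[i].co[0] + sv[i].co[1] + sv[i].co[2];
    sv[i].index = i;
    r_target[i] = -1;
  }
  qsort(sv, (size_t)n, sizeof(*sv), sortvert_cmp);

  for (int i = 0; i < n; i++) {
    if (r_target[sv[i].index] != -1) {
      continue; /* Already merged into an earlier vertex. */
    }
    /* Every vertex whose key falls in range is compared, even if it is far
     * away in space. When many vertices share similar keys, this inner loop
     * grows towards O(n) per vertex. */
    for (int j = i + 1; j < n && sv[j].key - sv[i].key <= key_range; j++) {
      if (r_target[sv[j].index] != -1) {
        continue;
      }
      const float dx = sv[j].co[0] - sv[i].co[0];
      const float dy = sv[j].co[1] - sv[i].co[1];
      const float dz = sv[j].co[2] - sv[i].co[2];
      if (dx * dx + dy * dy + dz * dz <= dist_sq) {
        r_target[sv[j].index] = sv[i].index;
        merged++;
      }
    }
  }
  free(sv);
  return merged;
}
```

Note that (1, 0, 0) and (0, 1, 0) get the same key here even though they are far apart, so the sort alone cannot separate them. A kd-tree range query prunes on all three axes, which is why its timing is less sensitive to the vertex layout.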