This PR modifies customData_update_offsets to respect
memory alignment when laying out BMesh attribute
blocks. While most consumer CPUs (and all x86s) support
unaligned memory access, it can lead to performance
degradation.
The alignment process is simple: large attributes
are laid out first, then small ones, and padding
bytes are added as necessary. Common vector types
are assumed to align to their base type (e.g.
CD_PROP_FLOAT3 has an alignment of 4).
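
As a rough illustration of the layout described above, here is a minimal standalone sketch. It is not the code in this PR: `AttrLayout`, `align_up` and `assign_offsets` are hypothetical names, and it assumes that "large" means "largest alignment first" and that every alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one attribute layer; not Blender's CustomDataLayer. */
typedef struct AttrLayout {
  size_t size;      /* Bytes per element, e.g. 12 for a float3. */
  size_t alignment; /* Alignment of the base type, e.g. 4 for a float3. */
  size_t offset;    /* Output: byte offset inside the attribute block. */
} AttrLayout;

/* Round `offset` up to the next multiple of `alignment` (a power of two). */
static size_t align_up(size_t offset, size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (offset + alignment - 1) & ~(alignment - 1);
}

/* qsort comparator: larger alignments sort first. */
static int compare_alignment_desc(const void *a, const void *b)
{
  const AttrLayout *la = a;
  const AttrLayout *lb = b;
  return (la->alignment < lb->alignment) - (la->alignment > lb->alignment);
}

/* Sort the attributes, assign aligned offsets, and return the padded block size. */
static size_t assign_offsets(AttrLayout *attrs, size_t count)
{
  size_t offset = 0;
  size_t max_alignment = 1;

  qsort(attrs, count, sizeof(*attrs), compare_alignment_desc);

  for (size_t i = 0; i < count; i++) {
    /* Insert padding bytes so this attribute starts on its own alignment. */
    offset = align_up(offset, attrs[i].alignment);
    attrs[i].offset = offset;
    offset += attrs[i].size;
    if (attrs[i].alignment > max_alignment) {
      max_alignment = attrs[i].alignment;
    }
  }

  /* Pad the tail so consecutive blocks in an array stay aligned too. */
  return align_up(offset, max_alignment);
}
```

Sorting by descending alignment keeps the padding small: with power-of-two alignments and sizes that are multiples of their alignment, padding is only needed at the end of the block. Note that `qsort` is not stable, so attributes with equal alignment may end up in any relative order.
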
In the future we should probably allocate BMesh attributes
separately in some kind of shared memory pool collection.
But for now this should improve performance.