The utility counts the number of occurrences of each index in an array.
This is needed to build offsets for mesh topology maps, or to count
the number of connected elements. Current users include geometry
nodes, the subdivision draw cache, and mesh to curve conversion.
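
As a rough illustration of the pattern described above (not the actual
utility), the counting step and the follow-up offsets step could look
like the sketch below. The names `count_indices` and `counts_to_offsets`
and the use of `std::vector` are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical helper: count how many times each index in [0, size)
 * appears in `indices`. */
std::vector<int> count_indices(const std::vector<int> &indices, const std::size_t size)
{
  std::vector<int> counts(size, 0);
  for (const int i : indices) {
    assert(i >= 0 && static_cast<std::size_t>(i) < size);
    counts[i]++;
  }
  return counts;
}

/* Hypothetical helper: exclusive prefix sum that turns per-element counts
 * into offsets. The extra last element holds the total. */
std::vector<int> counts_to_offsets(const std::vector<int> &counts)
{
  std::vector<int> offsets(counts.size() + 1, 0);
  for (std::size_t i = 0; i < counts.size(); i++) {
    offsets[i + 1] = offsets[i] + counts[i];
  }
  return offsets;
}
```

For example, with corner vertex indices {0, 1, 2, 2, 1, 3, 2} and four
vertices, the counts are {1, 2, 3, 1} and the offsets {0, 1, 3, 6, 7}.
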
Now that the utility is in one place, it's reasonable to optimize it
with compiler flags. On GCC, unrolling the loop gave me a 1.9x
performance improvement: counting the number of corners for each
vertex in a 4 million vertex mesh went from 7.4 to 3.9 ms.
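
The text above does not say which mechanism was used to unroll the
loop. One possible way to request it on GCC is a per-loop pragma, as in
this sketch; the function name `count_indices_unrolled` and the unroll
factor of 8 are assumptions. GCC's `-funroll-loops` command-line flag is
an alternative that applies to whole translation units.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical variant of the counting loop with an unroll hint.
 * `counts` must already be sized and zero-initialized by the caller. */
void count_indices_unrolled(const std::vector<int> &indices, std::vector<int> &counts)
{
  const std::size_t num = indices.size();
  /* Only emit the hint on GCC; the factor 8 is an arbitrary example value. */
#if defined(__GNUC__) && !defined(__clang__)
#  pragma GCC unroll 8
#endif
  for (std::size_t i = 0; i < num; i++) {
    const int index = indices[i];
    assert(index >= 0 && static_cast<std::size_t>(index) < counts.size());
    counts[index]++;
  }
}
```

The pragma is only a hint and does not change the result, so any
speedup has to be measured per compiler and per workload.
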
In a couple of places this also improves code reuse, sharing one
implementation of a pattern that was repeated before.