Simplify vertex normal calculation by moving the main normal accumulation function to operate on vertices instead of faces.

Operating on faces had the downside that vertex normals had to be zeroed, accumulated and normalized in three separate passes. Accumulation also needed a spin-lock when threaded, since each face wrote its normal to all of its vertices, which could be shared with other faces.

Now a single loop over vertices is performed without locking. This gives a 5-6% speedup when calculating all normals.

This also simplifies partial updates, fixing a problem where all connected faces were being read from when calculating normals. While this could have been resolved separately, it's simpler to operate on vertices directly.
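
For illustration only, the difference between the face-based and vertex-based approaches can be sketched in Python as below. This is not the code from this change: the function names, the plain list-based mesh layout and the unweighted sum of face normals are all assumptions made for the sketch, and any weighting the real code applies is omitted. The `vert_faces` adjacency list stands in for connectivity that the mesh structure is assumed to already provide.

```python
import math


def normalized(v):
    """Return v scaled to unit length, or a zero vector if v has no length."""
    length = math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return (v[0] / length, v[1] / length, v[2] / length)


def face_normal(coords, face):
    """Unit face normal via Newell's method (triangles and planar polygons)."""
    nx = ny = nz = 0.0
    for i in range(len(face)):
        x0, y0, z0 = coords[face[i]]
        x1, y1, z1 = coords[face[(i + 1) % len(face)]]
        nx += (y0 - y1) * (z0 + z1)
        ny += (z0 - z1) * (x0 + x1)
        nz += (x0 - x1) * (y0 + y1)
    return normalized((nx, ny, nz))


def normals_by_face(coords, faces):
    """Face-based (old style): three passes, scattering into shared vertices."""
    # Pass 1: zero all vertex normals.
    normals = [[0.0, 0.0, 0.0] for _ in coords]
    # Pass 2: each face adds its normal to every vertex it uses.
    # If faces were processed in parallel, these writes would need a lock,
    # because several faces can write to the same vertex.
    for face in faces:
        n = face_normal(coords, face)
        for v in face:
            for k in range(3):
                normals[v][k] += n[k]
    # Pass 3: normalize the accumulated sums.
    return [normalized(n) for n in normals]


def normals_by_vertex(coords, faces):
    """Vertex-based (new style): each vertex gathers from its own faces."""
    # Hypothetical stand-in for connectivity the mesh already stores.
    vert_faces = [[] for _ in coords]
    for face in faces:
        for v in face:
            vert_faces[v].append(face)

    def vertex_normal(v):
        # Only this vertex's result is written, so vertices are independent
        # of one another and no locking would be needed if run in parallel.
        total = [0.0, 0.0, 0.0]
        for face in vert_faces[v]:
            n = face_normal(coords, face)
            for k in range(3):
                total[k] += n[k]
        return normalized(total)

    # A single loop over vertices: accumulate and normalize together.
    return [vertex_normal(v) for v in range(len(coords))]
```

In this sketch the vertex loop recomputes each face normal once per vertex that uses it; a real implementation would more likely read face normals that were computed beforehand. A partial update follows the same shape: call the per-vertex function only for the vertices that changed.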