Previously there was a single global `GPU_matrix` stack, so the API was not
thread-safe. This patch makes the stack a member of `GPUContext`, which
effectively makes it local to each thread (`GPUContext` is located in
thread-local storage).
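
For illustration only, the idea can be sketched as follows. This is a minimal
sketch and not the code in this patch: `SketchContext`, `sketch_active_ctx`
and the `sketch_*` functions are hypothetical names, and it assumes a C11
compiler for `_Thread_local`. The stack lives in the context, and the pointer
to the active context is thread local, so matrix calls only touch the stack of
the context bound to the calling thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_STACK_DEPTH 32

/* Hypothetical stand-in for the per-context state (not the real GPUContext). */
typedef struct SketchContext {
  float stack[SKETCH_STACK_DEPTH][4][4];
  int top; /* Index of the current matrix. */
} SketchContext;

/* One pointer per thread: each thread sees only the context it activated. */
static _Thread_local SketchContext *sketch_active_ctx = NULL;

/* Reset the stack to a single identity matrix. */
static void sketch_context_init(SketchContext *ctx)
{
  memset(ctx, 0, sizeof(*ctx));
  for (int i = 0; i < 4; i++) {
    ctx->stack[0][i][i] = 1.0f;
  }
}

/* Bind a context to the calling thread (NULL unbinds). */
static void sketch_context_activate(SketchContext *ctx)
{
  sketch_active_ctx = ctx;
}

/* Duplicate the current matrix on top of the active context's stack. */
static void sketch_matrix_push(void)
{
  SketchContext *ctx = sketch_active_ctx;
  assert(ctx != NULL);
  assert(ctx->top + 1 < SKETCH_STACK_DEPTH);
  memcpy(ctx->stack[ctx->top + 1], ctx->stack[ctx->top], sizeof(ctx->stack[0]));
  ctx->top++;
}

/* Drop the current matrix from the active context's stack. */
static void sketch_matrix_pop(void)
{
  SketchContext *ctx = sketch_active_ctx;
  assert(ctx != NULL);
  assert(ctx->top > 0);
  ctx->top--;
}

/* Number of pushed matrices above the base one. */
static int sketch_matrix_depth(void)
{
  assert(sketch_active_ctx != NULL);
  return sketch_active_ctx->top;
}
```

In this sketch a worker thread would call `sketch_context_init()` and
`sketch_context_activate()` on its own context before issuing any matrix
calls, so pushes and pops on one thread never alter the stack seen by another.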

Reviewed By: brecht
Differential Revision: https://developer.blender.org/D5405