For non-branched path tracing on a GTX 960 with CUDA 7.5, this slightly reduces stack usage, but the main gain is render time: 8% faster on BMW, 5% on pabellon, 13% on classroom.