Fix two issues in the previous implementation:
* Only power-of-two prefixes were progressively stratified, not suffixes.
This resulted in unnecessarily increased noise when using non-power-of-two
sample counts.
* In order to try to get away with just a single sample pattern, the code
used a combination of sample index shuffling and Cranley-Patterson rotation.
Index shuffling is normally fine, but due to the sample patterns themselves
not being quite right (as described above) this actually resulted in
additional increased noise. Cranley-Patterson, on the other hand, always
increases noise with randomized (t,s) nets like PMJ02, and should be avoided
with these kinds of sequences.
Addressed with the following changes:
* Replace the sample pattern generation code with a much simpler algorithm
recently published in the paper "Stochastic Generation of (t, s) Sample
Sequences". This new implementation is easier to verify, produces fully
progressively stratified PMJ02, and is *far* faster than the previous code,
being O(N) in the number of samples generated.
* It keeps the sample index shuffling, which works correctly now due to the
improved sample patterns. But it now uses a newer high-quality hash instead
of the original Laine-Karras hash.
* The scrambling distance feature cannot (to my knowledge) be implemented with
any decorrelation strategy other than Cranley-Patterson, so Cranley-Patterson
is still used when that feature is enabled. But it is now disabled otherwise,
since it increases noise.
* In place of Cranley-Patterson, multiple independent patterns are generated
and randomly chosen for different pixels and dimensions as described in the
original PMJ paper. In this patch, the pattern selection is done via
hash-based shuffling to ensure there are no repeats within a single pixel
until all patterns have been used.
The combination of these fixes brings the quality of Cycles' PMJ sampler in
line with the previously submitted Sobol-Burley sampler in D15679. They are
essentially indistinguishable in terms of quality/noise, which is expected
since they are both randomized (0,2) sequences.
Differential Revision: https://developer.blender.org/D15746
Based on the paper "Practical Hash-based Owen Scrambling" by Brent Burley,
2020, Journal of Computer Graphics Techniques.
It is distinct from the existing Sobol sampler in two important ways:
* It is Owen scrambled, which gives it a much better convergence rate in many
situations.
* It uses padding for higher dimensions, rather than using higher Sobol
dimensions directly. In practice this is advantagous because high-dimensional
Sobol sequences have holes in their sampling patterns that don't resolve
until an unreasonable number of samples are taken. (See Burley's paper for
details.)
The pattern reduces noise in some benchmark scenes, however it is also slower,
particularly on the CPU. So for now Progressive Multi-Jittered sampling remains
the default.
Differential Revision: https://developer.blender.org/D15679