Compare commits

..

32 Commits

Author SHA1 Message Date
ec40ef19c7 Create compute command buffer. 2023-02-06 15:59:15 +01:00
6078fe34f9 Fix vulkan validation error. 2023-02-06 15:11:34 +01:00
79a7030da5 Added compute test case with ssbo buffers. 2023-02-06 14:49:11 +01:00
fbd004f570 Read back from ssbo and check if same data is present. 2023-02-06 14:12:58 +01:00
3f3648300d Merge branch 'master' into temp-vulkan-descriptor-sets 2023-02-06 13:42:11 +01:00
4bd3b02984 Python: Suppress BGL deprecation messages after 100 times.
BGL deprecation calls used to be reported on each use. As bgl calls
are typically part of a handler that is triggered on every refresh,
this could lead to a flood of messages and slow systems down when
the terminal/console had to be refreshed as well.

This patch only reports the first 100 bgl deprecation calls. This
gives the developer enough feedback that changes need to be made,
while still providing good responsiveness to users who have such an
add-on enabled. Only the first frames can have a slowdown.
2023-02-06 13:35:29 +01:00
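
A minimal sketch of the capping pattern described above (hypothetical names; Blender's actual reporting goes through its Python C-API layer):

#include <atomic>
#include <cstdio>

/* Report a bgl deprecation at most `max_reports` times, then go silent so
 * per-redraw handlers cannot flood the console. */
static void report_bgl_deprecation(const char *usage)
{
  static std::atomic<int> report_count{0};
  const int max_reports = 100; /* The cap chosen by the patch above. */
  if (report_count.load(std::memory_order_relaxed) < max_reports) {
    /* A benign race may overshoot the cap by a few reports. */
    report_count.fetch_add(1, std::memory_order_relaxed);
    fprintf(stderr, "Warning: 'bgl' is deprecated: %s\n", usage);
  }
}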
404ed5a6ea Fix memory allocation by disabling any extension. 2023-02-06 13:31:21 +01:00
7beb487e9a Fix T104353: Crash on opening sculpting template
`t->region` was `NULL`.

It can happen depending on the context.

Caused by rB19b63b932d2b.
2023-02-06 09:21:04 -03:00
9ad3a85f8b Fix Cycles GPU binaries build error after recent changes for Metal 2023-02-06 13:17:57 +01:00
Michael Jones
654e1e901b Cycles: Use local atomics for faster shader sorting (enabled on Metal)
This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high".

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D16909
2023-02-06 11:18:26 +00:00
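
As a rough illustration, here is a hypothetical single-threaded C++ model of the two-pass scheme described above (the real kernels run one threadgroup per partition and keep the bucket counters in threadgroup memory, updated with local atomics):

#include <algorithm>
#include <cstdint>
#include <vector>

static std::vector<int> sort_states_by_key(const std::vector<uint32_t> &keys,
                                           const int num_buckets,
                                           const int partition_size)
{
  const int num_states = int(keys.size());
  const int num_partitions = (num_states + partition_size - 1) / partition_size;

  /* Bucket pass: count keys per partition, then turn the counts into
   * exclusive offsets. Slot `num_buckets` holds each partition's total. */
  std::vector<int> offsets(num_partitions * (num_buckets + 1), 0);
  for (int p = 0; p < num_partitions; p++) {
    int *counts = &offsets[p * (num_buckets + 1)];
    const int begin = p * partition_size;
    const int end = std::min(num_states, begin + partition_size);
    for (int i = begin; i < end; i++) {
      counts[keys[i] % num_buckets]++;
    }
    int running = 0;
    for (int b = 0; b <= num_buckets; b++) {
      const int count = counts[b];
      counts[b] = running;
      running += count;
    }
  }

  /* Write pass: each partition scatters its states after the totals of all
   * preceding partitions. The result is sorted by key within each partition
   * while partitions stay contiguous, which is the locality property that
   * sort partitioning exploits. */
  std::vector<int> indices(num_states);
  for (int p = 0; p < num_partitions; p++) {
    int partition_offset = 0;
    for (int q = 0; q < p; q++) {
      partition_offset += offsets[q * (num_buckets + 1) + num_buckets];
    }
    int *cursors = &offsets[p * (num_buckets + 1)];
    const int begin = p * partition_size;
    const int end = std::min(num_states, begin + partition_size);
    for (int i = begin; i < end; i++) {
      indices[partition_offset + cursors[keys[i] % num_buckets]++] = i;
    }
  }
  return indices;
}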
Michael Jones
46c9f7702a Cycles: Enable MetalRT opt-in for AMD/Navi2 GPUs
Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17043
2023-02-06 11:14:11 +00:00
Michael Jones
be0912a402 Cycles: Prevent use of both AMD and Intel Metal devices at same time
This patch removes the option to select both AMD and Intel GPUs on systems that have both. Currently both devices are selected by default, which results in crashes and other poorly understood behaviour. This patch adds precedence for using any discrete AMD GPU over an integrated Intel one. This can be overridden with CYCLES_METAL_FORCE_INTEL.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D17166
2023-02-06 11:13:33 +00:00
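
A hypothetical sketch of the precedence rule (the real check lives in MetalInfo::get_usable_devices, shown in the diff below):

#include <cstdlib>
#include <vector>

enum class Vendor { AMD, INTEL, APPLE, UNKNOWN };

/* Integrated Intel GPUs are only usable when no discrete AMD GPU is
 * present; CYCLES_METAL_FORCE_INTEL=1 hides AMD devices, which inverts
 * the preference. */
static std::vector<Vendor> usable_metal_devices(const std::vector<Vendor> &all)
{
  bool hide_amd = false;
  if (const char *str = std::getenv("CYCLES_METAL_FORCE_INTEL")) {
    hide_amd = std::atoi(str) != 0;
  }
  bool has_usable_amd = false;
  for (const Vendor v : all) {
    has_usable_amd |= (v == Vendor::AMD && !hide_amd);
  }
  std::vector<Vendor> usable;
  for (const Vendor v : all) {
    const bool ok = (v == Vendor::APPLE) || (v == Vendor::AMD && !hide_amd) ||
                    (v == Vendor::INTEL && !has_usable_amd);
    if (ok) {
      usable.push_back(v);
    }
  }
  return usable;
}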
Michael Jones
0a3df611e7 Fix T103393: Cycles: Undefine __LIGHT_TREE__ on Metal/AMD to fix perf
This patch fixes T103393 by undefining `__LIGHT_TREE__` on Metal/AMD as it has an unexpected & major impact on performance even when light trees are not in use.

Patch authored by Prakash Kamliya.

Reviewed By: brecht

Maniphest Tasks: T103393

Differential Revision: https://developer.blender.org/D17167
2023-02-06 11:12:34 +00:00
Amelie Fondevilla
6d297c35c8 Fix T104371: GPencil merge down layer duplicates wrong frame
The merge down operator was sometimes copying the wrong frame, which altered the animation.
While merging the layers, a keyframe sometimes needs to be duplicated,
when the lowest layer does not have a keyframe but the highest layer does.
Instead of duplicating the previous keyframe of the lowest layer, the code
was actually duplicating the layer's active frame, i.e. the frame at the current time in the timeline.

This patch fixes the issue by setting the previous keyframe of the layer as its active frame before duplication.

Related issue: T104371.

Differential Revision: https://developer.blender.org/D17214
2023-02-06 10:44:17 +01:00
329eeacc66 Cleanup: Cycles: Remove isotropic microfacet closure setup functions
Turns out these are 100% redundant, so get rid of them.
2023-02-06 04:26:36 +01:00
2627635ff3 Cleanup: use nullptr in C++ 2023-02-06 12:50:34 +11:00
d6b6050e5b Cleanup: use function style casts in C++ 2023-02-06 12:35:51 +11:00
731c3efd97 Cleanup: format 2023-02-06 12:32:45 +11:00
9f5c17f4af Cleanup: comments in code 2023-02-06 12:25:04 +11:00
4fcc9f5e7e Cleanup: use back-slash doxygen commands, de-duplicate doc-string 2023-02-06 12:25:04 +11:00
7de1a4d1d8 Fix GHOST/Wayland thread-unsafe timer-manager manipulation
Mutex locks for manipulating GHOST_System::m_timerManager from
GHOST_SystemWayland relied on WAYLAND being the only user of the
timer-manager.

This isn't the case as timers are fired from
`GHOST_System::dispatchEvents`.

Resolve by using a separate timer-manager for wayland key-repeat timers.
2023-02-06 12:25:04 +11:00
d3949a4fdb Fix GHOST/Wayland thread-unsafe key-repeat timer checks
Resolve a thread safety issue reported by valgrind's helgrind checker,
although I wasn't able to reproduce the error in practice.

NULL check on the key-repeat timer also needs to lock, otherwise it's
possible the timer is set in another thread before the lock is acquired.

Now all key-repeat timer access which may run from a thread
locks the timer mutex before any checks or timer manipulation.
2023-02-06 12:25:04 +11:00
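
The rule both WAYLAND fixes depend on, as a minimal hypothetical sketch (GHOST types replaced by stand-ins):

#include <mutex>

struct TimerTask { /* Stand-in for GHOST_ITimerTask. */ };

static void remove_timer(TimerTask *timer)
{
  delete timer; /* Stand-in for removing the task from the manager. */
}

struct KeyRepeatState {
  std::mutex timer_mutex;
  TimerTask *timer = nullptr;

  /* The null check itself happens under the lock; checking first and only
   * locking afterwards races with another thread installing the timer. */
  void cancel()
  {
    std::lock_guard<std::mutex> lock(timer_mutex);
    if (timer) {
      remove_timer(timer);
      timer = nullptr;
    }
  }
};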
b642dc7bc7 Fix: Incorrect forward-compatible saving of face sets
There were two errors with the function used to convert face sets
to the legacy mesh format for keeping forward compatibility:
- It was moved before `CustomData_blend_write_prepare` so it
  operated on an empty span.
- It modified the mesh when it's only supposed to change the copy
  of the layers written to the file.

Differential Revision: https://developer.blender.org/D17210
2023-02-05 18:09:22 -05:00
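
The ordering rule behind the fix, as a hypothetical sketch (names are stand-ins for the real CustomData calls):

#include <vector>

struct Layer {
  int type = 0;
};

/* Stand-ins for the real conversion and writing steps. */
static void convert_face_sets_to_legacy(std::vector<Layer> & /*layers*/) {}
static void write_layers(const std::vector<Layer> & /*layers*/) {}

/* Prepare the layer list first, then convert only the copy that is written,
 * so saving never modifies the in-memory mesh. */
static void write_mesh_forward_compatible(const std::vector<Layer> &mesh_layers)
{
  std::vector<Layer> layers_to_write = mesh_layers; /* After *_write_prepare(). */
  convert_face_sets_to_legacy(layers_to_write);     /* Mutates the copy only. */
  write_layers(layers_to_write);
}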
501352ef05 Cleanup: Move PBVH files to C++
For continued refactoring of the Mesh data structure. See T103343.
2023-02-05 17:36:47 -05:00
e766dcc333 Fix missing rename of attribute during rebase. 2023-02-03 15:42:40 +01:00
48e4a417a3 Merge branch 'master' into temp-vulkan-descriptor-sets 2023-02-03 15:24:25 +01:00
e17eb27747 Doing more test to see how vma works. 2023-02-03 15:23:20 +01:00
fc834ee79f Added ssbo test. 2023-02-03 13:54:52 +01:00
b83d03677e Use a single descriptor set. 2023-02-02 14:22:29 +01:00
8b079a4888 Initial vk_shader_interface with SSBO support. 2023-02-02 14:21:41 +01:00
a1ce423ae5 Use similar naming convention compared to metal. 2023-02-02 14:21:41 +01:00
2474810aa0 Vulkan: Added initial compute pipeline. 2023-02-02 14:21:41 +01:00
147 changed files with 1987 additions and 6599 deletions

View File

@@ -1722,13 +1722,20 @@ class CyclesPreferences(bpy.types.AddonPreferences):
row.prop(self, "peer_memory")
if compute_device_type == 'METAL':
import platform
# MetalRT only works on Apple Silicon at present, pending argument encoding fixes on AMD
# Kernel specialization is only viable on Apple Silicon at present due to relative compilation speed
if platform.machine() == 'arm64':
import platform, re
isNavi2 = False
for device in devices:
obj = re.search("((RX)|(Pro)|(PRO))\s+W?6\d00X",device.name)
if obj:
isNavi2 = True
# MetalRT only works on Apple Silicon and Navi2
if platform.machine() == 'arm64' or isNavi2:
col = layout.column()
col.use_property_split = True
col.prop(self, "kernel_optimization_level")
# Kernel specialization is only supported on Apple Silicon
if platform.machine() == 'arm64':
col.prop(self, "kernel_optimization_level")
col.prop(self, "use_metalrt")
def draw(self, context):

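For reference, the Navi2 name pattern above translated to a small self-contained C++ test (device names are illustrative; "Radeon Pro W6800X" is one of the Navi2-based Mac Pro modules):

#include <cassert>
#include <regex>
#include <string>

int main()
{
  /* Same pattern as the Python check above. */
  const std::regex navi2("((RX)|(Pro)|(PRO))\\s+W?6\\d00X");
  assert(std::regex_search(std::string("AMD Radeon Pro W6800X"), navi2));
  /* The trailing X and the 6x00 series digits are both required. */
  assert(!std::regex_search(std::string("Radeon Pro 580"), navi2));
  return 0;
}
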
View File

@@ -73,6 +73,10 @@ const char *device_kernel_as_string(DeviceKernel kernel)
return "integrator_terminated_paths_array";
case DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY:
return "integrator_sorted_paths_array";
case DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS:
return "integrator_sort_bucket_pass";
case DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS:
return "integrator_sort_write_pass";
case DEVICE_KERNEL_INTEGRATOR_COMPACT_PATHS_ARRAY:
return "integrator_compact_paths_array";
case DEVICE_KERNEL_INTEGRATOR_COMPACT_STATES:

View File

@@ -55,6 +55,10 @@ void device_metal_info(vector<DeviceInfo> &devices)
info.denoisers = DENOISER_NONE;
info.id = id;
if (MetalInfo::get_device_vendor(device) == METAL_GPU_AMD) {
info.has_light_tree = false;
}
devices.push_back(info);
device_index++;
}

View File

@@ -105,6 +105,8 @@ class MetalDevice : public Device {
bool use_adaptive_compilation();
bool use_local_atomic_sort() const;
bool make_source_and_check_if_compile_needed(MetalPipelineType pso_type);
void make_source(MetalPipelineType pso_type, const uint kernel_features);

View File

@@ -271,6 +271,11 @@ bool MetalDevice::use_adaptive_compilation()
return DebugFlags().metal.adaptive_compile;
}
bool MetalDevice::use_local_atomic_sort() const
{
return DebugFlags().metal.use_local_atomic_sort;
}
void MetalDevice::make_source(MetalPipelineType pso_type, const uint kernel_features)
{
string global_defines;
@@ -278,6 +283,10 @@ void MetalDevice::make_source(MetalPipelineType pso_type, const uint kernel_feat
global_defines += "#define __KERNEL_FEATURES__ " + to_string(kernel_features) + "\n";
}
if (use_local_atomic_sort()) {
global_defines += "#define __KERNEL_LOCAL_ATOMIC_SORT__\n";
}
if (use_metalrt) {
global_defines += "#define __METALRT__\n";
if (motion_blur) {

View File

@@ -87,6 +87,9 @@ struct ShaderCache {
break;
}
}
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS] = {1024, 1024};
occupancy_tuning[DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS] = {1024, 1024};
}
~ShaderCache();

View File

@@ -25,6 +25,7 @@ class MetalDeviceQueue : public DeviceQueue {
virtual int num_concurrent_states(const size_t) const override;
virtual int num_concurrent_busy_states(const size_t) const override;
virtual int num_sort_partition_elements() const override;
virtual bool supports_local_atomic_sort() const override;
virtual void init_execution() override;

View File

@@ -315,6 +315,11 @@ int MetalDeviceQueue::num_sort_partition_elements() const
return MetalInfo::optimal_sort_partition_elements(metal_device_->mtlDevice);
}
bool MetalDeviceQueue::supports_local_atomic_sort() const
{
return metal_device_->use_local_atomic_sort();
}
void MetalDeviceQueue::init_execution()
{
/* Synchronize all textures and memory copies before executing task. */
@@ -553,13 +558,24 @@ bool MetalDeviceQueue::enqueue(DeviceKernel kernel,
/* See parallel_active_index.h for why this amount of shared memory is needed.
* Rounded up to 16 bytes for Metal */
shared_mem_bytes = (int)round_up((num_threads_per_block + 1) * sizeof(int), 16);
[mtlComputeCommandEncoder setThreadgroupMemoryLength:shared_mem_bytes atIndex:0];
break;
case DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS:
case DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS: {
int key_count = metal_device_->launch_params.data.max_shaders;
shared_mem_bytes = (int)round_up(key_count * sizeof(int), 16);
break;
}
default:
break;
}
if (shared_mem_bytes) {
assert(shared_mem_bytes <= 32 * 1024);
[mtlComputeCommandEncoder setThreadgroupMemoryLength:shared_mem_bytes atIndex:0];
}
MTLSize size_threadgroups_per_dispatch = MTLSizeMake(
divide_up(work_size, num_threads_per_block), 1, 1);
MTLSize size_threads_per_threadgroup = MTLSizeMake(num_threads_per_block, 1, 1);

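The sizing rule used for the two sort kernels above, distilled into a hypothetical stand-alone form:

#include <cassert>
#include <cstddef>

static size_t round_up(const size_t x, const size_t multiple)
{
  return ((x + multiple - 1) / multiple) * multiple;
}

/* One int of threadgroup memory per shader bucket, rounded up to Metal's
 * 16-byte granularity, and expected to fit the 32 KiB budget the assert
 * above enforces. */
static size_t sort_kernel_shared_mem_bytes(const int max_shaders)
{
  const size_t bytes = round_up(max_shaders * sizeof(int), 16);
  assert(bytes <= 32 * 1024);
  return bytes;
}
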
View File

@@ -64,6 +64,12 @@ MetalGPUVendor MetalInfo::get_device_vendor(id<MTLDevice> device)
return METAL_GPU_INTEL;
}
else if (strstr(device_name, "AMD")) {
/* Setting this env var hides AMD devices thus exposing any integrated Intel devices. */
if (auto str = getenv("CYCLES_METAL_FORCE_INTEL")) {
if (atoi(str)) {
return METAL_GPU_UNKNOWN;
}
}
return METAL_GPU_AMD;
}
else if (strstr(device_name, "Apple")) {
@@ -96,6 +102,15 @@ vector<id<MTLDevice>> const &MetalInfo::get_usable_devices()
return usable_devices;
}
/* If the system has both an AMD GPU (discrete) and an Intel one (integrated), prefer the AMD
* one. This can be overridden with CYCLES_METAL_FORCE_INTEL. */
bool has_usable_amd_gpu = false;
if (@available(macos 12.3, *)) {
for (id<MTLDevice> device in MTLCopyAllDevices()) {
has_usable_amd_gpu |= (get_device_vendor(device) == METAL_GPU_AMD);
}
}
metal_printf("Usable Metal devices:\n");
for (id<MTLDevice> device in MTLCopyAllDevices()) {
string device_name = get_device_name(device);
@@ -111,8 +126,10 @@ vector<id<MTLDevice>> const &MetalInfo::get_usable_devices()
}
# if defined(MAC_OS_VERSION_13_0)
if (@available(macos 13.0, *)) {
usable |= (vendor == METAL_GPU_INTEL);
if (!has_usable_amd_gpu) {
if (@available(macos 13.0, *)) {
usable |= (vendor == METAL_GPU_INTEL);
}
}
# endif

View File

@@ -112,6 +112,13 @@ class DeviceQueue {
return 65536;
}
/* Does device support local atomic sorting kernels (INTEGRATOR_SORT_BUCKET_PASS and
* INTEGRATOR_SORT_WRITE_PASS)? */
virtual bool supports_local_atomic_sort() const
{
return false;
}
/* Initialize execution of kernels on this queue.
*
* Will, for example, load all data required by the kernels from Device to global or path state.

View File

@@ -71,6 +71,8 @@ PathTraceWorkGPU::PathTraceWorkGPU(Device *device,
device, "integrator_shader_mnee_sort_counter", MEM_READ_WRITE),
integrator_shader_sort_prefix_sum_(
device, "integrator_shader_sort_prefix_sum", MEM_READ_WRITE),
integrator_shader_sort_partition_key_offsets_(
device, "integrator_shader_sort_partition_key_offsets", MEM_READ_WRITE),
integrator_next_main_path_index_(device, "integrator_next_main_path_index", MEM_READ_WRITE),
integrator_next_shadow_path_index_(
device, "integrator_next_shadow_path_index", MEM_READ_WRITE),
@@ -207,33 +209,45 @@ void PathTraceWorkGPU::alloc_integrator_sorting()
integrator_state_gpu_.sort_partition_divisor = (int)divide_up(max_num_paths_,
num_sort_partitions_);
/* Allocate arrays for shader sorting. */
const int sort_buckets = device_scene_->data.max_shaders * num_sort_partitions_;
if (integrator_shader_sort_counter_.size() < sort_buckets) {
integrator_shader_sort_counter_.alloc(sort_buckets);
integrator_shader_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE] =
(int *)integrator_shader_sort_counter_.device_pointer;
integrator_shader_sort_prefix_sum_.alloc(sort_buckets);
integrator_shader_sort_prefix_sum_.zero_to_device();
}
if (device_scene_->data.kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) {
if (integrator_shader_raytrace_sort_counter_.size() < sort_buckets) {
integrator_shader_raytrace_sort_counter_.alloc(sort_buckets);
integrator_shader_raytrace_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE] =
(int *)integrator_shader_raytrace_sort_counter_.device_pointer;
if (num_sort_partitions_ > 1 && queue_->supports_local_atomic_sort()) {
/* Allocate array for partitioned shader sorting using local atomics. */
const int num_offsets = (device_scene_->data.max_shaders + 1) * num_sort_partitions_;
if (integrator_shader_sort_partition_key_offsets_.size() < num_offsets) {
integrator_shader_sort_partition_key_offsets_.alloc(num_offsets);
integrator_shader_sort_partition_key_offsets_.zero_to_device();
}
integrator_state_gpu_.sort_partition_key_offsets =
(int *)integrator_shader_sort_partition_key_offsets_.device_pointer;
}
else {
/* Allocate arrays for shader sorting. */
const int sort_buckets = device_scene_->data.max_shaders * num_sort_partitions_;
if (integrator_shader_sort_counter_.size() < sort_buckets) {
integrator_shader_sort_counter_.alloc(sort_buckets);
integrator_shader_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE] =
(int *)integrator_shader_sort_counter_.device_pointer;
if (device_scene_->data.kernel_features & KERNEL_FEATURE_MNEE) {
if (integrator_shader_mnee_sort_counter_.size() < sort_buckets) {
integrator_shader_mnee_sort_counter_.alloc(sort_buckets);
integrator_shader_mnee_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE] =
(int *)integrator_shader_mnee_sort_counter_.device_pointer;
integrator_shader_sort_prefix_sum_.alloc(sort_buckets);
integrator_shader_sort_prefix_sum_.zero_to_device();
}
if (device_scene_->data.kernel_features & KERNEL_FEATURE_NODE_RAYTRACE) {
if (integrator_shader_raytrace_sort_counter_.size() < sort_buckets) {
integrator_shader_raytrace_sort_counter_.alloc(sort_buckets);
integrator_shader_raytrace_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_RAYTRACE] =
(int *)integrator_shader_raytrace_sort_counter_.device_pointer;
}
}
if (device_scene_->data.kernel_features & KERNEL_FEATURE_MNEE) {
if (integrator_shader_mnee_sort_counter_.size() < sort_buckets) {
integrator_shader_mnee_sort_counter_.alloc(sort_buckets);
integrator_shader_mnee_sort_counter_.zero_to_device();
integrator_state_gpu_.sort_key_counter[DEVICE_KERNEL_INTEGRATOR_SHADE_SURFACE_MNEE] =
(int *)integrator_shader_mnee_sort_counter_.device_pointer;
}
}
}
}
@@ -451,8 +465,7 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
work_size = num_queued;
d_path_index = queued_paths_.device_pointer;
compute_sorted_queued_paths(
DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY, kernel, num_paths_limit);
compute_sorted_queued_paths(kernel, num_paths_limit);
}
else if (num_queued < work_size) {
work_size = num_queued;
@@ -511,11 +524,26 @@ void PathTraceWorkGPU::enqueue_path_iteration(DeviceKernel kernel, const int num
}
}
void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel kernel,
DeviceKernel queued_kernel,
void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel queued_kernel,
const int num_paths_limit)
{
int d_queued_kernel = queued_kernel;
/* Launch kernel to fill the active paths arrays. */
if (num_sort_partitions_ > 1 && queue_->supports_local_atomic_sort()) {
const int work_size = kernel_max_active_main_path_index(queued_kernel);
device_ptr d_queued_paths = queued_paths_.device_pointer;
int partition_size = (int)integrator_state_gpu_.sort_partition_divisor;
DeviceKernelArguments args(
&work_size, &partition_size, &num_paths_limit, &d_queued_paths, &d_queued_kernel);
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS, 1024 * num_sort_partitions_, args);
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS, 1024 * num_sort_partitions_, args);
return;
}
device_ptr d_counter = (device_ptr)integrator_state_gpu_.sort_key_counter[d_queued_kernel];
device_ptr d_prefix_sum = integrator_shader_sort_prefix_sum_.device_pointer;
assert(d_counter != 0 && d_prefix_sum != 0);
@@ -552,7 +580,7 @@ void PathTraceWorkGPU::compute_sorted_queued_paths(DeviceKernel kernel,
&d_prefix_sum,
&d_queued_kernel);
queue_->enqueue(kernel, work_size, args);
queue_->enqueue(DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY, work_size, args);
}
}

View File

@@ -70,9 +70,7 @@ class PathTraceWorkGPU : public PathTraceWork {
void enqueue_path_iteration(DeviceKernel kernel, const int num_paths_limit = INT_MAX);
void compute_queued_paths(DeviceKernel kernel, DeviceKernel queued_kernel);
void compute_sorted_queued_paths(DeviceKernel kernel,
DeviceKernel queued_kernel,
const int num_paths_limit);
void compute_sorted_queued_paths(DeviceKernel queued_kernel, const int num_paths_limit);
void compact_main_paths(const int num_active_paths);
void compact_shadow_paths();
@@ -135,6 +133,7 @@ class PathTraceWorkGPU : public PathTraceWork {
device_vector<int> integrator_shader_raytrace_sort_counter_;
device_vector<int> integrator_shader_mnee_sort_counter_;
device_vector<int> integrator_shader_sort_prefix_sum_;
device_vector<int> integrator_shader_sort_partition_key_offsets_;
/* Path split. */
device_vector<int> integrator_next_main_path_index_;
device_vector<int> integrator_next_shadow_path_index_;

View File

@@ -661,7 +661,8 @@ ccl_device void bsdf_blur(KernelGlobals kg, ccl_private ShaderClosure *sc, float
#endif
}
ccl_device_inline Spectrum bsdf_albedo(ccl_private const ShaderData *sd, ccl_private const ShaderClosure *sc)
ccl_device_inline Spectrum bsdf_albedo(ccl_private const ShaderData *sd,
ccl_private const ShaderClosure *sc)
{
Spectrum albedo = sc->weight;
/* Some closures include additional components such as Fresnel terms that cause their albedo to

View File

@@ -519,14 +519,6 @@ ccl_device int bsdf_microfacet_ggx_setup(ccl_private MicrofacetBsdf *bsdf)
return SD_BSDF | SD_BSDF_HAS_EVAL;
}
/* Required to maintain OSL interface. */
ccl_device int bsdf_microfacet_ggx_isotropic_setup(ccl_private MicrofacetBsdf *bsdf)
{
bsdf->alpha_y = bsdf->alpha_x;
return bsdf_microfacet_ggx_setup(bsdf);
}
ccl_device int bsdf_microfacet_ggx_fresnel_setup(ccl_private MicrofacetBsdf *bsdf,
ccl_private const ShaderData *sd)
{
@@ -613,14 +605,6 @@ ccl_device int bsdf_microfacet_beckmann_setup(ccl_private MicrofacetBsdf *bsdf)
return SD_BSDF | SD_BSDF_HAS_EVAL;
}
/* Required to maintain OSL interface. */
ccl_device int bsdf_microfacet_beckmann_isotropic_setup(ccl_private MicrofacetBsdf *bsdf)
{
bsdf->alpha_y = bsdf->alpha_x;
return bsdf_microfacet_beckmann_setup(bsdf);
}
ccl_device int bsdf_microfacet_beckmann_refraction_setup(ccl_private MicrofacetBsdf *bsdf)
{
bsdf->alpha_x = saturatef(bsdf->alpha_x);

View File

@@ -90,8 +90,10 @@ ccl_device float schlick_fresnel(float u)
}
/* Calculate the fresnel color, which is a blend between white and the F0 color */
ccl_device_forceinline Spectrum
interpolate_fresnel_color(float3 L, float3 H, float ior, Spectrum F0)
ccl_device_forceinline Spectrum interpolate_fresnel_color(float3 L,
float3 H,
float ior,
Spectrum F0)
{
/* Compute the real Fresnel term and remap it from real_F0..1 to F0..1.
* The reason why we use this remapping instead of directly doing the

View File

@@ -401,6 +401,71 @@ ccl_gpu_kernel_threads(GPU_PARALLEL_SORTED_INDEX_DEFAULT_BLOCK_SIZE)
}
ccl_gpu_kernel_postfix
ccl_gpu_kernel_threads(GPU_PARALLEL_SORT_BLOCK_SIZE)
ccl_gpu_kernel_signature(integrator_sort_bucket_pass,
int num_states,
int partition_size,
int num_states_limit,
ccl_global int *indices,
int kernel_index)
{
#if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
int max_shaders = context.launch_params_metal.data.max_shaders;
ccl_global ushort *d_queued_kernel = (ccl_global ushort *)
kernel_integrator_state.path.queued_kernel;
ccl_global uint *d_shader_sort_key = (ccl_global uint *)
kernel_integrator_state.path.shader_sort_key;
ccl_global int *key_offsets = (ccl_global int *)
kernel_integrator_state.sort_partition_key_offsets;
gpu_parallel_sort_bucket_pass(num_states,
partition_size,
max_shaders,
kernel_index,
d_queued_kernel,
d_shader_sort_key,
key_offsets,
(threadgroup int *)threadgroup_array,
metal_local_id,
metal_local_size,
metal_grid_id);
#endif
}
ccl_gpu_kernel_threads(GPU_PARALLEL_SORT_BLOCK_SIZE)
ccl_gpu_kernel_signature(integrator_sort_write_pass,
int num_states,
int partition_size,
int num_states_limit,
ccl_global int *indices,
int kernel_index)
{
#if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
int max_shaders = context.launch_params_metal.data.max_shaders;
ccl_global ushort *d_queued_kernel = (ccl_global ushort *)
kernel_integrator_state.path.queued_kernel;
ccl_global uint *d_shader_sort_key = (ccl_global uint *)
kernel_integrator_state.path.shader_sort_key;
ccl_global int *key_offsets = (ccl_global int *)
kernel_integrator_state.sort_partition_key_offsets;
gpu_parallel_sort_write_pass(num_states,
partition_size,
max_shaders,
kernel_index,
num_states_limit,
indices,
d_queued_kernel,
d_shader_sort_key,
key_offsets,
(threadgroup int *)threadgroup_array,
metal_local_id,
metal_local_size,
metal_grid_id);
#endif
}
ccl_gpu_kernel_postfix
ccl_gpu_kernel_threads(GPU_PARALLEL_ACTIVE_INDEX_DEFAULT_BLOCK_SIZE)
ccl_gpu_kernel_signature(integrator_compact_paths_array,
int num_states,

View File

@@ -178,7 +178,7 @@ __device__
simd_lane_index, \
simd_group_index, \
num_simd_groups, \
simdgroup_offset)
(threadgroup int *)threadgroup_array)
#elif defined(__KERNEL_ONEAPI__)
# define gpu_parallel_active_index_array(num_states, indices, num_indices, is_active_op) \

View File

@@ -19,6 +19,115 @@ CCL_NAMESPACE_BEGIN
# define GPU_PARALLEL_SORTED_INDEX_DEFAULT_BLOCK_SIZE 512
#endif
#define GPU_PARALLEL_SORTED_INDEX_INACTIVE_KEY (~0)
#define GPU_PARALLEL_SORT_BLOCK_SIZE 1024
#if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
# define atomic_store_local(p, x) \
atomic_store_explicit((threadgroup atomic_int *)p, x, memory_order_relaxed)
# define atomic_load_local(p) \
atomic_load_explicit((threadgroup atomic_int *)p, memory_order_relaxed)
ccl_device_inline void gpu_parallel_sort_bucket_pass(const uint num_states,
const uint partition_size,
const uint max_shaders,
const uint queued_kernel,
ccl_global ushort *d_queued_kernel,
ccl_global uint *d_shader_sort_key,
ccl_global int *partition_key_offsets,
ccl_gpu_shared int *buckets,
const ushort local_id,
const ushort local_size,
const ushort grid_id)
{
/* Zero the bucket sizes. */
if (local_id < max_shaders) {
atomic_store_local(&buckets[local_id], 0);
}
ccl_gpu_syncthreads();
/* Determine bucket sizes within the partitions. */
const uint partition_start = partition_size * uint(grid_id);
const uint partition_end = min(num_states, partition_start + partition_size);
for (int state_index = partition_start + uint(local_id); state_index < partition_end;
state_index += uint(local_size)) {
ushort kernel_index = d_queued_kernel[state_index];
if (kernel_index == queued_kernel) {
uint key = d_shader_sort_key[state_index] % max_shaders;
atomic_fetch_and_add_uint32(&buckets[key], 1);
}
}
ccl_gpu_syncthreads();
/* Calculate the partition's local offsets from the prefix sum of bucket sizes. */
if (local_id == 0) {
int offset = 0;
for (int i = 0; i < max_shaders; i++) {
partition_key_offsets[i + uint(grid_id) * (max_shaders + 1)] = offset;
offset = offset + atomic_load_local(&buckets[i]);
}
/* Store the number of active states in this partition. */
partition_key_offsets[max_shaders + uint(grid_id) * (max_shaders + 1)] = offset;
}
}
ccl_device_inline void gpu_parallel_sort_write_pass(const uint num_states,
const uint partition_size,
const uint max_shaders,
const uint queued_kernel,
const int num_states_limit,
ccl_global int *indices,
ccl_global ushort *d_queued_kernel,
ccl_global uint *d_shader_sort_key,
ccl_global int *partition_key_offsets,
ccl_gpu_shared int *local_offset,
const ushort local_id,
const ushort local_size,
const ushort grid_id)
{
/* Calculate each partition's global offset from the prefix sum of the active state counts per
* partition. */
if (local_id < max_shaders) {
int partition_offset = 0;
for (int i = 0; i < uint(grid_id); i++) {
int partition_key_count = partition_key_offsets[max_shaders + uint(i) * (max_shaders + 1)];
partition_offset += partition_key_count;
}
ccl_global int *key_offsets = partition_key_offsets + (uint(grid_id) * (max_shaders + 1));
atomic_store_local(&local_offset[local_id], key_offsets[local_id] + partition_offset);
}
ccl_gpu_syncthreads();
/* Write the sorted active indices. */
const uint partition_start = partition_size * uint(grid_id);
const uint partition_end = min(num_states, partition_start + partition_size);
ccl_global int *key_offsets = partition_key_offsets + (uint(grid_id) * max_shaders);
for (int state_index = partition_start + uint(local_id); state_index < partition_end;
state_index += uint(local_size)) {
ushort kernel_index = d_queued_kernel[state_index];
if (kernel_index == queued_kernel) {
uint key = d_shader_sort_key[state_index] % max_shaders;
int index = atomic_fetch_and_add_uint32(&local_offset[key], 1);
if (index < num_states_limit) {
indices[index] = state_index;
}
}
}
}
#endif /* __KERNEL_LOCAL_ATOMIC_SORT__ */
template<typename GetKeyOp>
__device__ void gpu_parallel_sorted_index_array(const uint state_index,

View File

@@ -105,10 +105,11 @@ struct kernel_gpu_##name \
{ \
PARAMS_MAKER(__VA_ARGS__)(__VA_ARGS__) \
void run(thread MetalKernelContext& context, \
threadgroup int *simdgroup_offset, \
threadgroup atomic_int *threadgroup_array, \
const uint metal_global_id, \
const ushort metal_local_id, \
const ushort metal_local_size, \
const ushort metal_grid_id, \
uint simdgroup_size, \
uint simd_lane_index, \
uint simd_group_index, \
@@ -117,22 +118,24 @@ struct kernel_gpu_##name \
kernel void cycles_metal_##name(device const kernel_gpu_##name *params_struct, \
constant KernelParamsMetal &ccl_restrict _launch_params_metal, \
constant MetalAncillaries *_metal_ancillaries, \
threadgroup int *simdgroup_offset[[ threadgroup(0) ]], \
threadgroup atomic_int *threadgroup_array[[ threadgroup(0) ]], \
const uint metal_global_id [[thread_position_in_grid]], \
const ushort metal_local_id [[thread_position_in_threadgroup]], \
const ushort metal_local_size [[threads_per_threadgroup]], \
const ushort metal_grid_id [[threadgroup_position_in_grid]], \
uint simdgroup_size [[threads_per_simdgroup]], \
uint simd_lane_index [[thread_index_in_simdgroup]], \
uint simd_group_index [[simdgroup_index_in_threadgroup]], \
uint num_simd_groups [[simdgroups_per_threadgroup]]) { \
MetalKernelContext context(_launch_params_metal, _metal_ancillaries); \
params_struct->run(context, simdgroup_offset, metal_global_id, metal_local_id, metal_local_size, simdgroup_size, simd_lane_index, simd_group_index, num_simd_groups); \
params_struct->run(context, threadgroup_array, metal_global_id, metal_local_id, metal_local_size, metal_grid_id, simdgroup_size, simd_lane_index, simd_group_index, num_simd_groups); \
} \
void kernel_gpu_##name::run(thread MetalKernelContext& context, \
threadgroup int *simdgroup_offset, \
threadgroup atomic_int *threadgroup_array, \
const uint metal_global_id, \
const ushort metal_local_id, \
const ushort metal_local_size, \
const ushort metal_grid_id, \
uint simdgroup_size, \
uint simd_lane_index, \
uint simd_group_index, \

View File

@@ -132,6 +132,9 @@ typedef struct IntegratorStateGPU {
/* Index of main path which will be used by a next shadow catcher split. */
ccl_global int *next_main_path_index;
/* Partition/key offsets used when writing sorted active indices. */
ccl_global int *sort_partition_key_offsets;
/* Divisor used to partition active indices by locality when sorting by material. */
uint sort_partition_divisor;
} IntegratorStateGPU;

View File

@@ -115,6 +115,13 @@ ccl_device_forceinline void integrator_path_init_sorted(KernelGlobals kg,
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
INTEGRATOR_STATE_WRITE(state, path, shader_sort_key) = key_;
# if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
if (!kernel_integrator_state.sort_key_counter[next_kernel]) {
return;
}
# endif
atomic_fetch_and_add_uint32(&kernel_integrator_state.sort_key_counter[next_kernel][key_], 1);
}
@@ -130,6 +137,13 @@ ccl_device_forceinline void integrator_path_next_sorted(KernelGlobals kg,
atomic_fetch_and_add_uint32(&kernel_integrator_state.queue_counter->num_queued[next_kernel], 1);
INTEGRATOR_STATE_WRITE(state, path, queued_kernel) = next_kernel;
INTEGRATOR_STATE_WRITE(state, path, shader_sort_key) = key_;
# if defined(__KERNEL_LOCAL_ATOMIC_SORT__)
if (!kernel_integrator_state.sort_key_counter[next_kernel]) {
return;
}
# endif
atomic_fetch_and_add_uint32(&kernel_integrator_state.sort_key_counter[next_kernel][key_], 1);
}

View File

@@ -209,14 +209,7 @@ ccl_device void osl_closure_microfacet_setup(KernelGlobals kg,
if (closure->distribution == make_string("ggx", 11253504724482777663ull) ||
closure->distribution == make_string("default", 4430693559278735917ull)) {
if (!closure->refract) {
if (closure->alpha_x == closure->alpha_y) {
/* Isotropic */
sd->flag |= bsdf_microfacet_ggx_isotropic_setup(bsdf);
}
else {
/* Anisotropic */
sd->flag |= bsdf_microfacet_ggx_setup(bsdf);
}
sd->flag |= bsdf_microfacet_ggx_setup(bsdf);
}
else {
sd->flag |= bsdf_microfacet_ggx_refraction_setup(bsdf);
@@ -225,14 +218,7 @@ ccl_device void osl_closure_microfacet_setup(KernelGlobals kg,
/* Beckmann */
else {
if (!closure->refract) {
if (closure->alpha_x == closure->alpha_y) {
/* Isotropic */
sd->flag |= bsdf_microfacet_beckmann_isotropic_setup(bsdf);
}
else {
/* Anisotropic */
sd->flag |= bsdf_microfacet_beckmann_setup(bsdf);
}
sd->flag |= bsdf_microfacet_beckmann_setup(bsdf);
}
else {
sd->flag |= bsdf_microfacet_beckmann_refraction_setup(bsdf);
@@ -258,9 +244,9 @@ ccl_device void osl_closure_microfacet_ggx_setup(
}
bsdf->N = ensure_valid_reflection(sd->Ng, sd->wi, closure->N);
bsdf->alpha_x = closure->alpha_x;
bsdf->alpha_x = bsdf->alpha_y = closure->alpha_x;
sd->flag |= bsdf_microfacet_ggx_isotropic_setup(bsdf);
sd->flag |= bsdf_microfacet_ggx_setup(bsdf);
}
ccl_device void osl_closure_microfacet_ggx_aniso_setup(
@@ -652,9 +638,9 @@ ccl_device void osl_closure_microfacet_beckmann_setup(
}
bsdf->N = ensure_valid_reflection(sd->Ng, sd->wi, closure->N);
bsdf->alpha_x = closure->alpha_x;
bsdf->alpha_x = bsdf->alpha_y = closure->alpha_x;
sd->flag |= bsdf_microfacet_beckmann_isotropic_setup(bsdf);
sd->flag |= bsdf_microfacet_beckmann_setup(bsdf);
}
ccl_device void osl_closure_microfacet_beckmann_aniso_setup(

View File

@@ -74,7 +74,8 @@ CCL_NAMESPACE_BEGIN
#define __VOLUME__
/* TODO: solve internal compiler errors and enable light tree on HIP. */
#ifdef __KERNEL_HIP__
/* TODO: solve internal compiler perf issue and enable light tree on Metal/AMD. */
#if defined(__KERNEL_HIP__) || defined(__KERNEL_METAL_AMD__)
# undef __LIGHT_TREE__
#endif
@@ -1508,6 +1509,8 @@ typedef enum DeviceKernel : int {
DEVICE_KERNEL_INTEGRATOR_ACTIVE_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_TERMINATED_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_SORTED_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_SORT_BUCKET_PASS,
DEVICE_KERNEL_INTEGRATOR_SORT_WRITE_PASS,
DEVICE_KERNEL_INTEGRATOR_COMPACT_PATHS_ARRAY,
DEVICE_KERNEL_INTEGRATOR_COMPACT_STATES,
DEVICE_KERNEL_INTEGRATOR_TERMINATED_SHADOW_PATHS_ARRAY,

View File

@@ -73,16 +73,55 @@ ccl_device_inline float atomic_add_and_fetch_float(volatile ccl_global float *_s
return new_value.float_value;
}
# define atomic_fetch_and_add_uint32(p, x) \
atomic_fetch_add_explicit((device atomic_uint *)p, x, memory_order_relaxed)
# define atomic_fetch_and_sub_uint32(p, x) \
atomic_fetch_sub_explicit((device atomic_uint *)p, x, memory_order_relaxed)
# define atomic_fetch_and_inc_uint32(p) \
atomic_fetch_add_explicit((device atomic_uint *)p, 1, memory_order_relaxed)
# define atomic_fetch_and_dec_uint32(p) \
atomic_fetch_sub_explicit((device atomic_uint *)p, 1, memory_order_relaxed)
# define atomic_fetch_and_or_uint32(p, x) \
atomic_fetch_or_explicit((device atomic_uint *)p, x, memory_order_relaxed)
template<class T> ccl_device_inline uint32_t atomic_fetch_and_add_uint32(device T *p, int x)
{
return atomic_fetch_add_explicit((device atomic_uint *)p, x, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_sub_uint32(device T *p, int x)
{
return atomic_fetch_sub_explicit((device atomic_uint *)p, x, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_inc_uint32(device T *p)
{
return atomic_fetch_add_explicit((device atomic_uint *)p, 1, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_dec_uint32(device T *p)
{
return atomic_fetch_sub_explicit((device atomic_uint *)p, 1, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_or_uint32(device T *p, int x)
{
return atomic_fetch_or_explicit((device atomic_uint *)p, x, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_add_uint32(threadgroup T *p, int x)
{
return atomic_fetch_add_explicit((threadgroup atomic_uint *)p, x, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_sub_uint32(threadgroup T *p, int x)
{
return atomic_fetch_sub_explicit((threadgroup atomic_uint *)p, x, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_inc_uint32(threadgroup T *p)
{
return atomic_fetch_add_explicit((threadgroup atomic_uint *)p, 1, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_dec_uint32(threadgroup T *p)
{
return atomic_fetch_sub_explicit((threadgroup atomic_uint *)p, 1, memory_order_relaxed);
}
template<class T> ccl_device_inline uint32_t atomic_fetch_and_or_uint32(threadgroup T *p, int x)
{
return atomic_fetch_or_explicit((threadgroup atomic_uint *)p, x, memory_order_relaxed);
}
ccl_device_inline float atomic_compare_and_swap_float(volatile ccl_global float *dest,
const float old_val,

View File

@@ -69,6 +69,9 @@ void DebugFlags::Metal::reset()
{
if (getenv("CYCLES_METAL_ADAPTIVE_COMPILE") != NULL)
adaptive_compile = true;
if (auto str = getenv("CYCLES_METAL_LOCAL_ATOMIC_SORT"))
use_local_atomic_sort = (atoi(str) != 0);
}
DebugFlags::OptiX::OptiX()

View File

@@ -97,6 +97,9 @@ class DebugFlags {
/* Whether adaptive feature based runtime compile is enabled or not. */
bool adaptive_compile = false;
/* Whether local atomic sorting is enabled or not. */
bool use_local_atomic_sort = true;
};
/* Get instance of debug flags registry. */

View File

@@ -1201,6 +1201,7 @@ void GHOST_GetVulkanHandles(GHOST_ContextHandle context,
void *r_instance,
void *r_physical_device,
void *r_device,
void *r_compute_command_buffer,
uint32_t *r_graphic_queue_family);
/**

View File

@@ -40,7 +40,7 @@ class GHOST_IContext {
virtual unsigned int getDefaultFramebuffer() = 0;
virtual GHOST_TSuccess getVulkanHandles(void *, void *, void *, uint32_t *) = 0;
virtual GHOST_TSuccess getVulkanHandles(void *, void *, void *, void *, uint32_t *) = 0;
/**
* Gets the Vulkan framebuffer related resource handles associated with the Vulkan context.

View File

@@ -1203,10 +1203,12 @@ void GHOST_GetVulkanHandles(GHOST_ContextHandle contexthandle,
void *r_instance,
void *r_physical_device,
void *r_device,
void *r_compute_command_buffer,
uint32_t *r_graphic_queue_family)
{
GHOST_IContext *context = (GHOST_IContext *)contexthandle;
context->getVulkanHandles(r_instance, r_physical_device, r_device, r_graphic_queue_family);
context->getVulkanHandles(
r_instance, r_physical_device, r_device, r_compute_command_buffer, r_graphic_queue_family);
}
void GHOST_GetVulkanBackbuffer(GHOST_WindowHandle windowhandle,

View File

@@ -142,6 +142,7 @@ class GHOST_Context : public GHOST_IContext {
virtual GHOST_TSuccess getVulkanHandles(void * /*r_instance*/,
void * /*r_physical_device*/,
void * /*r_device*/,
void * /*r_compute_command_buffer*/,
uint32_t * /*r_graphic_queue_family*/) override
{
return GHOST_kFailure;

View File

@@ -192,6 +192,9 @@ GHOST_TSuccess GHOST_ContextVK::destroySwapchain()
if (m_render_pass != VK_NULL_HANDLE) {
vkDestroyRenderPass(m_device, m_render_pass, NULL);
}
if (m_compute_command_buffer != VK_NULL_HANDLE) {
vkFreeCommandBuffers(m_device, m_command_pool, 1, &m_compute_command_buffer);
}
for (auto command_buffer : m_command_buffers) {
vkFreeCommandBuffers(m_device, m_command_pool, 1, &command_buffer);
}
@@ -311,11 +314,13 @@ GHOST_TSuccess GHOST_ContextVK::getVulkanBackbuffer(void *image,
GHOST_TSuccess GHOST_ContextVK::getVulkanHandles(void *r_instance,
void *r_physical_device,
void *r_device,
void *r_compute_command_buffer,
uint32_t *r_graphic_queue_family)
{
*((VkInstance *)r_instance) = m_instance;
*((VkPhysicalDevice *)r_physical_device) = m_physical_device;
*((VkDevice *)r_device) = m_device;
*((VkCommandBuffer *)r_compute_command_buffer) = m_compute_command_buffer;
*r_graphic_queue_family = m_queue_family_graphic;
return GHOST_kSuccess;
@@ -619,16 +624,34 @@ static GHOST_TSuccess selectPresentMode(VkPhysicalDevice device,
return GHOST_kFailure;
}
GHOST_TSuccess GHOST_ContextVK::createCommandBuffers()
GHOST_TSuccess GHOST_ContextVK::createCommandPools()
{
m_command_buffers.resize(m_swapchain_image_views.size());
VkCommandPoolCreateInfo poolInfo = {};
poolInfo.sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO;
poolInfo.flags = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT;
poolInfo.queueFamilyIndex = m_queue_family_graphic;
VK_CHECK(vkCreateCommandPool(m_device, &poolInfo, NULL, &m_command_pool));
return GHOST_kSuccess;
}
GHOST_TSuccess GHOST_ContextVK::createComputeCommandBuffer()
{
assert(m_command_pool != VK_NULL_HANDLE);
VkCommandBufferAllocateInfo alloc_info = {};
alloc_info.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
alloc_info.commandPool = m_command_pool;
alloc_info.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
alloc_info.commandBufferCount = 1;
VK_CHECK(vkAllocateCommandBuffers(m_device, &alloc_info, &m_compute_command_buffer));
return GHOST_kSuccess;
}
GHOST_TSuccess GHOST_ContextVK::createGraphicsCommandBuffers()
{
assert(m_command_pool != VK_NULL_HANDLE);
m_command_buffers.resize(m_swapchain_image_views.size());
VkCommandBufferAllocateInfo alloc_info = {};
alloc_info.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
@@ -637,7 +660,6 @@ GHOST_TSuccess GHOST_ContextVK::createCommandBuffers()
alloc_info.commandBufferCount = static_cast<uint32_t>(m_command_buffers.size());
VK_CHECK(vkAllocateCommandBuffers(m_device, &alloc_info, m_command_buffers.data()));
return GHOST_kSuccess;
}
@@ -776,7 +798,7 @@ GHOST_TSuccess GHOST_ContextVK::createSwapchain()
VK_CHECK(vkCreateFence(m_device, &fence_info, NULL, &m_in_flight_fences[i]));
}
createCommandBuffers();
createGraphicsCommandBuffers();
return GHOST_kSuccess;
}
@@ -841,6 +863,13 @@ GHOST_TSuccess GHOST_ContextVK::initializeDrawingContext()
extensions_device.push_back(VK_KHR_SWAPCHAIN_EXTENSION_NAME);
}
extensions_device.push_back("VK_KHR_dedicated_allocation");
extensions_device.push_back("VK_KHR_get_memory_requirements2");
/* Enable MoltenVK required instance extensions.*/
#ifdef VK_MVK_MOLTENVK_EXTENSION_NAME
requireExtension(
extensions_available, extensions_enabled, "VK_KHR_get_physical_device_properties2");
#endif
VkApplicationInfo app_info = {};
app_info.sType = VK_STRUCTURE_TYPE_APPLICATION_INFO;
@@ -903,6 +932,15 @@ GHOST_TSuccess GHOST_ContextVK::initializeDrawingContext()
return GHOST_kFailure;
}
#ifdef VK_MVK_MOLTENVK_EXTENSION_NAME
/* According to the Vulkan specs, when `VK_KHR_portability_subset` is available it should be
* enabled. See
* https://vulkan.lunarg.com/doc/view/1.2.198.1/mac/1.2-extensions/vkspec.html#VUID-VkDeviceCreateInfo-pProperties-04451*/
if (device_extensions_support(m_physical_device, {VK_KHR_PORTABILITY_SUBSET_EXTENSION_NAME})) {
extensions_device.push_back(VK_KHR_PORTABILITY_SUBSET_EXTENSION_NAME);
}
#endif
vector<VkDeviceQueueCreateInfo> queue_create_infos;
{
@@ -960,6 +998,9 @@ GHOST_TSuccess GHOST_ContextVK::initializeDrawingContext()
VK_CHECK(vkCreateDevice(m_physical_device, &device_create_info, NULL, &m_device));
createCommandPools();
createComputeCommandBuffer();
vkGetDeviceQueue(m_device, m_queue_family_graphic, 0, &m_graphic_queue);
if (use_window_surface) {

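A hypothetical sketch of how such a compute command buffer is typically recorded and submitted (descriptor set binding omitted; the Vulkan backend drives this through its own GPU abstractions):

#include <vulkan/vulkan.h>

static void submit_compute(VkQueue queue,
                           VkCommandBuffer compute_command_buffer,
                           VkPipeline compute_pipeline)
{
  VkCommandBufferBeginInfo begin_info = {};
  begin_info.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
  vkBeginCommandBuffer(compute_command_buffer, &begin_info);
  vkCmdBindPipeline(
      compute_command_buffer, VK_PIPELINE_BIND_POINT_COMPUTE, compute_pipeline);
  vkCmdDispatch(compute_command_buffer, 1, 1, 1); /* A single workgroup. */
  vkEndCommandBuffer(compute_command_buffer);

  VkSubmitInfo submit_info = {};
  submit_info.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
  submit_info.commandBufferCount = 1;
  submit_info.pCommandBuffers = &compute_command_buffer;
  vkQueueSubmit(queue, 1, &submit_info, VK_NULL_HANDLE);
  vkQueueWaitIdle(queue); /* Simplest possible synchronization. */
}
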
View File

@@ -113,6 +113,7 @@ class GHOST_ContextVK : public GHOST_Context {
GHOST_TSuccess getVulkanHandles(void *r_instance,
void *r_physical_device,
void *r_device,
void *r_compute_command_buffer,
uint32_t *r_graphic_queue_family);
/**
* Gets the Vulkan framebuffer related resource handles associated with the Vulkan context.
@@ -182,6 +183,7 @@ class GHOST_ContextVK : public GHOST_Context {
std::vector<VkImage> m_swapchain_images;
std::vector<VkImageView> m_swapchain_image_views;
std::vector<VkFramebuffer> m_swapchain_framebuffers;
VkCommandBuffer m_compute_command_buffer;
std::vector<VkCommandBuffer> m_command_buffers;
VkRenderPass m_render_pass;
VkExtent2D m_render_extent;
@@ -200,6 +202,8 @@ class GHOST_ContextVK : public GHOST_Context {
GHOST_TSuccess pickPhysicalDevice(std::vector<const char *> required_exts);
GHOST_TSuccess createSwapchain();
GHOST_TSuccess destroySwapchain();
GHOST_TSuccess createCommandBuffers();
GHOST_TSuccess createCommandPools();
GHOST_TSuccess createGraphicsCommandBuffers();
GHOST_TSuccess createComputeCommandBuffer();
GHOST_TSuccess recordCommandBuffers();
};

View File

@@ -82,6 +82,8 @@
#include "CLG_log.h"
#ifdef USE_EVENT_BACKGROUND_THREAD
# include "GHOST_TimerTask.h"
# include <pthread.h>
#endif
@@ -768,7 +770,12 @@ struct GWL_Seat {
int32_t rate = 0;
/** Time (milliseconds) after which to start repeating keys. */
int32_t delay = 0;
/** Timer for key repeats. */
/**
* Timer for key repeats.
*
* \note For as long as #USE_EVENT_BACKGROUND_THREAD is defined, any access to this
* (including null checks) must lock `timer_mutex` first.
*/
GHOST_ITimerTask *timer = nullptr;
} key_repeat;
@@ -832,6 +839,42 @@ static bool gwl_seat_key_depressed_suppress_warning(const GWL_Seat *seat)
return suppress_warning;
}
/**
* \note Caller must lock `timer_mutex`.
*/
static void gwl_seat_key_repeat_timer_add(GWL_Seat *seat,
GHOST_TimerProcPtr key_repeat_fn,
GHOST_TUserDataPtr payload,
const bool use_delay)
{
GHOST_SystemWayland *system = seat->system;
const uint64_t time_step = 1000 / seat->key_repeat.rate;
const uint64_t time_start = use_delay ? seat->key_repeat.delay : time_step;
#ifdef USE_EVENT_BACKGROUND_THREAD
GHOST_TimerTask *timer = new GHOST_TimerTask(
system->getMilliSeconds() + time_start, time_step, key_repeat_fn, payload);
seat->key_repeat.timer = timer;
system->ghost_timer_manager()->addTimer(timer);
#else
seat->key_repeat.timer = system->installTimer(time_start, time_step, key_repeat_fn, payload);
#endif
}
/**
* \note The caller must lock `timer_mutex`.
*/
static void gwl_seat_key_repeat_timer_remove(GWL_Seat *seat)
{
GHOST_SystemWayland *system = seat->system;
#ifdef USE_EVENT_BACKGROUND_THREAD
system->ghost_timer_manager()->removeTimer(
static_cast<GHOST_TimerTask *>(seat->key_repeat.timer));
#else
system->removeTimer(seat->key_repeat.timer);
#endif
seat->key_repeat.timer = nullptr;
}
/** \} */
/* -------------------------------------------------------------------- */
@@ -906,6 +949,16 @@ struct GWL_Display {
/** Guard against multiple threads accessing `events_pending` at once. */
std::mutex events_pending_mutex;
/**
* A separate timer queue, needed so the WAYLAND thread can lock access.
* Using the system's #GHOST_System::getTimerManager is not thread safe because
* access to the timer outside of WAYLAND specific logic will not lock.
*
* Needed because #GHOST_System::dispatchEvents fires timers
* outside of WAYLAND (without locking the `timer_mutex`).
*/
GHOST_TimerManager *ghost_timer_manager;
#endif /* USE_EVENT_BACKGROUND_THREAD */
};
@@ -922,6 +975,9 @@ static void gwl_display_destroy(GWL_Display *display)
ghost_wl_display_lock_without_input(display->wl_display, display->system->server_mutex);
display->events_pthread_is_active = false;
}
delete display->ghost_timer_manager;
display->ghost_timer_manager = nullptr;
#endif
/* For typical WAYLAND use this will always be set.
@@ -3718,9 +3774,14 @@ static void keyboard_handle_leave(void *data,
GWL_Seat *seat = static_cast<GWL_Seat *>(data);
seat->keyboard.wl_surface_window = nullptr;
/* Losing focus must stop repeating text. */
if (seat->key_repeat.timer) {
keyboard_handle_key_repeat_cancel(seat);
{
#ifdef USE_EVENT_BACKGROUND_THREAD
std::lock_guard lock_timer_guard{*seat->system->timer_mutex};
#endif
/* Losing focus must stop repeating text. */
if (seat->key_repeat.timer) {
keyboard_handle_key_repeat_cancel(seat);
}
}
#ifdef USE_GNOME_KEYBOARD_SUPPRESS_WARNING
@@ -3780,36 +3841,32 @@ static xkb_keysym_t xkb_state_key_get_one_sym_without_modifiers(
return sym;
}
/**
* \note Caller must lock `timer_mutex`.
*/
static void keyboard_handle_key_repeat_cancel(GWL_Seat *seat)
{
#ifdef USE_EVENT_BACKGROUND_THREAD
std::lock_guard lock_timer_guard{*seat->system->timer_mutex};
#endif
GHOST_ASSERT(seat->key_repeat.timer != nullptr, "Caller must check for timer");
delete static_cast<GWL_KeyRepeatPlayload *>(seat->key_repeat.timer->getUserData());
seat->system->removeTimer(seat->key_repeat.timer);
seat->key_repeat.timer = nullptr;
gwl_seat_key_repeat_timer_remove(seat);
}
/**
* Restart the key-repeat timer.
* \param use_delay: When false, use the interval
* (prevents pause when the setting changes while the key is held).
*
* \note Caller must lock `timer_mutex`.
*/
static void keyboard_handle_key_repeat_reset(GWL_Seat *seat, const bool use_delay)
{
#ifdef USE_EVENT_BACKGROUND_THREAD
std::lock_guard lock_timer_guard{*seat->system->timer_mutex};
#endif
GHOST_ASSERT(seat->key_repeat.timer != nullptr, "Caller must check for timer");
GHOST_SystemWayland *system = seat->system;
GHOST_ITimerTask *timer = seat->key_repeat.timer;
GHOST_TimerProcPtr key_repeat_fn = timer->getTimerProc();
GHOST_TimerProcPtr key_repeat_fn = seat->key_repeat.timer->getTimerProc();
GHOST_TUserDataPtr payload = seat->key_repeat.timer->getUserData();
seat->system->removeTimer(seat->key_repeat.timer);
const uint64_t time_step = 1000 / seat->key_repeat.rate;
const uint64_t time_start = use_delay ? seat->key_repeat.delay : time_step;
seat->key_repeat.timer = system->installTimer(time_start, time_step, key_repeat_fn, payload);
gwl_seat_key_repeat_timer_remove(seat);
gwl_seat_key_repeat_timer_add(seat, key_repeat_fn, payload, use_delay);
}
static void keyboard_handle_key(void *data,
@@ -3848,6 +3905,11 @@ static void keyboard_handle_key(void *data,
break;
}
#ifdef USE_EVENT_BACKGROUND_THREAD
/* Any access to `seat->key_repeat.timer` must lock. */
std::lock_guard lock_timer_guard{*seat->system->timer_mutex};
#endif
struct GWL_KeyRepeatPlayload *key_repeat_payload = nullptr;
/* Delete previous timer. */
@@ -3886,23 +3948,14 @@ static void keyboard_handle_key(void *data,
break;
}
case RESET: {
#ifdef USE_EVENT_BACKGROUND_THREAD
std::lock_guard lock_timer_guard{*seat->system->timer_mutex};
#endif
/* The payload will be added again. */
seat->system->removeTimer(seat->key_repeat.timer);
seat->key_repeat.timer = nullptr;
gwl_seat_key_repeat_timer_remove(seat);
break;
}
case CANCEL: {
#ifdef USE_EVENT_BACKGROUND_THREAD
std::lock_guard lock_timer_guard{*seat->system->timer_mutex};
#endif
delete key_repeat_payload;
key_repeat_payload = nullptr;
seat->system->removeTimer(seat->key_repeat.timer);
seat->key_repeat.timer = nullptr;
gwl_seat_key_repeat_timer_remove(seat);
break;
}
}
@@ -3956,8 +4009,8 @@ static void keyboard_handle_key(void *data,
utf8_buf));
}
};
seat->key_repeat.timer = seat->system->installTimer(
seat->key_repeat.delay, 1000 / seat->key_repeat.rate, key_repeat_fn, key_repeat_payload);
gwl_seat_key_repeat_timer_add(seat, key_repeat_fn, key_repeat_payload, true);
}
}
@@ -3982,8 +4035,13 @@ static void keyboard_handle_modifiers(void *data,
/* A modifier changed so reset the timer,
* see comment in #keyboard_handle_key regarding this behavior. */
if (seat->key_repeat.timer) {
keyboard_handle_key_repeat_reset(seat, true);
{
#ifdef USE_EVENT_BACKGROUND_THREAD
std::lock_guard lock_timer_guard{*seat->system->timer_mutex};
#endif
if (seat->key_repeat.timer) {
keyboard_handle_key_repeat_reset(seat, true);
}
}
#ifdef USE_GNOME_KEYBOARD_SUPPRESS_WARNING
@@ -4002,9 +4060,14 @@ static void keyboard_repeat_handle_info(void *data,
seat->key_repeat.rate = rate;
seat->key_repeat.delay = delay;
/* It's unlikely, but possible, that this setting changes while repeating. */
if (seat->key_repeat.timer) {
keyboard_handle_key_repeat_reset(seat, false);
{
#ifdef USE_EVENT_BACKGROUND_THREAD
std::lock_guard lock_timer_guard{*seat->system->timer_mutex};
#endif
/* It's unlikely, but possible, that this setting changes while repeating. */
if (seat->key_repeat.timer) {
keyboard_handle_key_repeat_reset(seat, false);
}
}
}
@@ -4275,8 +4338,14 @@ static void gwl_seat_capability_keyboard_disable(GWL_Seat *seat)
if (!seat->wl_keyboard) {
return;
}
if (seat->key_repeat.timer) {
keyboard_handle_key_repeat_cancel(seat);
{
#ifdef USE_EVENT_BACKGROUND_THREAD
std::lock_guard lock_timer_guard{*seat->system->timer_mutex};
#endif
if (seat->key_repeat.timer) {
keyboard_handle_key_repeat_cancel(seat);
}
}
wl_keyboard_destroy(seat->wl_keyboard);
seat->wl_keyboard = nullptr;
@@ -5411,6 +5480,8 @@ GHOST_SystemWayland::GHOST_SystemWayland(bool background)
#ifdef USE_EVENT_BACKGROUND_THREAD
gwl_display_event_thread_create(display_);
display_->ghost_timer_manager = new GHOST_TimerManager();
#endif
}
@@ -5491,10 +5562,16 @@ bool GHOST_SystemWayland::processEvents(bool waitForEvent)
#endif /* USE_EVENT_BACKGROUND_THREAD */
{
const uint64_t now = getMilliSeconds();
#ifdef USE_EVENT_BACKGROUND_THREAD
std::lock_guard lock_timer_guard{*display_->system->timer_mutex};
{
std::lock_guard lock_timer_guard{*display_->system->timer_mutex};
if (ghost_timer_manager()->fireTimers(now)) {
any_processed = true;
}
}
#endif
if (getTimerManager()->fireTimers(getMilliSeconds())) {
if (getTimerManager()->fireTimers(now)) {
any_processed = true;
}
}
@@ -6717,6 +6794,13 @@ struct wl_shm *GHOST_SystemWayland::wl_shm() const
return display_->wl_shm;
}
#ifdef USE_EVENT_BACKGROUND_THREAD
GHOST_TimerManager *GHOST_SystemWayland::ghost_timer_manager()
{
return display_->ghost_timer_manager;
}
#endif
/** \} */
/* -------------------------------------------------------------------- */

View File

@@ -165,6 +165,16 @@ class GHOST_SystemWayland : public GHOST_System {
bool cursor_grab_use_software_display_get(const GHOST_TGrabCursorMode mode);
#ifdef USE_EVENT_BACKGROUND_THREAD
/**
* Return a separate WAYLAND local timer manager to #GHOST_System::getTimerManager
* Manipulation & access must lock with #GHOST_SystemWayland::timer_mutex.
*
* See #GWL_Display::ghost_timer_manager doc-string for details on why this is needed.
*/
GHOST_TimerManager *ghost_timer_manager();
#endif
/* WAYLAND direct-data access. */
struct wl_display *wl_display();
@@ -233,7 +243,14 @@ class GHOST_SystemWayland : public GHOST_System {
* from running at the same time. */
std::mutex *server_mutex = nullptr;
/** Threads must lock this before manipulating timers. */
/**
* Threads must lock this before manipulating #GWL_Display::ghost_timer_manager.
*
* \note Using a lock separate from `server_mutex` is necessary because the
* server lock is already held when calling `ghost_wl_display_event_pump`.
* If manipulating the timer used the `server_mutex`, the event pump could
* indirectly handle key up/down events which would lock `server_mutex` again,
* causing a dead-lock.
*/
std::mutex *timer_mutex = nullptr;
std::thread::id main_thread_id;

View File

@@ -140,7 +140,7 @@ class DATA_PT_EEVEE_light_distance(DataButtonsPanel, Panel):
class DATA_PT_EEVEE_shadow(DataButtonsPanel, Panel):
bl_label = "Shadow"
bl_options = {'DEFAULT_CLOSED'}
COMPAT_ENGINES = {'BLENDER_EEVEE', 'BLENDER_EEVEE_NEXT'}
COMPAT_ENGINES = {'BLENDER_EEVEE'}
@classmethod
def poll(cls, context):
@@ -168,8 +168,7 @@ class DATA_PT_EEVEE_shadow(DataButtonsPanel, Panel):
if light.type != 'SUN':
sub.prop(light, "shadow_buffer_clip_start", text="Clip Start")
if context.engine != 'BLENDER_EEVEE_NEXT':
col.prop(light, "shadow_buffer_bias", text="Bias")
col.prop(light, "shadow_buffer_bias", text="Bias")
class DATA_PT_EEVEE_shadow_cascaded_shadow_map(DataButtonsPanel, Panel):

View File

@@ -460,27 +460,6 @@ class RENDER_PT_eevee_shadows(RenderButtonsPanel, Panel):
col.prop(props, "light_threshold")
class RENDER_PT_eevee_next_shadows(RenderButtonsPanel, Panel):
bl_label = "Shadows"
bl_options = {'DEFAULT_CLOSED'}
COMPAT_ENGINES = {'BLENDER_EEVEE_NEXT'}
@classmethod
def poll(cls, context):
return (context.engine in cls.COMPAT_ENGINES)
def draw(self, context):
layout = self.layout
layout.use_property_split = True
scene = context.scene
props = scene.eevee
col = layout.column()
col.prop(props, "shadow_pool_size", text="Pool Size")
col.prop(props, "light_threshold")
class RENDER_PT_eevee_sampling(RenderButtonsPanel, Panel):
bl_label = "Sampling"
COMPAT_ENGINES = {'BLENDER_EEVEE'}
@@ -829,10 +808,6 @@ class RENDER_PT_simplify_viewport(RenderButtonsPanel, Panel):
col = flow.column()
col.prop(rd, "simplify_volumes", text="Volume Resolution")
if context.engine in 'BLENDER_EEVEE_NEXT':
col = flow.column()
col.prop(rd, "simplify_shadows", text="Shadow Resolution")
class RENDER_PT_simplify_render(RenderButtonsPanel, Panel):
bl_label = "Render"
@@ -860,10 +835,6 @@ class RENDER_PT_simplify_render(RenderButtonsPanel, Panel):
col = flow.column()
col.prop(rd, "simplify_child_particles_render", text="Max Child Particles")
if context.engine in 'BLENDER_EEVEE_NEXT':
col = flow.column()
col.prop(rd, "simplify_shadows_render", text="Shadow Resolution")
class RENDER_PT_simplify_greasepencil(RenderButtonsPanel, Panel, GreasePencilSimplifyPanel):
bl_label = "Grease Pencil"
@@ -898,7 +869,6 @@ classes = (
RENDER_PT_eevee_performance,
RENDER_PT_eevee_hair,
RENDER_PT_eevee_shadows,
RENDER_PT_eevee_next_shadows,
RENDER_PT_eevee_indirect_lighting,
RENDER_PT_eevee_indirect_lighting_display,
RENDER_PT_eevee_film,

View File

@@ -36,7 +36,7 @@ void BKE_mesh_legacy_convert_uvs_to_generic(Mesh *mesh);
* Move face sets to the legacy type from a generic type.
*/
void BKE_mesh_legacy_face_set_from_generic(
Mesh *mesh, blender::MutableSpan<CustomDataLayer> poly_layers_to_write);
blender::MutableSpan<CustomDataLayer> poly_layers_to_write);
/**
* Copy face sets to the generic data type from the legacy type.
*/

View File

@@ -133,19 +133,19 @@ void BKE_nlastrips_sort_strips(ListBase *strips);
/**
* Add the given NLA-Strip to the given list of strips, assuming that it
* isn't currently a member of another list, NULL, or conflicting with existing
* strips' positions.
* isn't currently a member of another list, NULL, or conflicting with existing
* strips' positions.
*/
void BKE_nlastrips_add_strip_unsafe(ListBase *strips, struct NlaStrip *strip);
/**
* @brief NULL checks incoming strip and verifies no overlap / invalid
* configuration against other strips in NLA Track.
* \brief NULL checks incoming strip and verifies no overlap / invalid
* configuration against other strips in NLA Track.
*
* @param strips
* @param strip
* @return true
* @return false
* \param strips:
* \param strip:
* \return true
* \return false
*/
bool BKE_nlastrips_add_strip(ListBase *strips, struct NlaStrip *strip);

View File

@@ -167,6 +167,7 @@ typedef enum {
PBVH_TopologyUpdated = 1 << 17, /* Used internally by pbvh_bmesh.c */
} PBVHNodeFlags;
ENUM_OPERATORS(PBVHNodeFlags, PBVH_TopologyUpdated);
typedef struct PBVHFrustumPlanes {
float (*planes)[4];

View File

@@ -248,9 +248,9 @@ set(SRC
intern/particle_child.c
intern/particle_distribute.c
intern/particle_system.c
intern/pbvh.c
intern/pbvh.cc
intern/pbvh_colors.cc
intern/pbvh_bmesh.c
intern/pbvh_bmesh.cc
intern/pbvh_pixels.cc
intern/pbvh_uv_islands.cc
intern/pointcache.c
@@ -505,7 +505,7 @@ set(SRC
intern/multires_reshape.hh
intern/multires_unsubdivide.h
intern/ocean_intern.h
intern/pbvh_intern.h
intern/pbvh_intern.hh
intern/pbvh_uv_islands.hh
intern/subdiv_converter.h
intern/subdiv_inline.h

View File

@@ -131,7 +131,7 @@ static void cdDM_recalc_looptri(DerivedMesh *dm)
BLI_assert(cddm->dm.looptris.array == NULL);
atomic_cas_ptr(
(void **)&cddm->dm.looptris.array, cddm->dm.looptris.array, cddm->dm.looptris.array_wip);
cddm->dm.looptris.array_wip = NULL;
cddm->dm.looptris.array_wip = nullptr;
}
static void cdDM_free_internal(CDDerivedMesh *cddm)
@@ -232,7 +232,7 @@ static DerivedMesh *cdDM_from_mesh_ex(Mesh *mesh,
#if 0
cddm->mface = CustomData_get_layer(&dm->faceData, CD_MFACE);
#else
cddm->mface = NULL;
cddm->mface = nullptr;
#endif
/* commented since even when CD_ORIGINDEX was first added this line fails

View File

@@ -1273,6 +1273,10 @@ bGPDframe *BKE_gpencil_layer_frame_get(bGPDlayer *gpl, int cframe, eGP_GetFrame_
gpl->actframe = gpf;
}
else if (addnew == GP_GETFRAME_ADD_COPY) {
/* The frame_addcopy function copies the active frame of gpl,
 * so we need to set the active frame before copying. */
gpl->actframe = gpf;
gpl->actframe = BKE_gpencil_frame_addcopy(gpl, cframe);
}
else {
@@ -1300,6 +1304,10 @@ bGPDframe *BKE_gpencil_layer_frame_get(bGPDlayer *gpl, int cframe, eGP_GetFrame_
gpl->actframe = gpf;
}
else if (addnew == GP_GETFRAME_ADD_COPY) {
/* The frame_addcopy function copies the active frame of gpl,
 * so we need to set the active frame before copying. */
gpl->actframe = gpf;
gpl->actframe = BKE_gpencil_frame_addcopy(gpl, cframe);
}
else {

View File

@@ -270,7 +270,6 @@ static void mesh_blend_write(BlendWriter *writer, ID *id, const void *id_address
BKE_mesh_legacy_convert_selection_layers_to_flags(mesh);
BKE_mesh_legacy_convert_material_indices_to_mpoly(mesh);
BKE_mesh_legacy_bevel_weight_from_layers(mesh);
BKE_mesh_legacy_face_set_from_generic(mesh, poly_layers);
BKE_mesh_legacy_edge_crease_from_layers(mesh);
BKE_mesh_legacy_sharp_edges_to_flags(mesh);
BKE_mesh_legacy_attribute_strings_to_flags(mesh);
@@ -292,6 +291,7 @@ static void mesh_blend_write(BlendWriter *writer, ID *id, const void *id_address
if (!BLO_write_is_undo(writer)) {
BKE_mesh_legacy_convert_uvs_to_struct(mesh, temp_arrays_for_legacy_format, loop_layers);
BKE_mesh_legacy_face_set_from_generic(poly_layers);
}
}

View File

@@ -1224,23 +1224,26 @@ void BKE_mesh_tessface_ensure(struct Mesh *mesh)
/** \name Face Set Conversion
* \{ */
void BKE_mesh_legacy_face_set_from_generic(Mesh *mesh,
blender::MutableSpan<CustomDataLayer> poly_layers)
void BKE_mesh_legacy_face_set_from_generic(blender::MutableSpan<CustomDataLayer> poly_layers)
{
using namespace blender;
void *faceset_data = nullptr;
bool changed = false;
for (CustomDataLayer &layer : poly_layers) {
if (StringRef(layer.name) == ".sculpt_face_set") {
faceset_data = layer.data;
layer.data = nullptr;
CustomData_free_layer_named(&mesh->pdata, ".sculpt_face_set", mesh->totpoly);
layer.type = CD_SCULPT_FACE_SETS;
layer.name[0] = '\0';
changed = true;
break;
}
}
if (faceset_data != nullptr) {
CustomData_add_layer(
&mesh->pdata, CD_SCULPT_FACE_SETS, CD_ASSIGN, faceset_data, mesh->totpoly);
if (!changed) {
return;
}
/* #CustomData expects the layers to be sorted in increasing order based on type. */
std::stable_sort(
poly_layers.begin(),
poly_layers.end(),
[](const CustomDataLayer &a, const CustomDataLayer &b) { return a.type < b.type; });
}
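/* Illustrative sketch (assumption, not part of the patch): the sort above
 * restores the #CustomData invariant that layers are ordered by increasing
 * type, which the in-place type change can break. The invariant, written as
 * an assert:
 *
 *   BLI_assert(std::is_sorted(
 *       poly_layers.begin(), poly_layers.end(),
 *       [](const CustomDataLayer &a, const CustomDataLayer &b) {
 *         return a.type < b.type;
 *       }));
 */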
void BKE_mesh_legacy_face_set_to_generic(Mesh *mesh)

View File

@@ -204,7 +204,7 @@ Mesh *BKE_mesh_mirror_apply_mirror_on_axis_for_modifier(MirrorModifierData *mmd,
/* Subdivision-surface for eg won't have mesh data in the custom-data arrays.
* Now add position/#MEdge/#MPoly layers. */
if (BKE_mesh_vert_positions(mesh) != NULL) {
if (BKE_mesh_vert_positions(mesh) != nullptr) {
memcpy(BKE_mesh_vert_positions_for_write(result),
BKE_mesh_vert_positions(mesh),
sizeof(float[3]) * mesh->totvert);

View File

@@ -787,7 +787,6 @@ void BKE_nlastrips_add_strip_unsafe(ListBase *strips, NlaStrip *strip)
}
}
/** NULL and Space check before adding in nlastrip */
bool BKE_nlastrips_add_strip(ListBase *strips, NlaStrip *strip)
{
if (ELEM(NULL, strips, strip)) {

View File

@@ -50,7 +50,7 @@ TEST(nla_strip, BKE_nlastrips_add_strip)
NlaStrip strip2{};
strip2.start = 5;
strip2.end = 10;
/* can't add a null NLA strip to an NLA Track. */
EXPECT_FALSE(BKE_nlastrips_add_strip(&strips, NULL));

View File

@@ -20,7 +20,7 @@
#include "DRW_pbvh.h"
#include "bmesh.h"
#include "pbvh_intern.h"
#include "pbvh_intern.hh"
/* Avoid skinny faces */
#define USE_EDGEQUEUE_EVEN_SUBDIV
@@ -106,9 +106,9 @@ static void pbvh_bmesh_verify(PBVH *pbvh);
static void bm_edges_from_tri(BMesh *bm, BMVert *v_tri[3], BMEdge *e_tri[3])
{
e_tri[0] = BM_edge_create(bm, v_tri[0], v_tri[1], NULL, BM_CREATE_NO_DOUBLE);
e_tri[1] = BM_edge_create(bm, v_tri[1], v_tri[2], NULL, BM_CREATE_NO_DOUBLE);
e_tri[2] = BM_edge_create(bm, v_tri[2], v_tri[0], NULL, BM_CREATE_NO_DOUBLE);
e_tri[0] = BM_edge_create(bm, v_tri[0], v_tri[1], nullptr, BM_CREATE_NO_DOUBLE);
e_tri[1] = BM_edge_create(bm, v_tri[1], v_tri[2], nullptr, BM_CREATE_NO_DOUBLE);
e_tri[2] = BM_edge_create(bm, v_tri[2], v_tri[0], nullptr, BM_CREATE_NO_DOUBLE);
}
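/* Illustrative note: most changes in this file follow one mechanical pattern
 * of the C-to-C++ port (`key` below is a hypothetical name):
 *
 *   void *key = BLI_gsetIterator_getKey(&gs_iter);
 *   BMFace *f = key;                         // Implicit in C, an error in C++.
 *   BMFace *f = static_cast<BMFace *>(key);  // Explicit cast required in C++.
 *
 * Likewise NULL becomes nullptr, a keyword with its own type that cannot be
 * silently confused with the integer 0.
 */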
BLI_INLINE void bm_face_as_array_index_tri(BMFace *f, int r_index[3])
@@ -154,7 +154,7 @@ static BMFace *bm_face_exists_tri_from_loop_vert(BMLoop *l_radial_first, BMVert
}
} while ((l_radial_iter = l_radial_iter->radial_next) != l_radial_first);
}
return NULL;
return nullptr;
}
/**
@@ -165,13 +165,13 @@ static BMVert *bm_vert_hash_lookup_chain(GHash *deleted_verts, BMVert *v)
{
while (true) {
BMVert **v_next_p = (BMVert **)BLI_ghash_lookup_p(deleted_verts, v);
if (v_next_p == NULL) {
if (v_next_p == nullptr) {
/* Not remapped. */
return v;
}
if (*v_next_p == NULL) {
if (*v_next_p == nullptr) {
/* removed and not remapped */
return NULL;
return nullptr;
}
/* remapped */
@@ -200,7 +200,7 @@ static void pbvh_bmesh_node_finalize(PBVH *pbvh,
BB_reset(&n->vb);
GSET_ITER (gs_iter, n->bm_faces) {
BMFace *f = BLI_gsetIterator_getKey(&gs_iter);
BMFace *f = static_cast<BMFace *>(BLI_gsetIterator_getKey(&gs_iter));
/* Update ownership of faces */
BM_ELEM_CD_SET_INT(f, cd_face_node_offset, node_index);
@@ -259,7 +259,7 @@ static void pbvh_bmesh_node_split(PBVH *pbvh, const BBC *bbc_array, int node_ind
BB_reset(&cb);
GSetIterator gs_iter;
GSET_ITER (gs_iter, n->bm_faces) {
const BMFace *f = BLI_gsetIterator_getKey(&gs_iter);
const BMFace *f = static_cast<const BMFace *>(BLI_gsetIterator_getKey(&gs_iter));
const BBC *bbc = &bbc_array[BM_elem_index_get(f)];
BB_expand(&cb, bbc->bcentroid);
@@ -286,7 +286,7 @@ static void pbvh_bmesh_node_split(PBVH *pbvh, const BBC *bbc_array, int node_ind
/* Partition the parent node's faces between the two children */
GSET_ITER (gs_iter, n->bm_faces) {
BMFace *f = BLI_gsetIterator_getKey(&gs_iter);
BMFace *f = static_cast<BMFace *>(BLI_gsetIterator_getKey(&gs_iter));
const BBC *bbc = &bbc_array[BM_elem_index_get(f)];
if (bbc->bcentroid[axis] < mid) {
@@ -298,7 +298,7 @@ static void pbvh_bmesh_node_split(PBVH *pbvh, const BBC *bbc_array, int node_ind
}
/* Enforce at least one primitive in each node */
GSet *empty = NULL, *other;
GSet *empty = nullptr, *other;
if (BLI_gset_len(c1->bm_faces) == 0) {
empty = c1->bm_faces;
other = c2->bm_faces;
@@ -311,7 +311,7 @@ static void pbvh_bmesh_node_split(PBVH *pbvh, const BBC *bbc_array, int node_ind
GSET_ITER (gs_iter, other) {
void *key = BLI_gsetIterator_getKey(&gs_iter);
BLI_gset_insert(empty, key);
BLI_gset_remove(other, key, NULL);
BLI_gset_remove(other, key, nullptr);
break;
}
}
@@ -321,31 +321,31 @@ static void pbvh_bmesh_node_split(PBVH *pbvh, const BBC *bbc_array, int node_ind
/* Mark this node's unique verts as unclaimed */
if (n->bm_unique_verts) {
GSET_ITER (gs_iter, n->bm_unique_verts) {
BMVert *v = BLI_gsetIterator_getKey(&gs_iter);
BMVert *v = static_cast<BMVert *>(BLI_gsetIterator_getKey(&gs_iter));
BM_ELEM_CD_SET_INT(v, cd_vert_node_offset, DYNTOPO_NODE_NONE);
}
BLI_gset_free(n->bm_unique_verts, NULL);
BLI_gset_free(n->bm_unique_verts, nullptr);
}
/* Unclaim faces */
GSET_ITER (gs_iter, n->bm_faces) {
BMFace *f = BLI_gsetIterator_getKey(&gs_iter);
BMFace *f = static_cast<BMFace *>(BLI_gsetIterator_getKey(&gs_iter));
BM_ELEM_CD_SET_INT(f, cd_face_node_offset, DYNTOPO_NODE_NONE);
}
BLI_gset_free(n->bm_faces, NULL);
BLI_gset_free(n->bm_faces, nullptr);
if (n->bm_other_verts) {
BLI_gset_free(n->bm_other_verts, NULL);
BLI_gset_free(n->bm_other_verts, nullptr);
}
if (n->layer_disp) {
MEM_freeN(n->layer_disp);
}
n->bm_faces = NULL;
n->bm_unique_verts = NULL;
n->bm_other_verts = NULL;
n->layer_disp = NULL;
n->bm_faces = nullptr;
n->bm_unique_verts = nullptr;
n->bm_other_verts = nullptr;
n->layer_disp = nullptr;
if (n->draw_batches) {
DRW_pbvh_node_free(n->draw_batches);
@@ -380,12 +380,12 @@ static bool pbvh_bmesh_node_limit_ensure(PBVH *pbvh, int node_index)
pbvh->draw_cache_invalid = true;
/* For each BMFace, store the AABB and AABB centroid */
BBC *bbc_array = MEM_mallocN(sizeof(BBC) * bm_faces_size, "BBC");
BBC *bbc_array = static_cast<BBC *>(MEM_mallocN(sizeof(BBC) * bm_faces_size, "BBC"));
GSetIterator gs_iter;
int i;
GSET_ITER_INDEX (gs_iter, bm_faces, i) {
BMFace *f = BLI_gsetIterator_getKey(&gs_iter);
BMFace *f = static_cast<BMFace *>(BLI_gsetIterator_getKey(&gs_iter));
BBC *bbc = &bbc_array[i];
BB_reset((BB *)bbc);
@@ -484,7 +484,7 @@ static BMVert *pbvh_bmesh_vert_create(PBVH *pbvh,
BLI_assert((pbvh->totnode == 1 || node_index) && node_index <= pbvh->totnode);
/* avoid initializing customdata because it's quite involved */
BMVert *v = BM_vert_create(pbvh->header.bm, co, NULL, BM_CREATE_SKIP_CD);
BMVert *v = BM_vert_create(pbvh->header.bm, co, nullptr, BM_CREATE_SKIP_CD);
CustomData_bmesh_set_default(&pbvh->header.bm->vdata, &v->head.data);
/* This value is logged below */
@@ -587,7 +587,7 @@ static PBVHNode *pbvh_bmesh_vert_other_node_find(PBVH *pbvh, BMVert *v)
}
BM_FACES_OF_VERT_ITER_END;
return NULL;
return nullptr;
}
static void pbvh_bmesh_vert_ownership_transfer(PBVH *pbvh, PBVHNode *new_owner, BMVert *v)
@@ -599,12 +599,12 @@ static void pbvh_bmesh_vert_ownership_transfer(PBVH *pbvh, PBVHNode *new_owner,
BLI_assert(current_owner != new_owner);
/* Remove current ownership */
BLI_gset_remove(current_owner->bm_unique_verts, v, NULL);
BLI_gset_remove(current_owner->bm_unique_verts, v, nullptr);
/* Set new ownership */
BM_ELEM_CD_SET_INT(v, pbvh->cd_vert_node_offset, new_owner - pbvh->nodes);
BLI_gset_insert(new_owner->bm_unique_verts, v);
BLI_gset_remove(new_owner->bm_other_verts, v, NULL);
BLI_gset_remove(new_owner->bm_other_verts, v, nullptr);
BLI_assert(!BLI_gset_haskey(new_owner->bm_other_verts, v));
/* mark node for update */
@@ -617,7 +617,7 @@ static void pbvh_bmesh_vert_remove(PBVH *pbvh, BMVert *v)
int f_node_index_prev = DYNTOPO_NODE_NONE;
PBVHNode *v_node = pbvh_bmesh_node_from_vert(pbvh, v);
BLI_gset_remove(v_node->bm_unique_verts, v, NULL);
BLI_gset_remove(v_node->bm_unique_verts, v, nullptr);
BM_ELEM_CD_SET_INT(v, pbvh->cd_vert_node_offset, DYNTOPO_NODE_NONE);
/* Have to check each neighboring face's node */
@@ -634,7 +634,7 @@ static void pbvh_bmesh_vert_remove(PBVH *pbvh, BMVert *v)
f_node->flag |= PBVH_UpdateDrawBuffers | PBVH_UpdateBB | PBVH_TopologyUpdated;
/* Remove current ownership */
BLI_gset_remove(f_node->bm_other_verts, v, NULL);
BLI_gset_remove(f_node->bm_other_verts, v, nullptr);
BLI_assert(!BLI_gset_haskey(f_node->bm_unique_verts, v));
BLI_assert(!BLI_gset_haskey(f_node->bm_other_verts, v));
@@ -667,13 +667,13 @@ static void pbvh_bmesh_face_remove(PBVH *pbvh, BMFace *f)
}
else {
/* Remove from other verts */
BLI_gset_remove(f_node->bm_other_verts, v, NULL);
BLI_gset_remove(f_node->bm_other_verts, v, nullptr);
}
}
} while ((l_iter = l_iter->next) != l_first);
/* Remove face from node and top level */
BLI_gset_remove(f_node->bm_faces, f, NULL);
BLI_gset_remove(f_node->bm_faces, f, nullptr);
BM_ELEM_CD_SET_INT(f, pbvh->cd_face_node_offset, DYNTOPO_NODE_NONE);
/* Log removed face */
@@ -688,14 +688,14 @@ static void pbvh_bmesh_edge_loops(BLI_Buffer *buf, BMEdge *e)
/* fast-path for most common case where an edge has 2 faces,
* no need to iterate twice.
* This assumes that the buffer can hold at least two loops (see the assert below). */
BMLoop **data = buf->data;
BMLoop **data = static_cast<BMLoop **>(buf->data);
BLI_assert(buf->alloc_count >= 2);
if (LIKELY(BM_edge_loop_pair(e, &data[0], &data[1]))) {
buf->count = 2;
}
else {
BLI_buffer_reinit(buf, BM_edge_face_count(e));
BM_iter_as_array(NULL, BM_LOOPS_OF_EDGE, e, buf->data, buf->count);
BM_iter_as_array(nullptr, BM_LOOPS_OF_EDGE, e, static_cast<void **>(buf->data), buf->count);
}
}
@@ -709,9 +709,7 @@ static void pbvh_bmesh_node_drop_orig(PBVHNode *node)
/****************************** EdgeQueue *****************************/
struct EdgeQueue;
typedef struct EdgeQueue {
struct EdgeQueue {
HeapSimple *heap;
const float *center;
float center_proj[3]; /* for when we use projected coords. */
@@ -721,22 +719,22 @@ typedef struct EdgeQueue {
float limit_len;
#endif
bool (*edge_queue_tri_in_range)(const struct EdgeQueue *q, BMFace *f);
bool (*edge_queue_tri_in_range)(const EdgeQueue *q, BMFace *f);
const float *view_normal;
#ifdef USE_EDGEQUEUE_FRONTFACE
uint use_view_normal : 1;
#endif
} EdgeQueue;
};
typedef struct {
struct EdgeQueueContext {
EdgeQueue *q;
BLI_mempool *pool;
BMesh *bm;
int cd_vert_mask_offset;
int cd_vert_node_offset;
int cd_face_node_offset;
} EdgeQueueContext;
};
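/* Illustrative note: in C++ the struct tag is already a type name, so the C
 * idiom `typedef struct { ... } EdgeQueue;` reduces to
 * `struct EdgeQueue { ... };` as above, with no change in meaning. */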
/* only tag'd edges are in the queue */
#ifdef USE_EDGEQUEUE_TAG
@@ -828,7 +826,7 @@ static void edge_queue_insert(EdgeQueueContext *eq_ctx, BMEdge *e, float priorit
(check_mask(eq_ctx, e->v1) || check_mask(eq_ctx, e->v2))) &&
!(BM_elem_flag_test_bool(e->v1, BM_ELEM_HIDDEN) ||
BM_elem_flag_test_bool(e->v2, BM_ELEM_HIDDEN))) {
BMVert **pair = BLI_mempool_alloc(eq_ctx->pool);
BMVert **pair = static_cast<BMVert **>(BLI_mempool_alloc(eq_ctx->pool));
pair[0] = e->v1;
pair[1] = e->v2;
BLI_heapsimple_insert(eq_ctx->q->heap, priority, pair);
@@ -1028,7 +1026,7 @@ static void long_edge_queue_create(EdgeQueueContext *eq_ctx,
/* Check each face */
GSET_ITER (gs_iter, node->bm_faces) {
BMFace *f = BLI_gsetIterator_getKey(&gs_iter);
BMFace *f = static_cast<BMFace *>(BLI_gsetIterator_getKey(&gs_iter));
long_edge_queue_face_add(eq_ctx, f);
}
@@ -1087,7 +1085,7 @@ static void short_edge_queue_create(EdgeQueueContext *eq_ctx,
/* Check each face */
GSET_ITER (gs_iter, node->bm_faces) {
BMFace *f = BLI_gsetIterator_getKey(&gs_iter);
BMFace *f = static_cast<BMFace *>(BLI_gsetIterator_getKey(&gs_iter));
short_edge_queue_face_add(eq_ctx, f);
}
@@ -1187,9 +1185,9 @@ static void pbvh_bmesh_split_edge(EdgeQueueContext *eq_ctx,
v_tri[0] = v_new;
v_tri[1] = v2;
/* v_tri[2] = v_opp; */ /* unchanged */
e_tri[0] = BM_edge_create(pbvh->header.bm, v_tri[0], v_tri[1], NULL, BM_CREATE_NO_DOUBLE);
e_tri[0] = BM_edge_create(pbvh->header.bm, v_tri[0], v_tri[1], nullptr, BM_CREATE_NO_DOUBLE);
e_tri[2] = e_tri[1]; /* switched */
e_tri[1] = BM_edge_create(pbvh->header.bm, v_tri[1], v_tri[2], NULL, BM_CREATE_NO_DOUBLE);
e_tri[1] = BM_edge_create(pbvh->header.bm, v_tri[1], v_tri[2], nullptr, BM_CREATE_NO_DOUBLE);
f_new = pbvh_bmesh_face_create(pbvh, ni, v_tri, e_tri, f_adj);
long_edge_queue_face_add(eq_ctx, f_new);
@@ -1222,12 +1220,12 @@ static bool pbvh_bmesh_subdivide_long_edges(EdgeQueueContext *eq_ctx,
bool any_subdivided = false;
while (!BLI_heapsimple_is_empty(eq_ctx->q->heap)) {
BMVert **pair = BLI_heapsimple_pop_min(eq_ctx->q->heap);
BMVert **pair = static_cast<BMVert **>(BLI_heapsimple_pop_min(eq_ctx->q->heap));
BMVert *v1 = pair[0], *v2 = pair[1];
BMEdge *e;
BLI_mempool_free(eq_ctx->pool, pair);
pair = NULL;
pair = nullptr;
/* Check that the edge still exists */
if (!(e = BM_edge_exists(v1, v2))) {
@@ -1318,7 +1316,7 @@ static void pbvh_bmesh_collapse_edge(PBVH *pbvh,
BMFace *existing_face;
/* Get vertices, replace use of v_del with v_conn */
// BM_iter_as_array(NULL, BM_VERTS_OF_FACE, f, (void **)v_tri, 3);
// BM_iter_as_array(nullptr, BM_VERTS_OF_FACE, f, (void **)v_tri, 3);
BMFace *f = l->f;
#if 0
BMVert *v_tri[3];
@@ -1396,15 +1394,15 @@ static void pbvh_bmesh_collapse_edge(PBVH *pbvh,
/* Check if any of the face's vertices are now unused; if so,
 * remove them from the PBVH. */
for (int j = 0; j < 3; j++) {
if ((v_tri[j] != v_del) && (v_tri[j]->e == NULL)) {
if ((v_tri[j] != v_del) && (v_tri[j]->e == nullptr)) {
pbvh_bmesh_vert_remove(pbvh, v_tri[j]);
BM_log_vert_removed(pbvh->bm_log, v_tri[j], eq_ctx->cd_vert_mask_offset);
if (v_tri[j] == v_conn) {
v_conn = NULL;
v_conn = nullptr;
}
BLI_ghash_insert(deleted_verts, v_tri[j], NULL);
BLI_ghash_insert(deleted_verts, v_tri[j], nullptr);
BM_vert_kill(pbvh->header.bm, v_tri[j]);
}
}
@@ -1412,7 +1410,7 @@ static void pbvh_bmesh_collapse_edge(PBVH *pbvh,
/* Move v_conn to the midpoint of v_conn and v_del (if v_conn still exists, it
* may have been deleted above) */
if (v_conn != NULL) {
if (v_conn != nullptr) {
BM_log_vert_before_modified(pbvh->bm_log, v_conn, eq_ctx->cd_vert_mask_offset);
mid_v3_v3v3(v_conn->co, v_conn->co, v_del->co);
add_v3_v3(v_conn->no, v_del->no);
@@ -1430,7 +1428,7 @@ static void pbvh_bmesh_collapse_edge(PBVH *pbvh,
/* Delete v_del */
BLI_assert(!BM_vert_face_check(v_del));
BM_log_vert_removed(pbvh->bm_log, v_del, eq_ctx->cd_vert_mask_offset);
/* v_conn == NULL is OK */
/* v_conn == nullptr is OK */
BLI_ghash_insert(deleted_verts, v_del, v_conn);
BM_vert_kill(pbvh->header.bm, v_del);
}
@@ -1441,14 +1439,14 @@ static bool pbvh_bmesh_collapse_short_edges(EdgeQueueContext *eq_ctx,
{
const float min_len_squared = pbvh->bm_min_edge_len * pbvh->bm_min_edge_len;
bool any_collapsed = false;
/* deleted verts point to vertices they were merged into, or NULL when removed. */
/* deleted verts point to vertices they were merged into, or nullptr when removed. */
GHash *deleted_verts = BLI_ghash_ptr_new("deleted_verts");
while (!BLI_heapsimple_is_empty(eq_ctx->q->heap)) {
BMVert **pair = BLI_heapsimple_pop_min(eq_ctx->q->heap);
BMVert **pair = static_cast<BMVert **>(BLI_heapsimple_pop_min(eq_ctx->q->heap));
BMVert *v1 = pair[0], *v2 = pair[1];
BLI_mempool_free(eq_ctx->pool, pair);
pair = NULL;
pair = nullptr;
/* Check the verts still exist */
if (!(v1 = bm_vert_hash_lookup_chain(deleted_verts, v1)) ||
@@ -1483,17 +1481,17 @@ static bool pbvh_bmesh_collapse_short_edges(EdgeQueueContext *eq_ctx,
pbvh_bmesh_collapse_edge(pbvh, e, v1, v2, deleted_verts, deleted_faces, eq_ctx);
}
BLI_ghash_free(deleted_verts, NULL, NULL);
BLI_ghash_free(deleted_verts, nullptr, nullptr);
return any_collapsed;
}
/************************* Called from pbvh.c *************************/
/************************* Called from pbvh.cc *************************/
bool pbvh_bmesh_node_raycast(PBVHNode *node,
const float ray_start[3],
const float ray_normal[3],
struct IsectRayPrecalc *isect_precalc,
IsectRayPrecalc *isect_precalc,
float *depth,
bool use_original,
PBVHVertRef *r_active_vertex,
@@ -1530,7 +1528,7 @@ bool pbvh_bmesh_node_raycast(PBVHNode *node,
if (j == 0 || len_squared_v3v3(location, cos[j]) <
len_squared_v3v3(location, nearest_vertex_co)) {
copy_v3_v3(nearest_vertex_co, cos[j]);
r_active_vertex->i = (intptr_t)node->bm_orvert[node->bm_ortri[i][j]];
r_active_vertex->i = intptr_t(node->bm_orvert[node->bm_ortri[i][j]]);
}
}
}
@@ -1539,7 +1537,7 @@ bool pbvh_bmesh_node_raycast(PBVHNode *node,
}
else {
GSET_ITER (gs_iter, node->bm_faces) {
BMFace *f = BLI_gsetIterator_getKey(&gs_iter);
BMFace *f = static_cast<BMFace *>(BLI_gsetIterator_getKey(&gs_iter));
BLI_assert(f->len == 3);
@@ -1562,7 +1560,7 @@ bool pbvh_bmesh_node_raycast(PBVHNode *node,
if (j == 0 || len_squared_v3v3(location, v_tri[j]->co) <
len_squared_v3v3(location, nearest_vertex_co)) {
copy_v3_v3(nearest_vertex_co, v_tri[j]->co);
r_active_vertex->i = (intptr_t)v_tri[j];
r_active_vertex->i = intptr_t(v_tri[j]);
}
}
}
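/* Illustrative note: `intptr_t(v_tri[j])` is the functional-cast spelling
 * preferred by Blender's C++ style over the C cast `(intptr_t)v_tri[j]`;
 * both perform the same conversion here. */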
@@ -1576,7 +1574,7 @@ bool pbvh_bmesh_node_raycast(PBVHNode *node,
bool BKE_pbvh_bmesh_node_raycast_detail(PBVHNode *node,
const float ray_start[3],
struct IsectRayPrecalc *isect_precalc,
IsectRayPrecalc *isect_precalc,
float *depth,
float *r_edge_length)
{
@@ -1586,10 +1584,10 @@ bool BKE_pbvh_bmesh_node_raycast_detail(PBVHNode *node,
GSetIterator gs_iter;
bool hit = false;
BMFace *f_hit = NULL;
BMFace *f_hit = nullptr;
GSET_ITER (gs_iter, node->bm_faces) {
BMFace *f = BLI_gsetIterator_getKey(&gs_iter);
BMFace *f = static_cast<BMFace *>(BLI_gsetIterator_getKey(&gs_iter));
BLI_assert(f->len == 3);
if (!BM_elem_flag_test(f, BM_ELEM_HIDDEN)) {
@@ -1645,7 +1643,7 @@ bool pbvh_bmesh_node_nearest_to_ray(PBVHNode *node,
GSetIterator gs_iter;
GSET_ITER (gs_iter, node->bm_faces) {
BMFace *f = BLI_gsetIterator_getKey(&gs_iter);
BMFace *f = static_cast<BMFace *>(BLI_gsetIterator_getKey(&gs_iter));
BLI_assert(f->len == 3);
if (!BM_elem_flag_test(f, BM_ELEM_HIDDEN)) {
@@ -1670,14 +1668,14 @@ void pbvh_bmesh_normals_update(PBVHNode **nodes, int totnode)
GSetIterator gs_iter;
GSET_ITER (gs_iter, node->bm_faces) {
BM_face_normal_update(BLI_gsetIterator_getKey(&gs_iter));
BM_face_normal_update(static_cast<BMFace *>(BLI_gsetIterator_getKey(&gs_iter)));
}
GSET_ITER (gs_iter, node->bm_unique_verts) {
BM_vert_normal_update(BLI_gsetIterator_getKey(&gs_iter));
BM_vert_normal_update(static_cast<BMVert *>(BLI_gsetIterator_getKey(&gs_iter)));
}
/* This should be unneeded normally */
GSET_ITER (gs_iter, node->bm_other_verts) {
BM_vert_normal_update(BLI_gsetIterator_getKey(&gs_iter));
BM_vert_normal_update(static_cast<BMVert *>(BLI_gsetIterator_getKey(&gs_iter)));
}
node->flag &= ~PBVH_UpdateNormals;
}
@@ -1687,8 +1685,8 @@ void pbvh_bmesh_normals_update(PBVHNode **nodes, int totnode)
struct FastNodeBuildInfo {
int totface; /* number of faces */
int start; /* start of faces in array */
struct FastNodeBuildInfo *child1;
struct FastNodeBuildInfo *child2;
FastNodeBuildInfo *child1;
FastNodeBuildInfo *child2;
};
/**
@@ -1697,9 +1695,9 @@ struct FastNodeBuildInfo {
* to a sub part of the arrays.
*/
static void pbvh_bmesh_node_limit_ensure_fast(
PBVH *pbvh, BMFace **nodeinfo, BBC *bbc_array, struct FastNodeBuildInfo *node, MemArena *arena)
PBVH *pbvh, BMFace **nodeinfo, BBC *bbc_array, FastNodeBuildInfo *node, MemArena *arena)
{
struct FastNodeBuildInfo *child1, *child2;
FastNodeBuildInfo *child1, *child2;
if (node->totface <= pbvh->leaf_limit) {
return;
@@ -1778,21 +1776,23 @@ static void pbvh_bmesh_node_limit_ensure_fast(
* each sequential part belonging to one node only */
BLI_assert((num_child1 + num_child2) == node->totface);
node->child1 = child1 = BLI_memarena_alloc(arena, sizeof(struct FastNodeBuildInfo));
node->child2 = child2 = BLI_memarena_alloc(arena, sizeof(struct FastNodeBuildInfo));
node->child1 = child1 = static_cast<FastNodeBuildInfo *>(
BLI_memarena_alloc(arena, sizeof(FastNodeBuildInfo)));
node->child2 = child2 = static_cast<FastNodeBuildInfo *>(
BLI_memarena_alloc(arena, sizeof(FastNodeBuildInfo)));
child1->totface = num_child1;
child1->start = node->start;
child2->totface = num_child2;
child2->start = node->start + num_child1;
child1->child1 = child1->child2 = child2->child1 = child2->child2 = NULL;
child1->child1 = child1->child2 = child2->child1 = child2->child2 = nullptr;
pbvh_bmesh_node_limit_ensure_fast(pbvh, nodeinfo, bbc_array, child1, arena);
pbvh_bmesh_node_limit_ensure_fast(pbvh, nodeinfo, bbc_array, child2, arena);
}
static void pbvh_bmesh_create_nodes_fast_recursive(
PBVH *pbvh, BMFace **nodeinfo, BBC *bbc_array, struct FastNodeBuildInfo *node, int node_index)
PBVH *pbvh, BMFace **nodeinfo, BBC *bbc_array, FastNodeBuildInfo *node, int node_index)
{
PBVHNode *n = pbvh->nodes + node_index;
/* two cases, node does not have children or does have children */
@@ -1910,8 +1910,9 @@ void BKE_pbvh_build_bmesh(PBVH *pbvh,
}
/* bounding box array of all faces, no need to recalculate every time */
BBC *bbc_array = MEM_mallocN(sizeof(BBC) * bm->totface, "BBC");
BMFace **nodeinfo = MEM_mallocN(sizeof(*nodeinfo) * bm->totface, "nodeinfo");
BBC *bbc_array = static_cast<BBC *>(MEM_mallocN(sizeof(BBC) * bm->totface, "BBC"));
BMFace **nodeinfo = static_cast<BMFace **>(
MEM_mallocN(sizeof(*nodeinfo) * bm->totface, "nodeinfo"));
MemArena *arena = BLI_memarena_new(BLI_MEMARENA_STD_BUFSIZE, "fast PBVH node storage");
BMIter iter;
@@ -1942,7 +1943,7 @@ void BKE_pbvh_build_bmesh(PBVH *pbvh,
}
/* setup root node */
struct FastNodeBuildInfo rootnode = {0};
FastNodeBuildInfo rootnode = {0};
rootnode.totface = bm->totface;
/* start recursion, assign faces to nodes accordingly */
@@ -1952,7 +1953,7 @@ void BKE_pbvh_build_bmesh(PBVH *pbvh,
* next we need to assign those to the gsets of the nodes. */
/* Start with all faces in the root node */
pbvh->nodes = MEM_callocN(sizeof(PBVHNode), "PBVHNode");
pbvh->nodes = MEM_cnew<PBVHNode>(__func__);
pbvh->totnode = 1;
/* take root node and visit and populate children recursively */
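/* Illustrative note (assumption about intent): MEM_cnew<PBVHNode>(__func__)
 * is the typed, zero-initializing counterpart of
 * MEM_callocN(sizeof(PBVHNode), __func__); it returns PBVHNode * directly,
 * avoiding the cast that C++ would otherwise require. */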
@@ -1999,7 +2000,7 @@ bool BKE_pbvh_bmesh_update_topology(PBVH *pbvh,
short_edge_queue_create(
&eq_ctx, pbvh, center, view_normal, radius, use_frontface, use_projected);
modified |= pbvh_bmesh_collapse_short_edges(&eq_ctx, pbvh, &deleted_faces);
BLI_heapsimple_free(q.heap, NULL);
BLI_heapsimple_free(q.heap, nullptr);
BLI_mempool_destroy(queue_pool);
}
@@ -2018,7 +2019,7 @@ bool BKE_pbvh_bmesh_update_topology(PBVH *pbvh,
long_edge_queue_create(
&eq_ctx, pbvh, center, view_normal, radius, use_frontface, use_projected);
modified |= pbvh_bmesh_subdivide_long_edges(&eq_ctx, pbvh, &edge_loops);
BLI_heapsimple_free(q.heap, NULL);
BLI_heapsimple_free(q.heap, nullptr);
BLI_mempool_destroy(queue_pool);
}
@@ -2066,15 +2067,17 @@ void BKE_pbvh_bmesh_node_save_orig(BMesh *bm, BMLog *log, PBVHNode *node, bool u
const int tottri = BLI_gset_len(node->bm_faces);
node->bm_orco = MEM_mallocN(sizeof(*node->bm_orco) * totvert, __func__);
node->bm_ortri = MEM_mallocN(sizeof(*node->bm_ortri) * tottri, __func__);
node->bm_orvert = MEM_mallocN(sizeof(*node->bm_orvert) * totvert, __func__);
node->bm_orco = static_cast<float(*)[3]>(
MEM_mallocN(sizeof(*node->bm_orco) * totvert, __func__));
node->bm_ortri = static_cast<int(*)[3]>(MEM_mallocN(sizeof(*node->bm_ortri) * tottri, __func__));
node->bm_orvert = static_cast<BMVert **>(
MEM_mallocN(sizeof(*node->bm_orvert) * totvert, __func__));
/* Copy out the vertices and assign a temporary index */
int i = 0;
GSetIterator gs_iter;
GSET_ITER (gs_iter, node->bm_unique_verts) {
BMVert *v = BLI_gsetIterator_getKey(&gs_iter);
BMVert *v = static_cast<BMVert *>(BLI_gsetIterator_getKey(&gs_iter));
const float *origco = BM_log_original_vert_co(log, v);
if (use_original && origco) {
@@ -2089,7 +2092,7 @@ void BKE_pbvh_bmesh_node_save_orig(BMesh *bm, BMLog *log, PBVHNode *node, bool u
i++;
}
GSET_ITER (gs_iter, node->bm_other_verts) {
BMVert *v = BLI_gsetIterator_getKey(&gs_iter);
BMVert *v = static_cast<BMVert *>(BLI_gsetIterator_getKey(&gs_iter));
const float *origco = BM_log_original_vert_co(log, v);
if (use_original && origco) {
@@ -2109,7 +2112,7 @@ void BKE_pbvh_bmesh_node_save_orig(BMesh *bm, BMLog *log, PBVHNode *node, bool u
/* Copy the triangles */
i = 0;
GSET_ITER (gs_iter, node->bm_faces) {
BMFace *f = BLI_gsetIterator_getKey(&gs_iter);
BMFace *f = static_cast<BMFace *>(BLI_gsetIterator_getKey(&gs_iter));
if (BM_elem_flag_test(f, BM_ELEM_HIDDEN)) {
continue;
@@ -2167,7 +2170,7 @@ GSet *BKE_pbvh_bmesh_node_other_verts(PBVHNode *node)
return node->bm_other_verts;
}
struct GSet *BKE_pbvh_bmesh_node_faces(PBVHNode *node)
GSet *BKE_pbvh_bmesh_node_faces(PBVHNode *node)
{
return node->bm_faces;
}
@@ -2318,14 +2321,14 @@ static void pbvh_bmesh_verify(PBVH *pbvh)
* adjacent faces */
bool found = false;
BMIter bm_iter;
BMFace *f = NULL;
BMFace *f = nullptr;
BM_ITER_ELEM (f, &bm_iter, v, BM_FACES_OF_VERT) {
if (pbvh_bmesh_node_lookup(pbvh, f) == n) {
found = true;
break;
}
}
BLI_assert(found || f == NULL);
BLI_assert(found || f == nullptr);
# if 1
/* total freak stuff, check if node exists somewhere else */
@@ -2347,7 +2350,7 @@ static void pbvh_bmesh_verify(PBVH *pbvh)
bool has_unique = false;
for (int i = 0; i < pbvh->totnode; i++) {
PBVHNode *n = &pbvh->nodes[i];
if ((n->bm_unique_verts != NULL) && BLI_gset_haskey(n->bm_unique_verts, vi)) {
if ((n->bm_unique_verts != nullptr) && BLI_gset_haskey(n->bm_unique_verts, vi)) {
has_unique = true;
}
}
@@ -2389,8 +2392,8 @@ static void pbvh_bmesh_verify(PBVH *pbvh)
}
}
BLI_gset_free(faces_all, NULL);
BLI_gset_free(verts_all, NULL);
BLI_gset_free(faces_all, nullptr);
BLI_gset_free(verts_all, nullptr);
}
#endif

View File

@@ -33,7 +33,7 @@
#include "atomic_ops.h"
#include "pbvh_intern.h"
#include "pbvh_intern.hh"
#include <climits>

View File

@@ -2,37 +2,31 @@
#pragma once
struct PBVHGPUFormat;
/** \file
* \ingroup bke
*/
#ifdef __cplusplus
extern "C" {
#endif
struct PBVHGPUFormat;
struct MLoop;
struct MLoopTri;
struct MPoly;
struct MeshElemMap;
/* Axis-aligned bounding box */
typedef struct {
struct BB {
float bmin[3], bmax[3];
} BB;
};
/* Axis-aligned bounding box with centroid */
typedef struct {
struct BBC {
float bmin[3], bmax[3], bcentroid[3];
} BBC;
struct MeshElemMap;
};
/* NOTE: this structure is getting large, might want to split it into
* union'd structs */
struct PBVHNode {
/* Opaque handle for drawing code */
struct PBVHBatches *draw_batches;
PBVHBatches *draw_batches;
/* Voxel bounds */
BB vb;
@@ -95,7 +89,7 @@ struct PBVHNode {
/* Indicates whether this node is a leaf or not; also used for
* marking various updates that need to be applied. */
PBVHNodeFlags flag : 32;
PBVHNodeFlags flag;
/* Used for ray-casting: how close the bounding-box is to the ray point. */
float tmin;
@@ -132,12 +126,15 @@ struct PBVHNode {
int debug_draw_gen;
};
typedef enum { PBVH_DYNTOPO_SMOOTH_SHADING = 1 } PBVHFlags;
enum PBVHFlags {
PBVH_DYNTOPO_SMOOTH_SHADING = 1,
};
ENUM_OPERATORS(PBVHFlags, PBVH_DYNTOPO_SMOOTH_SHADING);
typedef struct PBVHBMeshLog PBVHBMeshLog;
struct PBVH {
struct PBVHPublic header;
PBVHPublic header;
PBVHFlags flags;
PBVHNode *nodes;
@@ -154,18 +151,18 @@ struct PBVH {
int depth_limit;
/* Mesh data */
struct Mesh *mesh;
Mesh *mesh;
/* NOTE: Normals are not `const` because they can be updated for drawing by sculpt code. */
float (*vert_normals)[3];
bool *hide_vert;
float (*vert_positions)[3];
const struct MPoly *mpoly;
const MPoly *mpoly;
bool *hide_poly;
/** Material indices. Only valid for polygon meshes. */
const int *material_indices;
const struct MLoop *mloop;
const struct MLoopTri *looptri;
const MLoop *mloop;
const MLoopTri *looptri;
CustomData *vdata;
CustomData *ldata;
CustomData *pdata;
@@ -203,10 +200,10 @@ struct PBVH {
float planes[6][4];
int num_planes;
struct BMLog *bm_log;
struct SubdivCCG *subdiv_ccg;
BMLog *bm_log;
SubdivCCG *subdiv_ccg;
const struct MeshElemMap *pmap;
const MeshElemMap *pmap;
CustomDataLayer *color_layer;
eAttrDomain color_domain;
@@ -216,12 +213,12 @@ struct PBVH {
/* Used by DynTopo to invalidate the draw cache. */
bool draw_cache_invalid;
struct PBVHGPUFormat *vbo_id;
PBVHGPUFormat *vbo_id;
PBVHPixels pixels;
};
/* pbvh.c */
/* pbvh.cc */
void BB_reset(BB *bb);
/**
@@ -239,14 +236,14 @@ void BBC_update_centroid(BBC *bbc);
int BB_widest_axis(const BB *bb);
void pbvh_grow_nodes(PBVH *bvh, int totnode);
bool ray_face_intersection_quad(const float ray_start[3],
struct IsectRayPrecalc *isect_precalc,
IsectRayPrecalc *isect_precalc,
const float t0[3],
const float t1[3],
const float t2[3],
const float t3[3],
float *depth);
bool ray_face_intersection_tri(const float ray_start[3],
struct IsectRayPrecalc *isect_precalc,
IsectRayPrecalc *isect_precalc,
const float t0[3],
const float t1[3],
const float t2[3],
@@ -270,12 +267,12 @@ bool ray_face_nearest_tri(const float ray_start[3],
void pbvh_update_BB_redraw(PBVH *bvh, PBVHNode **nodes, int totnode, int flag);
/* pbvh_bmesh.c */
/* pbvh_bmesh.cc */
bool pbvh_bmesh_node_raycast(PBVHNode *node,
const float ray_start[3],
const float ray_normal[3],
struct IsectRayPrecalc *isect_precalc,
IsectRayPrecalc *isect_precalc,
float *dist,
bool use_original,
PBVHVertRef *r_active_vertex,
@@ -295,7 +292,3 @@ void pbvh_node_pixels_free(PBVHNode *node);
void pbvh_pixels_free(PBVH *pbvh);
void pbvh_pixels_free_brush_test(PBVHNode *node);
void pbvh_free_draw_buffers(PBVH *pbvh, PBVHNode *node);
#ifdef __cplusplus
}
#endif

View File

@@ -22,7 +22,7 @@
#include "bmesh.h"
#include "pbvh_intern.h"
#include "pbvh_intern.hh"
#include "pbvh_uv_islands.hh"
namespace blender::bke::pbvh::pixels {
@@ -116,7 +116,7 @@ static void split_pixel_node(
const int axis = BB_widest_axis(&cb);
const float mid = (cb.bmax[axis] + cb.bmin[axis]) * 0.5f;
node->flag = (PBVHNodeFlags)((int)node->flag & (int)~PBVH_TexLeaf);
node->flag = (PBVHNodeFlags)(int(node->flag) & int(~PBVH_TexLeaf));
SplitNodePair *split1 = MEM_new<SplitNodePair>("split_pixel_node split1", split);
SplitNodePair *split2 = MEM_new<SplitNodePair>("split_pixel_node split1", split);
@@ -188,7 +188,7 @@ static void split_pixel_node(
float2 delta = uv_prim.delta_barycentric_coord_u;
float2 uv1 = row.start_barycentric_coord;
float2 uv2 = row.start_barycentric_coord + delta * (float)row.num_pixels;
float2 uv2 = row.start_barycentric_coord + delta * float(row.num_pixels);
float co1[3];
float co2[3];
@@ -210,7 +210,7 @@ static void split_pixel_node(
t = (mid - co1[axis]) / (co2[axis] - co1[axis]);
}
int num_pixels = (int)floorf((float)row.num_pixels * t);
int num_pixels = int(floorf(float(row.num_pixels) * t));
if (num_pixels) {
row1.num_pixels = num_pixels;
@@ -223,7 +223,7 @@ static void split_pixel_node(
row2.num_pixels = row.num_pixels - num_pixels;
row2.start_barycentric_coord = row.start_barycentric_coord +
uv_prim.delta_barycentric_coord_u * (float)num_pixels;
uv_prim.delta_barycentric_coord_u * float(num_pixels);
row2.start_image_coordinate = row.start_image_coordinate;
row2.start_image_coordinate[0] += num_pixels;
@@ -731,7 +731,7 @@ static bool update_pixels(PBVH *pbvh, Mesh *mesh, Image *image, ImageUser *image
PBVHNode &node = pbvh->nodes[i];
if (node.flag & PBVH_Leaf) {
node.flag = (PBVHNodeFlags)((int)node.flag | (int)PBVH_TexLeaf);
node.flag = (PBVHNodeFlags)(int(node.flag) | int(PBVH_TexLeaf));
}
}
@@ -800,7 +800,6 @@ void BKE_pbvh_pixels_mark_image_dirty(PBVHNode &node, Image &image, ImageUser &i
}
} // namespace blender::bke::pbvh::pixels
extern "C" {
using namespace blender::bke::pbvh::pixels;
void BKE_pbvh_build_pixels(PBVH *pbvh, Mesh *mesh, Image *image, ImageUser *image_user)
@@ -828,4 +827,3 @@ void pbvh_pixels_free(PBVH *pbvh)
MEM_delete(pbvh_data);
pbvh->pixels.data = nullptr;
}
}

View File

@@ -161,7 +161,7 @@ static float get_edge_sharpness(const OpenSubdiv_Converter *converter, int manif
return 10.0f;
}
#endif
if (!storage->settings.use_creases || storage->cd_edge_crease == NULL) {
if (!storage->settings.use_creases || storage->cd_edge_crease == nullptr) {
return 0.0f;
}
const int edge_index = storage->manifold_edge_index_reverse[manifold_edge_index];
@@ -184,7 +184,7 @@ static bool is_infinite_sharp_vertex(const OpenSubdiv_Converter *converter,
static float get_vertex_sharpness(const OpenSubdiv_Converter *converter, int manifold_vertex_index)
{
ConverterStorage *storage = static_cast<ConverterStorage *>(converter->user_data);
if (!storage->settings.use_creases || storage->cd_vertex_crease == NULL) {
if (!storage->settings.use_creases || storage->cd_vertex_crease == nullptr) {
return 0.0f;
}
const int vertex_index = storage->manifold_vertex_index_reverse[manifold_vertex_index];
@@ -208,7 +208,7 @@ static void precalc_uv_layer(const OpenSubdiv_Converter *converter, const int la
const int num_vert = mesh->totvert;
const float limit[2] = {STD_UV_CONNECT_LIMIT, STD_UV_CONNECT_LIMIT};
/* Initialize memory required for the operations. */
if (storage->loop_uv_indices == NULL) {
if (storage->loop_uv_indices == nullptr) {
storage->loop_uv_indices = static_cast<int *>(
MEM_malloc_arrayN(mesh->totloop, sizeof(int), "loop uv vertex index"));
}
@@ -227,7 +227,7 @@ static void precalc_uv_layer(const OpenSubdiv_Converter *converter, const int la
storage->num_uv_coordinates = -1;
for (int vertex_index = 0; vertex_index < num_vert; vertex_index++) {
const UvMapVert *uv_vert = BKE_mesh_uv_vert_map_get_vert(uv_vert_map, vertex_index);
while (uv_vert != NULL) {
while (uv_vert != nullptr) {
if (uv_vert->separate) {
storage->num_uv_coordinates++;
}
@@ -287,17 +287,17 @@ static void init_functions(OpenSubdiv_Converter *converter)
converter->getNumFaceVertices = get_num_face_vertices;
converter->getFaceVertices = get_face_vertices;
converter->getFaceEdges = NULL;
converter->getFaceEdges = nullptr;
converter->getEdgeVertices = get_edge_vertices;
converter->getNumEdgeFaces = NULL;
converter->getEdgeFaces = NULL;
converter->getNumEdgeFaces = nullptr;
converter->getEdgeFaces = nullptr;
converter->getEdgeSharpness = get_edge_sharpness;
converter->getNumVertexEdges = NULL;
converter->getVertexEdges = NULL;
converter->getNumVertexFaces = NULL;
converter->getVertexFaces = NULL;
converter->getNumVertexEdges = nullptr;
converter->getVertexEdges = nullptr;
converter->getNumVertexFaces = nullptr;
converter->getVertexFaces = nullptr;
converter->isInfiniteSharpVertex = is_infinite_sharp_vertex;
converter->getVertexSharpness = get_vertex_sharpness;
@@ -316,36 +316,36 @@ static void initialize_manifold_index_array(const BLI_bitmap *used_map,
int **r_indices_reverse,
int *r_num_manifold_elements)
{
int *indices = NULL;
if (r_indices != NULL) {
int *indices = nullptr;
if (r_indices != nullptr) {
indices = static_cast<int *>(MEM_malloc_arrayN(num_elements, sizeof(int), "manifold indices"));
}
int *indices_reverse = NULL;
if (r_indices_reverse != NULL) {
int *indices_reverse = nullptr;
if (r_indices_reverse != nullptr) {
indices_reverse = static_cast<int *>(
MEM_malloc_arrayN(num_elements, sizeof(int), "manifold indices reverse"));
}
int offset = 0;
for (int i = 0; i < num_elements; i++) {
if (BLI_BITMAP_TEST_BOOL(used_map, i)) {
if (indices != NULL) {
if (indices != nullptr) {
indices[i] = i - offset;
}
if (indices_reverse != NULL) {
if (indices_reverse != nullptr) {
indices_reverse[i - offset] = i;
}
}
else {
if (indices != NULL) {
if (indices != nullptr) {
indices[i] = -1;
}
offset++;
}
}
if (r_indices != NULL) {
if (r_indices != nullptr) {
*r_indices = indices;
}
if (r_indices_reverse != NULL) {
if (r_indices_reverse != nullptr) {
*r_indices_reverse = indices_reverse;
}
*r_num_manifold_elements = num_elements - offset;
@@ -375,7 +375,7 @@ static void initialize_manifold_indices(ConverterStorage *storage)
&storage->num_manifold_vertices);
initialize_manifold_index_array(edge_used_map,
mesh->totedge,
NULL,
nullptr,
&storage->manifold_edge_index_reverse,
&storage->num_manifold_edges);
/* Initialize infinite sharp mapping. */
@@ -408,7 +408,7 @@ static void init_user_data(OpenSubdiv_Converter *converter,
CustomData_get_layer(&mesh->vdata, CD_CREASE));
user_data->cd_edge_crease = static_cast<const float *>(
CustomData_get_layer(&mesh->edata, CD_CREASE));
user_data->loop_uv_indices = NULL;
user_data->loop_uv_indices = nullptr;
initialize_manifold_indices(user_data);
converter->user_data = user_data;
}

View File

@@ -190,11 +190,6 @@ struct float4x4 {
values[2][2] *= scale;
}
void apply_translation(const float3 &translation)
{
*reinterpret_cast<float3 *>(&values[3][0]) += translation;
}
float4x4 inverted() const
{
float4x4 result;

View File

@@ -252,16 +252,6 @@ template<typename T, int Size>
return result;
}
template<typename T, int Size>
[[nodiscard]] inline VecBase<T, Size> round(const VecBase<T, Size> &a)
{
VecBase<T, Size> result;
for (int i = 0; i < Size; i++) {
result[i] = std::round(a[i]);
}
return result;
}
template<typename T, int Size>
[[nodiscard]] inline VecBase<T, Size> ceil(const VecBase<T, Size> &a)
{

View File

@@ -3911,14 +3911,6 @@ void blo_do_versions_300(FileData *fd, Library * /*lib*/, Main *bmain)
* \note Keep this message at the bottom of the function.
*/
{
if (!DNA_struct_elem_find(fd->filesdna, "SceneEEVEE", "int", "shadow_pool_size")) {
LISTBASE_FOREACH (Scene *, scene, &bmain->scenes) {
scene->eevee.shadow_pool_size = 512;
scene->r.simplify_shadows = 1.0f;
scene->r.simplify_shadows_render = 1.0f;
}
}
/* Keep this block, even when empty. */
}
}

View File

@@ -170,7 +170,6 @@ void BM_mesh_copy_init_customdata_all_layers(BMesh *bm_dst,
const struct BMAllocTemplate *allocsize);
BMesh *BM_mesh_copy(BMesh *bm_old);
#ifdef __cplusplus
}
#endif

View File

@@ -151,7 +151,6 @@ set(SRC
engines/eevee_next/eevee_renderbuffers.cc
engines/eevee_next/eevee_sampling.cc
engines/eevee_next/eevee_shader.cc
engines/eevee_next/eevee_shadow.cc
engines/eevee_next/eevee_sync.cc
engines/eevee_next/eevee_velocity.cc
engines/eevee_next/eevee_view.cc
@@ -282,7 +281,6 @@ set(SRC
engines/eevee_next/eevee_renderbuffers.hh
engines/eevee_next/eevee_sampling.hh
engines/eevee_next/eevee_shader.hh
engines/eevee_next/eevee_shadow.hh
engines/eevee_next/eevee_sync.hh
engines/eevee_next/eevee_velocity.hh
engines/eevee_next/eevee_view.hh
@@ -424,7 +422,6 @@ set(GLSL_SRC
engines/eevee_next/shaders/eevee_camera_lib.glsl
engines/eevee_next/shaders/eevee_colorspace_lib.glsl
engines/eevee_next/shaders/eevee_cryptomatte_lib.glsl
engines/eevee_next/shaders/eevee_transparency_lib.glsl
engines/eevee_next/shaders/eevee_depth_of_field_accumulator_lib.glsl
engines/eevee_next/shaders/eevee_depth_of_field_bokeh_lut_comp.glsl
engines/eevee_next/shaders/eevee_depth_of_field_downsample_comp.glsl
@@ -465,29 +462,10 @@ set(GLSL_SRC
engines/eevee_next/shaders/eevee_motion_blur_lib.glsl
engines/eevee_next/shaders/eevee_nodetree_lib.glsl
engines/eevee_next/shaders/eevee_sampling_lib.glsl
engines/eevee_next/shaders/eevee_shadow_debug_frag.glsl
engines/eevee_next/shaders/eevee_shadow_lib.glsl
engines/eevee_next/shaders/eevee_shadow_page_allocate_comp.glsl
engines/eevee_next/shaders/eevee_shadow_page_clear_comp.glsl
engines/eevee_next/shaders/eevee_shadow_page_defrag_comp.glsl
engines/eevee_next/shaders/eevee_shadow_page_free_comp.glsl
engines/eevee_next/shaders/eevee_shadow_page_mask_comp.glsl
engines/eevee_next/shaders/eevee_shadow_page_ops_lib.glsl
engines/eevee_next/shaders/eevee_shadow_tag_update_comp.glsl
engines/eevee_next/shaders/eevee_shadow_tag_usage_comp.glsl
engines/eevee_next/shaders/eevee_shadow_tag_usage_frag.glsl
engines/eevee_next/shaders/eevee_shadow_tag_usage_lib.glsl
engines/eevee_next/shaders/eevee_shadow_tag_usage_vert.glsl
engines/eevee_next/shaders/eevee_shadow_test.glsl
engines/eevee_next/shaders/eevee_shadow_tilemap_bounds_comp.glsl
engines/eevee_next/shaders/eevee_shadow_tilemap_finalize_comp.glsl
engines/eevee_next/shaders/eevee_shadow_tilemap_init_comp.glsl
engines/eevee_next/shaders/eevee_shadow_tilemap_lib.glsl
engines/eevee_next/shaders/eevee_surf_deferred_frag.glsl
engines/eevee_next/shaders/eevee_surf_depth_frag.glsl
engines/eevee_next/shaders/eevee_surf_forward_frag.glsl
engines/eevee_next/shaders/eevee_surf_lib.glsl
engines/eevee_next/shaders/eevee_surf_shadow_frag.glsl
engines/eevee_next/shaders/eevee_surf_world_frag.glsl
engines/eevee_next/shaders/eevee_velocity_lib.glsl
@@ -819,7 +797,6 @@ if(WITH_GTESTS)
set(TEST_SRC
tests/draw_pass_test.cc
tests/draw_testing.cc
tests/eevee_test.cc
tests/shaders_test.cc
tests/draw_testing.hh

View File

@@ -135,9 +135,8 @@ void Camera::sync()
#endif
}
else if (inst_.drw_view) {
/* \note: Follow camera parameters where distances are positive in front of the camera. */
data.clip_near = -DRW_view_near_distance_get(inst_.drw_view);
data.clip_far = -DRW_view_far_distance_get(inst_.drw_view);
data.clip_near = DRW_view_near_distance_get(inst_.drw_view);
data.clip_far = DRW_view_far_distance_get(inst_.drw_view);
data.fisheye_fov = data.fisheye_lens = -1.0f;
data.equirect_bias = float2(0.0f);
data.equirect_scale = float2(0.0f);
@@ -145,57 +144,6 @@ void Camera::sync()
data_.initialized = true;
data_.push_update();
update_bounds();
}
void Camera::update_bounds()
{
float left, right, bottom, top, near, far;
projmat_dimensions(data_.winmat.ptr(), &left, &right, &bottom, &top, &near, &far);
BoundBox bbox;
bbox.vec[0][2] = bbox.vec[3][2] = bbox.vec[7][2] = bbox.vec[4][2] = -near;
bbox.vec[0][0] = bbox.vec[3][0] = left;
bbox.vec[4][0] = bbox.vec[7][0] = right;
bbox.vec[0][1] = bbox.vec[4][1] = bottom;
bbox.vec[7][1] = bbox.vec[3][1] = top;
/* Get the coordinates of the far plane. */
if (!this->is_orthographic()) {
float sca_far = far / near;
left *= sca_far;
right *= sca_far;
bottom *= sca_far;
top *= sca_far;
}
bbox.vec[1][2] = bbox.vec[2][2] = bbox.vec[6][2] = bbox.vec[5][2] = -far;
bbox.vec[1][0] = bbox.vec[2][0] = left;
bbox.vec[6][0] = bbox.vec[5][0] = right;
bbox.vec[1][1] = bbox.vec[5][1] = bottom;
bbox.vec[2][1] = bbox.vec[6][1] = top;
bound_sphere.center = {0.0f, 0.0f, 0.0f};
bound_sphere.radius = 0.0f;
for (auto i : IndexRange(8)) {
bound_sphere.center += float3(bbox.vec[i]);
}
bound_sphere.center /= 8.0f;
for (auto i : IndexRange(8)) {
float dist_sqr = math::distance_squared(bound_sphere.center, float3(bbox.vec[i]));
bound_sphere.radius = max_ff(bound_sphere.radius, dist_sqr);
}
bound_sphere.radius = sqrtf(bound_sphere.radius);
/* Transform into world space. */
bound_sphere.center = data_.viewinv * bound_sphere.center;
/* Compute diagonal length. */
float2 p0 = float2(bbox.vec[0]) / (this->is_perspective() ? bbox.vec[0][2] : 1.0f);
float2 p1 = float2(bbox.vec[7]) / (this->is_perspective() ? bbox.vec[7][2] : 1.0f);
data_.screen_diagonal_length = math::distance(p0, p1);
}
/** \} */

View File

@@ -84,11 +84,6 @@ class Camera {
CameraDataBuf data_;
struct {
float3 center;
float radius;
} bound_sphere;
public:
Camera(Instance &inst) : inst_(inst){};
~Camera(){};
@@ -128,17 +123,6 @@ class Camera {
{
return *reinterpret_cast<const float3 *>(data_.viewinv[2]);
}
const float3 &bound_center() const
{
return bound_sphere.center;
}
const float &bound_radius() const
{
return bound_sphere.radius;
}
private:
void update_bounds();
};
/** \} */

View File

@@ -32,29 +32,15 @@
* SHADOW_TILEMAP_RES max is 32 because of the shared bitmaps used for LOD tagging.
* It is also limited by the maximum thread group size (1024).
*/
#define SHADOW_TILEMAP_RES 32
#define SHADOW_TILEMAP_LOD 5 /* LOG2(SHADOW_TILEMAP_RES) */
#define SHADOW_TILEMAP_LOD0_LEN ((SHADOW_TILEMAP_RES / 1) * (SHADOW_TILEMAP_RES / 1))
#define SHADOW_TILEMAP_LOD1_LEN ((SHADOW_TILEMAP_RES / 2) * (SHADOW_TILEMAP_RES / 2))
#define SHADOW_TILEMAP_LOD2_LEN ((SHADOW_TILEMAP_RES / 4) * (SHADOW_TILEMAP_RES / 4))
#define SHADOW_TILEMAP_LOD3_LEN ((SHADOW_TILEMAP_RES / 8) * (SHADOW_TILEMAP_RES / 8))
#define SHADOW_TILEMAP_LOD4_LEN ((SHADOW_TILEMAP_RES / 16) * (SHADOW_TILEMAP_RES / 16))
#define SHADOW_TILEMAP_LOD5_LEN ((SHADOW_TILEMAP_RES / 32) * (SHADOW_TILEMAP_RES / 32))
#define SHADOW_TILEMAP_RES 16
#define SHADOW_TILEMAP_LOD 4 /* LOG2(SHADOW_TILEMAP_RES) */
#define SHADOW_TILEMAP_PER_ROW 64
#define SHADOW_TILEDATA_PER_TILEMAP \
(SHADOW_TILEMAP_LOD0_LEN + SHADOW_TILEMAP_LOD1_LEN + SHADOW_TILEMAP_LOD2_LEN + \
SHADOW_TILEMAP_LOD3_LEN + SHADOW_TILEMAP_LOD4_LEN + SHADOW_TILEMAP_LOD5_LEN)
#define SHADOW_PAGE_CLEAR_GROUP_SIZE 32
#define SHADOW_PAGE_RES 256
#define SHADOW_DEPTH_SCAN_GROUP_SIZE 8
#define SHADOW_PAGE_COPY_GROUP_SIZE 32
#define SHADOW_DEPTH_SCAN_GROUP_SIZE 32
#define SHADOW_AABB_TAG_GROUP_SIZE 64
#define SHADOW_MAX_TILEMAP 4096
#define SHADOW_MAX_TILE (SHADOW_MAX_TILEMAP * SHADOW_TILEDATA_PER_TILEMAP)
#define SHADOW_MAX_PAGE 4096
#define SHADOW_PAGE_PER_ROW 64
#define SHADOW_ATLAS_SLOT 5
#define SHADOW_BOUNDS_GROUP_SIZE 64
#define SHADOW_VIEW_MAX 64 /* Must match DRW_VIEW_MAX. */
/* Ray-tracing. */
#define RAYTRACE_GROUP_SIZE 16
@@ -88,11 +74,6 @@
/* Resource bindings. */
/* Texture. */
#define SHADOW_TILEMAPS_TEX_SLOT 12
/* Only during surface shading. */
#define SHADOW_ATLAS_TEX_SLOT 13
/* Only during shadow rendering. */
#define SHADOW_RENDER_MAP_SLOT 13
#define RBUFS_UTILITY_TEX_SLOT 14
/* Images. */
@@ -118,10 +99,7 @@
#define LIGHT_BUF_SLOT 1
#define LIGHT_ZBIN_BUF_SLOT 2
#define LIGHT_TILE_BUF_SLOT 3
/* Only during surface shading. */
#define RBUFS_AOV_BUF_SLOT 5
/* Only during shadow rendering. */
#define SHADOW_PAGE_INFO_SLOT 5
#define SAMPLING_BUF_SLOT 6
#define CRYPTOMATTE_BUF_SLOT 7

View File

@@ -67,7 +67,6 @@ void Instance::init(const int2 &output_res,
film.init(output_res, output_rect);
velocity.init();
depth_of_field.init();
shadows.init();
motion_blur.init();
main_view.init();
}
@@ -103,7 +102,6 @@ void Instance::begin_sync()
materials.begin_sync();
velocity.begin_sync(); /* NOTE: Also syncs camera. */
lights.begin_sync();
shadows.begin_sync();
cryptomatte.begin_sync();
gpencil_engine_enabled = false;
@@ -199,7 +197,6 @@ void Instance::object_sync_render(void *instance_,
void Instance::end_sync()
{
velocity.end_sync();
shadows.end_sync(); /** \note: Needs to be before lights. */
lights.end_sync();
sampling.end_sync();
film.end_sync();

View File

@@ -27,7 +27,6 @@
#include "eevee_renderbuffers.hh"
#include "eevee_sampling.hh"
#include "eevee_shader.hh"
#include "eevee_shadow.hh"
#include "eevee_sync.hh"
#include "eevee_view.hh"
#include "eevee_world.hh"
@@ -47,7 +46,6 @@ class Instance {
SyncModule sync;
MaterialModule materials;
PipelineModule pipelines;
ShadowModule shadows;
LightModule lights;
VelocityModule velocity;
MotionBlurModule motion_blur;
@@ -91,7 +89,6 @@ class Instance {
sync(*this),
materials(*this),
pipelines(*this),
shadows(*this),
lights(*this),
velocity(*this),
motion_blur(*this),

View File

@@ -41,7 +41,7 @@ static eLightType to_light_type(short blender_light_type, short blender_area_typ
/** \name Light Object
* \{ */
void Light::sync(ShadowModule &shadows, const Object *ob, float threshold)
void Light::sync(/* ShadowModule &shadows , */ const Object *ob, float threshold)
{
const ::Light *la = (const ::Light *)ob->data;
float scale[3];
@@ -75,49 +75,67 @@ void Light::sync(ShadowModule &shadows, const Object *ob, float threshold)
this->volume_power = la->volume_fac * point_power;
eLightType new_type = to_light_type(la->type, la->area_shape);
if (assign_if_different(this->type, new_type)) {
shadow_discard_safe(shadows);
if (this->type != new_type) {
/* shadow_discard_safe(shadows); */
this->type = new_type;
}
#if 0
if (la->mode & LA_SHADOW) {
shadow_ensure(shadows);
if (is_sun_light(this->type)) {
this->directional->sync(this->object_mat, 1.0f);
if (la->type == LA_SUN) {
if (this->shadow_id == LIGHT_NO_SHADOW) {
this->shadow_id = shadows.directionals.alloc();
}
ShadowDirectional &shadow = shadows.directionals[this->shadow_id];
shadow.sync(this->object_mat, la->bias * 0.05f, 1.0f);
}
else {
this->punctual->sync(
this->type, this->object_mat, la->spotsize, la->clipsta, this->influence_radius_max);
float cone_aperture = DEG2RAD(360.0);
if (la->type == LA_SPOT) {
cone_aperture = min_ff(DEG2RAD(179.9), la->spotsize);
}
else if (la->type == LA_AREA) {
cone_aperture = DEG2RAD(179.9);
}
if (this->shadow_id == LIGHT_NO_SHADOW) {
this->shadow_id = shadows.punctuals.alloc();
}
ShadowPunctual &shadow = shadows.punctuals[this->shadow_id];
shadow.sync(this->type,
this->object_mat,
cone_aperture,
la->clipsta,
this->influence_radius_max,
la->bias * 0.05f);
}
}
else {
shadow_discard_safe(shadows);
}
#endif
this->initialized = true;
}
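/* Illustrative note (a reading of the patch, not authoritative): this
 * revision swaps the owning `directional` / `punctual` pointers for an
 * integer handle, `shadow_id`, allocated from pools in ShadowModule (the
 * pool code stays under #if 0 while the module is disabled). A plain handle
 * keeps Light trivially copyable, which the direct copy into the light
 * buffer in LightModule::end_sync() relies on. */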
#if 0
void Light::shadow_discard_safe(ShadowModule &shadows)
{
if (this->directional != nullptr) {
shadows.directional_pool.destruct(*directional);
this->directional = nullptr;
}
if (this->punctual != nullptr) {
shadows.punctual_pool.destruct(*punctual);
this->punctual = nullptr;
}
}
void Light::shadow_ensure(ShadowModule &shadows)
{
if (is_sun_light(this->type) && this->directional == nullptr) {
this->directional = &shadows.directional_pool.construct(shadows);
}
else if (this->punctual == nullptr) {
this->punctual = &shadows.punctual_pool.construct(shadows);
if (shadow_id != LIGHT_NO_SHADOW) {
if (this->type != LIGHT_SUN) {
shadows.punctuals.free(shadow_id);
}
else {
shadows.directionals.free(shadow_id);
}
shadow_id = LIGHT_NO_SHADOW;
}
}
#endif
/* Returns attenuation radius inverted & squared for easy bound checking inside the shader. */
float Light::attenuation_radius_get(const ::Light *la, float light_threshold, float light_power)
{
if (la->type == LA_SUN) {
@@ -243,14 +261,6 @@ void Light::debug_draw()
/** \name LightModule
* \{ */
LightModule::~LightModule()
{
/* WATCH: Destructor order. Expect shadow module to be destructed later. */
for (Light &light : light_map_.values()) {
light.shadow_discard_safe(inst_.shadows);
}
};
void LightModule::begin_sync()
{
use_scene_lights_ = inst_.use_scene_lights();
@@ -272,44 +282,61 @@ void LightModule::sync_light(const Object *ob, ObjectHandle &handle)
Light &light = light_map_.lookup_or_add_default(handle.object_key);
light.used = true;
if (handle.recalc != 0 || !light.initialized) {
light.initialized = true;
light.sync(inst_.shadows, ob, light_threshold_);
light.sync(/* inst_.shadows, */ ob, light_threshold_);
}
sun_lights_len_ += int(is_sun_light(light.type));
local_lights_len_ += int(!is_sun_light(light.type));
sun_lights_len_ += int(light.type == LIGHT_SUN);
local_lights_len_ += int(light.type != LIGHT_SUN);
}
void LightModule::end_sync()
{
// ShadowModule &shadows = inst_.shadows;
/* NOTE: We resize this buffer before removing deleted lights. */
int lights_allocated = ceil_to_multiple_u(max_ii(light_map_.size(), 1), LIGHT_CHUNK);
light_buf_.resize(lights_allocated);
/* Track light deletion. */
Vector<ObjectKey, 0> deleted_keys;
/* Indices inside GPU data array. */
int sun_lights_idx = 0;
int local_lights_idx = sun_lights_len_;
/* Fill GPU data with scene data. */
auto it_end = light_map_.items().end();
for (auto it = light_map_.items().begin(); it != it_end; ++it) {
Light &light = (*it).value;
for (auto item : light_map_.items()) {
Light &light = item.value;
if (!light.used) {
light_map_.remove(it);
/* Deleted light. */
deleted_keys.append(item.key);
// light.shadow_discard_safe(shadows);
continue;
}
int dst_idx = is_sun_light(light.type) ? sun_lights_idx++ : local_lights_idx++;
int dst_idx = (light.type == LIGHT_SUN) ? sun_lights_idx++ : local_lights_idx++;
/* Put all light data into global data SSBO. */
light_buf_[dst_idx] = light;
#if 0
if (light.shadow_id != LIGHT_NO_SHADOW) {
if (light.type == LIGHT_SUN) {
light_buf_[dst_idx].shadow_data = shadows.directionals[light.shadow_id];
}
else {
light_buf_[dst_idx].shadow_data = shadows.punctuals[light.shadow_id];
}
}
#endif
/* Untag for next sync. */
light.used = false;
}
/* This scene data buffer is then immutable after this point. */
light_buf_.push_update();
for (auto &key : deleted_keys) {
light_map_.remove(key);
}
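/* Illustrative sketch (assumption): removing entries from a blender::Map
 * while range-iterating it invalidates the iteration, which is why the
 * rewrite collects `deleted_keys` first and erases after the loop. The same
 * pattern in miniature:
 *
 *   Vector<ObjectKey> to_remove;
 *   for (auto item : map.items()) {
 *     if (!item.value.used) {
 *       to_remove.append(item.key);
 *     }
 *   }
 *   for (const ObjectKey &key : to_remove) {
 *     map.remove(key);
 *   }
 */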
/* Update sampling on deletion or un-hiding (use_scene_lights). */
if (assign_if_different(light_map_size_, light_map_.size())) {
inst_.sampling.reset();

View File

@@ -34,52 +34,25 @@
namespace blender::eevee {
class Instance;
class ShadowModule;
/* -------------------------------------------------------------------- */
/** \name Light Object
* \{ */
struct Light : public LightData, NonCopyable {
struct Light : public LightData {
public:
bool initialized = false;
bool used = false;
/** Pointers to source Shadow. Type depends on `LightData::type`. */
ShadowDirectional *directional = nullptr;
ShadowPunctual *punctual = nullptr;
public:
Light()
{
/* Avoid valgrind warning. */
this->type = LIGHT_SUN;
shadow_id = LIGHT_NO_SHADOW;
}
/* Only used for debugging. */
#ifndef NDEBUG
Light(Light &&other)
{
*static_cast<LightData *>(this) = other;
this->initialized = other.initialized;
this->used = other.used;
this->directional = other.directional;
this->punctual = other.punctual;
other.directional = nullptr;
other.punctual = nullptr;
}
void sync(/* ShadowModule &shadows, */ const Object *ob, float threshold);
~Light()
{
BLI_assert(directional == nullptr);
BLI_assert(punctual == nullptr);
}
#endif
void sync(ShadowModule &shadows, const Object *ob, float threshold);
void shadow_ensure(ShadowModule &shadows);
void shadow_discard_safe(ShadowModule &shadows);
// void shadow_discard_safe(ShadowModule &shadows);
void debug_draw();
@@ -100,7 +73,7 @@ struct Light : public LightData, NonCopyable {
* The light module manages light data buffers and light culling system.
*/
class LightModule {
friend ShadowModule;
// friend ShadowModule;
private:
/* Keep tile count reasonable for memory usage and 2D culling performance. */
@@ -152,7 +125,7 @@ class LightModule {
public:
LightModule(Instance &inst) : inst_(inst){};
~LightModule();
~LightModule(){};
void begin_sync();
void sync_light(const Object *ob, ObjectHandle &handle);
@@ -165,8 +138,21 @@ class LightModule {
void debug_draw(View &view, GPUFrameBuffer *view_fb);
void bind_resources(DRWShadingGroup *grp)
{
DRW_shgroup_storage_block_ref(grp, "light_buf", &culling_light_buf_);
DRW_shgroup_storage_block_ref(grp, "light_cull_buf", &culling_data_buf_);
DRW_shgroup_storage_block_ref(grp, "light_zbin_buf", &culling_zbin_buf_);
DRW_shgroup_storage_block_ref(grp, "light_tile_buf", &culling_tile_buf_);
#if 0
DRW_shgroup_uniform_texture(grp, "shadow_atlas_tx", inst_.shadows.atlas_tx_get());
DRW_shgroup_uniform_texture(grp, "shadow_tilemaps_tx", inst_.shadows.tilemap_tx_get());
#endif
}
template<typename T> void bind_resources(draw::detail::PassBase<T> *pass)
{
/* Storage Buf. */
pass->bind_ssbo(LIGHT_CULL_BUF_SLOT, &culling_data_buf_);
pass->bind_ssbo(LIGHT_BUF_SLOT, &culling_light_buf_);
pass->bind_ssbo(LIGHT_ZBIN_BUF_SLOT, &culling_zbin_buf_);

View File

@@ -300,9 +300,7 @@ MaterialArray &MaterialModule::material_array_get(Object *ob, bool has_motion)
for (auto i : IndexRange(materials_len)) {
::Material *blender_mat = material_from_slot(ob, i);
Material &mat = material_sync(ob, blender_mat, to_material_geometry(ob), has_motion);
/* \note: Perform a whole copy since the next material_sync() can move the Material memory
* location (i.e. because its container grows). */
material_array_.materials.append(mat);
material_array_.materials.append(&mat);
material_array_.gpu_materials.append(mat.shading.gpumat);
}
return material_array_;

View File

@@ -213,7 +213,7 @@ struct Material {
};
struct MaterialArray {
Vector<Material> materials;
Vector<Material *> materials;
Vector<GPUMaterial *> gpu_materials;
};

View File

@@ -46,8 +46,6 @@ void WorldPipeline::sync(GPUMaterial *gpumat)
world_ps_.bind_image("rp_emission_img", &rbufs.emission_tx);
world_ps_.bind_image("rp_cryptomatte_img", &rbufs.cryptomatte_tx);
world_ps_.bind_ubo(CAMERA_BUF_SLOT, inst_.camera.ubo_get());
world_ps_.draw(DRW_cache_fullscreen_quad_get(), handle);
/* To allow opaque pass rendering over it. */
world_ps_.barrier(GPU_BARRIER_SHADER_IMAGE_ACCESS);
@@ -60,39 +58,6 @@ void WorldPipeline::render(View &view)
/** \} */
/* -------------------------------------------------------------------- */
/** \name Shadow Pipeline
*
* \{ */
void ShadowPipeline::sync()
{
surface_ps_.init();
/* TODO(fclem): Add state for rendering to empty framebuffer without depth test.
* For now this is only here for avoiding the rasterizer discard state. */
surface_ps_.state_set(DRW_STATE_WRITE_DEPTH | DRW_STATE_DEPTH_LESS);
surface_ps_.bind_texture(RBUFS_UTILITY_TEX_SLOT, inst_.pipelines.utility_tx);
surface_ps_.bind_texture(SHADOW_RENDER_MAP_SLOT, &inst_.shadows.render_map_tx_);
surface_ps_.bind_image(SHADOW_ATLAS_SLOT, &inst_.shadows.atlas_tx_);
surface_ps_.bind_ubo(CAMERA_BUF_SLOT, inst_.camera.ubo_get());
surface_ps_.bind_ssbo(SHADOW_PAGE_INFO_SLOT, &inst_.shadows.pages_infos_data_);
inst_.sampling.bind_resources(&surface_ps_);
surface_ps_.framebuffer_set(&inst_.shadows.render_fb_);
}
PassMain::Sub *ShadowPipeline::surface_material_add(GPUMaterial *gpumat)
{
return &surface_ps_.sub(GPU_material_get_name(gpumat));
}
void ShadowPipeline::render(View &view)
{
inst_.manager->submit(surface_ps_, view);
}
/** \} */
/* -------------------------------------------------------------------- */
/** \name Forward Pass
*
@@ -158,7 +123,6 @@ void ForwardPipeline::sync()
opaque_ps_.bind_ubo(CAMERA_BUF_SLOT, inst_.camera.ubo_get());
inst_.lights.bind_resources(&opaque_ps_);
inst_.shadows.bind_resources(&opaque_ps_);
inst_.sampling.bind_resources(&opaque_ps_);
inst_.cryptomatte.bind_resources(&opaque_ps_);
}
@@ -181,10 +145,9 @@ void ForwardPipeline::sync()
/* Textures. */
sub.bind_texture(RBUFS_UTILITY_TEX_SLOT, inst_.pipelines.utility_tx);
/* Uniform Buf. */
sub.bind_ubo(CAMERA_BUF_SLOT, inst_.camera.ubo_get());
opaque_ps_.bind_ubo(CAMERA_BUF_SLOT, inst_.camera.ubo_get());
inst_.lights.bind_resources(&sub);
inst_.shadows.bind_resources(&sub);
inst_.sampling.bind_resources(&sub);
}
}
@@ -262,7 +225,7 @@ void ForwardPipeline::render(View &view,
// inst_.hiz_buffer.update();
// }
inst_.shadows.set_view(view);
// inst_.shadows.set_view(view, depth_tx);
GPU_framebuffer_bind(combined_fb);
inst_.manager->submit(opaque_ps_, view);

View File

@@ -43,28 +43,6 @@ class WorldPipeline {
/** \} */
/* -------------------------------------------------------------------- */
/** \name Shadow Pass
*
* \{ */
class ShadowPipeline {
private:
Instance &inst_;
PassMain surface_ps_ = {"Shadow.Surface"};
public:
ShadowPipeline(Instance &inst) : inst_(inst){};
PassMain::Sub *surface_material_add(GPUMaterial *gpumat);
void sync();
void render(View &view);
};
/** \} */
/* -------------------------------------------------------------------- */
/** \name Forward Pass
*
@@ -199,19 +177,19 @@ class PipelineModule {
WorldPipeline world;
// DeferredPipeline deferred;
ForwardPipeline forward;
ShadowPipeline shadow;
// ShadowPipeline shadow;
// VelocityPipeline velocity;
UtilityTexture utility_tx;
public:
PipelineModule(Instance &inst) : world(inst), forward(inst), shadow(inst){};
PipelineModule(Instance &inst) : world(inst), forward(inst){};
void sync()
{
// deferred.sync();
forward.sync();
shadow.sync();
// shadow.sync();
// velocity.sync();
}
@@ -249,7 +227,8 @@ class PipelineModule {
/* TODO(fclem) volume pass. */
return nullptr;
case MAT_PIPE_SHADOW:
return shadow.surface_material_add(gpumat);
// return shadow.material_add(blender_mat, gpumat);
break;
}
return nullptr;
}

View File

@@ -142,30 +142,6 @@ const char *ShaderModule::static_shader_create_info_name_get(eShaderType shader_
return "eevee_light_culling_tile";
case LIGHT_CULLING_ZBIN:
return "eevee_light_culling_zbin";
case SHADOW_DEBUG:
return "eevee_shadow_debug";
case SHADOW_PAGE_ALLOCATE:
return "eevee_shadow_page_allocate";
case SHADOW_PAGE_CLEAR:
return "eevee_shadow_page_clear";
case SHADOW_PAGE_DEFRAG:
return "eevee_shadow_page_defrag";
case SHADOW_PAGE_FREE:
return "eevee_shadow_page_free";
case SHADOW_PAGE_MASK:
return "eevee_shadow_page_mask";
case SHADOW_TILEMAP_BOUNDS:
return "eevee_shadow_tilemap_bounds";
case SHADOW_TILEMAP_FINALIZE:
return "eevee_shadow_tilemap_finalize";
case SHADOW_TILEMAP_INIT:
return "eevee_shadow_tilemap_init";
case SHADOW_TILEMAP_TAG_UPDATE:
return "eevee_shadow_tag_update";
case SHADOW_TILEMAP_TAG_USAGE_OPAQUE:
return "eevee_shadow_tag_usage_opaque";
case SHADOW_TILEMAP_TAG_USAGE_TRANSPARENT:
return "eevee_shadow_tag_usage_transparent";
/* To avoid compiler warning about missing case. */
case MAX_SHADER_TYPE:
return "";
@@ -222,17 +198,8 @@ void ShaderModule::material_create_info_ammend(GPUMaterial *gpumat, GPUCodegenOu
/* WORKAROUND: Avoid utility texture merge error. TODO: find a cleaner fix. */
for (auto &resource : info.batch_resources_) {
if (resource.bind_type == ShaderCreateInfo::Resource::BindType::SAMPLER) {
switch (resource.slot) {
case RBUFS_UTILITY_TEX_SLOT:
resource.slot = GPU_max_textures_frag() - 1;
break;
// case SHADOW_RENDER_MAP_SLOT: /* Does not compile because it is a define. */
case SHADOW_ATLAS_TEX_SLOT:
resource.slot = GPU_max_textures_frag() - 2;
break;
case SHADOW_TILEMAPS_TEX_SLOT:
resource.slot = GPU_max_textures_frag() - 3;
break;
if (resource.slot == RBUFS_UTILITY_TEX_SLOT) {
resource.slot = GPU_max_textures_frag() - 1;
}
}
}
@@ -247,10 +214,9 @@ void ShaderModule::material_create_info_ammend(GPUMaterial *gpumat, GPUCodegenOu
if (GPU_material_flag_get(gpumat, GPU_MATFLAG_TRANSPARENT) == false &&
pipeline_type == MAT_PIPE_FORWARD) {
/* Opaque forward does support AOVs and render passes if not using transparency. */
/* Opaque forward does support AOVs and render passes. */
info.additional_info("eevee_aov_out");
info.additional_info("eevee_render_pass_out");
info.additional_info("eevee_cryptomatte_out");
}
if (GPU_material_flag_get(gpumat, GPU_MATFLAG_BARYCENTRIC)) {
@@ -423,10 +389,8 @@ void ShaderModule::material_create_info_ammend(GPUMaterial *gpumat, GPUCodegenOu
break;
case MAT_PIPE_FORWARD_PREPASS:
case MAT_PIPE_DEFERRED_PREPASS:
info.additional_info("eevee_surf_depth");
break;
case MAT_PIPE_SHADOW:
info.additional_info("eevee_surf_shadow");
info.additional_info("eevee_surf_depth");
break;
case MAT_PIPE_DEFERRED:
info.additional_info("eevee_surf_deferred");

View File

@@ -62,19 +62,6 @@ enum eShaderType {
MOTION_BLUR_TILE_FLATTEN_RENDER,
MOTION_BLUR_TILE_FLATTEN_VIEWPORT,
SHADOW_DEBUG,
SHADOW_PAGE_ALLOCATE,
SHADOW_PAGE_CLEAR,
SHADOW_PAGE_DEFRAG,
SHADOW_PAGE_FREE,
SHADOW_PAGE_MASK,
SHADOW_TILEMAP_BOUNDS,
SHADOW_TILEMAP_FINALIZE,
SHADOW_TILEMAP_INIT,
SHADOW_TILEMAP_TAG_UPDATE,
SHADOW_TILEMAP_TAG_USAGE_OPAQUE,
SHADOW_TILEMAP_TAG_USAGE_TRANSPARENT,
MAX_SHADER_TYPE,
};

View File

@@ -21,9 +21,6 @@
namespace blender::eevee {
struct ShadowDirectional;
struct ShadowPunctual;
using namespace draw;
constexpr eGPUSamplerState no_filter = GPU_SAMPLER_DEFAULT;
@@ -49,21 +46,36 @@ enum eDebugMode : uint32_t {
*/
DEBUG_HIZ_VALIDATION = 2u,
/**
* Show tiles depending on their status.
* Draws the tile-maps to the screen. Also present in other modes.
* - Black pixels, no pages allocated.
* - Green pixels, pages cached.
* - Red pixels, pages allocated.
*/
DEBUG_SHADOW_TILEMAPS = 10u,
/**
* Show content of shadow map. Used to verify projection code.
* Random color per page. Validates page density allocation and sampling.
*/
DEBUG_SHADOW_VALUES = 11u,
DEBUG_SHADOW_PAGES = 11u,
/**
* Show random color for each tile. Verify allocation and LOD assignment.
* Outputs random color per tile-map (or tile-map level). Validates tile-map coverage.
* Black means not covered by any tile-maps LOD of the shadow.
*/
DEBUG_SHADOW_TILE_RANDOM_COLOR = 12u,
DEBUG_SHADOW_LOD = 12u,
/**
* Show random color for each tile. Verify distribution and LOD transitions.
* Outputs white pixels for pages allocated and black pixels for unused pages.
* This needs DEBUG_SHADOW_PAGE_ALLOCATION_ENABLED defined in order to work.
*/
DEBUG_SHADOW_TILEMAP_RANDOM_COLOR = 13u,
DEBUG_SHADOW_PAGE_ALLOCATION = 13u,
/**
* Outputs the tile-map atlas. Default tile-map is too big for the usual screen resolution.
* Try lowering SHADOW_TILEMAP_PER_ROW and SHADOW_MAX_TILEMAP before using this option.
*/
DEBUG_SHADOW_TILE_ALLOCATION = 14u,
/**
* Visualize linear depth stored in the atlas regions of the active light.
* This way, one can check if the rendering, the copying and the shadow sampling functions work.
*/
DEBUG_SHADOW_SHADOW_DEPTH = 15u
};
/** \} */
@@ -164,11 +176,6 @@ struct CameraData {
float clip_near;
float clip_far;
eCameraType type;
/** World space distance between view corners at unit distance from camera. */
float screen_diagonal_length;
float _pad0;
float _pad1;
float _pad2;
bool1 initialized;
@@ -494,7 +501,8 @@ static inline float regular_polygon_side_length(float sides_count)
* Start first corners at theta == 0. */
static inline float circle_to_polygon_radius(float sides_count, float theta)
{
/* From Graphics Gems from CryENGINE 3 (Siggraph 2013) by Tiago Sousa (slide 36). */
/* From Graphics Gems from CryENGINE 3 (Siggraph 2013) by Tiago Sousa (slide
* 36). */
float side_angle = (2.0f * M_PI) / sides_count;
return cosf(side_angle * 0.5f) /
cosf(theta - side_angle * floorf((sides_count * theta + M_PI) / (2.0f * M_PI)));
@@ -574,11 +582,10 @@ BLI_STATIC_ASSERT_ALIGN(LightCullingData, 16)
enum eLightType : uint32_t {
LIGHT_SUN = 0u,
LIGHT_SUN_ORTHO = 1u,
LIGHT_POINT = 10u,
LIGHT_SPOT = 11u,
LIGHT_RECT = 20u,
LIGHT_ELLIPSE = 21u
LIGHT_POINT = 1u,
LIGHT_SPOT = 2u,
LIGHT_RECT = 3u,
LIGHT_ELLIPSE = 4u
};
static inline bool is_area_light(eLightType type)
@@ -586,11 +593,6 @@ static inline bool is_area_light(eLightType type)
return type >= LIGHT_RECT;
}
static inline bool is_sun_light(eLightType type)
{
return type < LIGHT_POINT;
}
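/* Illustrative note: with the gap-numbered enum (sun types 0-1, point types 10-11, area types
 * 20-21), these range tests work by construction: `type < LIGHT_POINT` matches exactly
 * LIGHT_SUN and LIGHT_SUN_ORTHO, and `type >= LIGHT_RECT` matches both area types. */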
struct LightData {
/** Normalized object matrix. Last column contains data accessible using the following macros. */
float4x4 object_mat;
@@ -600,9 +602,6 @@ struct LightData {
#define _radius _area_size_x
#define _spot_mul object_mat[2][3]
#define _spot_bias object_mat[3][3]
/** Scale to convert from world units to tile space of the clipmap_lod_max. */
#define _clipmap_origin_x object_mat[2][3]
#define _clipmap_origin_y object_mat[3][3]
/** Aliases for axes. */
#ifndef USE_GPU_SHADER_CREATE_INFO
# define _right object_mat[0]
@@ -615,210 +614,34 @@ struct LightData {
# define _back object_mat[2].xyz
# define _position object_mat[3].xyz
#endif
/** Punctual : Influence radius (inverted and squared) adjusted for Surface / Volume power. */
/** Influence radius (inverted and squared) adjusted for Surface / Volume power. */
float influence_radius_invsqr_surface;
float influence_radius_invsqr_volume;
/** Punctual : Maximum influence radius. Used for culling. Equal to clip far distance. */
/** Maximum influence radius. Used for culling. */
float influence_radius_max;
/** Special radius factor for point lighting. */
float radius_squared;
/** Index of the shadow struct on CPU. -1 means no shadow. */
int shadow_id;
/** NOTE: It is ok to use float3 here. A float is declared right after it.
* float3 is also aligned to 16 bytes. */
float3 color;
/** Light Type. */
eLightType type;
/** Spot size. Aligned to size of float2. */
float2 spot_size_inv;
/** Spot angle tangent. */
float spot_tan;
/** Reused for directional LOD bias. */
#define _clipmap_lod_bias spot_tan
/** Power depending on shader type. */
float diffuse_power;
float specular_power;
float volume_power;
float transmit_power;
/** --- Shadow Data --- */
/** Directional : Near clip distance. Float stored as int for atomic operations. */
int clip_near;
int clip_far;
/** Directional : Clip-map lod range to avoid sampling outside of valid range. */
int clipmap_lod_min;
int clipmap_lod_max;
/** Index of the first tile-map. */
int tilemap_index;
/** Directional : Offset of the lod min in lod min tile units. */
int2 clipmap_base_offset;
/** Punctual & Directional : Normal matrix packed for automatic bias. */
float2 normal_mat_packed;
/** Special radius factor for point lighting. */
float radius_squared;
/** Light Type. */
eLightType type;
/** Spot angle tangent. */
float spot_tan;
/** Spot size. Aligned to size of float2. */
float2 spot_size_inv;
/** Associated shadow data. Only valid if shadow_id is not LIGHT_NO_SHADOW. */
// ShadowData shadow_data;
};
BLI_STATIC_ASSERT_ALIGN(LightData, 16)
static inline int light_tilemap_max_get(LightData light)
{
/* This is not something we need in performance critical code. */
return light.tilemap_index + (light.clipmap_lod_max - light.clipmap_lod_min);
}
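/* Illustrative example (hypothetical values): with tilemap_index = 10, clipmap_lod_min = 2 and
 * clipmap_lod_max = 5, this returns 13, i.e. the light owns the contiguous tile-map range
 * [10..13], one tile-map per clip-map level. */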
/** \} */
/* -------------------------------------------------------------------- */
/** \name Shadows
*
* Shadow data for either a directional shadow or a punctual shadow.
*
* A punctual shadow is composed of 1, 5 or 6 shadow regions.
* Regions are sorted in this order -Z, +X, -X, +Y, -Y, +Z.
* Face index is computed from light's object space coordinates.
*
* A directional light shadow is composed of multiple clip-maps with each level
* covering twice as much area as the previous one.
* \{ */
enum eShadowProjectionType : uint32_t {
SHADOW_PROJECTION_CUBEFACE = 0u,
SHADOW_PROJECTION_CLIPMAP = 1u,
SHADOW_PROJECTION_CASCADE = 2u,
};
static inline int2 shadow_cascade_grid_offset(int2 base_offset, int level_relative)
{
return (base_offset * level_relative) / (1 << 16);
}
/**
* Small descriptor used for the tile update phase. Updated by CPU & uploaded to GPU each redraw.
*/
struct ShadowTileMapData {
/** Cached, used for rendering. */
float4x4 viewmat, winmat;
/** Punctual : Corners of the frustum. (vec3 padded to vec4) */
float4 corners[4];
/** Integer offset of the center of the 16x16 tiles from the origin of the tile space. */
int2 grid_offset;
/** Shift between previous and current grid_offset. Allows update tagging. */
int2 grid_shift;
/** True for punctual lights. */
eShadowProjectionType projection_type;
/** Multiple of SHADOW_TILEDATA_PER_TILEMAP. Offset inside the tile buffer. */
int tiles_index;
/** Index of persistent data in the persistent data buffer. */
int clip_data_index;
/** Bias LOD to tag for usage to lower the amount of tile used. */
float lod_bias;
};
BLI_STATIC_ASSERT_ALIGN(ShadowTileMapData, 16)
/**
* Per tilemap data persistent on GPU.
*/
struct ShadowTileMapClip {
/** Clip distances that were used to render the pages. */
float clip_near_stored;
float clip_far_stored;
/** Near and far clip distances for directional. Float stored as int for atomic operations. */
int clip_near;
int clip_far;
};
BLI_STATIC_ASSERT_ALIGN(ShadowTileMapClip, 16)
struct ShadowPagesInfoData {
/** Number of free pages in the free page buffer. */
int page_free_count;
/** Number of page allocations needed for this cycle. */
int page_alloc_count;
/** Index of the next cache page in the cached page buffer. */
uint page_cached_next;
/** Index of the first page in the buffer since the last defrag. */
uint page_cached_start;
/** Index of the last page in the buffer since the last defrag. */
uint page_cached_end;
/** Number of views to be rendered during the shadow update pass. */
int view_count;
/** Physical page size in pixels. Pages are all square. */
int page_size;
int _pad0;
};
BLI_STATIC_ASSERT_ALIGN(ShadowPagesInfoData, 16)
struct ShadowStatistics {
/** Statistics that are read back to CPU after a few frames (to avoid stalls). */
int page_used_count;
int page_update_count;
int page_allocated_count;
int page_rendered_count;
};
BLI_STATIC_ASSERT_ALIGN(ShadowStatistics, 16)
/** Decoded tile data structure. */
struct ShadowTileData {
/** Page inside the virtual shadow map atlas. */
uint2 page;
/** Page index inside pages_cached_buf. Only valid if `is_cached` is true. */
uint cache_index;
/** LOD of the page this tile points to, relative to LOD 0. (cubemap only) */
uint lod;
/** If the tile is needed for rendering. */
bool is_used;
/** True if an update is needed. This persists even if the tile gets unused. */
bool do_update;
/** True if the tile owns the page (mutually exclusive with `is_cached`). */
bool is_allocated;
/** True if the tile has been staged for rendering. This will remove the `do_update` flag. */
bool is_rendered;
/** True if the tile is inside the pages_cached_buf (mutually exclusive with `is_allocated`). */
bool is_cached;
};
/** \note Stored packed as a uint. */
#define ShadowTileDataPacked uint
enum eShadowFlag : uint32_t {
SHADOW_NO_DATA = 0u,
SHADOW_IS_CACHED = (1u << 27u),
SHADOW_IS_ALLOCATED = (1u << 28u),
SHADOW_DO_UPDATE = (1u << 29u),
SHADOW_IS_RENDERED = (1u << 30u),
SHADOW_IS_USED = (1u << 31u)
};
static inline ShadowTileData shadow_tile_unpack(ShadowTileDataPacked data)
{
ShadowTileData tile;
/* Tweaked for SHADOW_PAGE_PER_ROW = 64. */
tile.page.x = data & 63u;
tile.page.y = (data >> 6u) & 63u;
/* -- 12 bits -- */
/* Tweaked for SHADOW_TILEMAP_LOD < 8. */
tile.lod = (data >> 12u) & 7u;
/* -- 15 bits -- */
/* Tweaked for SHADOW_MAX_TILEMAP = 4096. */
tile.cache_index = (data >> 15u) & 4095u;
/* -- 27 bits -- */
tile.is_used = (data & SHADOW_IS_USED) != 0;
tile.is_cached = (data & SHADOW_IS_CACHED) != 0;
tile.is_allocated = (data & SHADOW_IS_ALLOCATED) != 0;
tile.is_rendered = (data & SHADOW_IS_RENDERED) != 0;
tile.do_update = (data & SHADOW_DO_UPDATE) != 0;
return tile;
}
static inline ShadowTileDataPacked shadow_tile_pack(ShadowTileData tile)
{
uint data;
data = (tile.page.x & 63u);
data |= (tile.page.y & 63u) << 6u;
data |= (tile.lod & 7u) << 12u;
data |= (tile.cache_index & 4095u) << 15u;
data |= (tile.is_used ? SHADOW_IS_USED : 0);
data |= (tile.is_allocated ? SHADOW_IS_ALLOCATED : 0);
data |= (tile.is_cached ? SHADOW_IS_CACHED : 0);
data |= (tile.is_rendered ? SHADOW_IS_RENDERED : 0);
data |= (tile.do_update ? SHADOW_DO_UPDATE : 0);
return data;
}
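/* Minimal round-trip sketch (illustrative only, not part of the module): any tile whose fields
 * fit the bit budget documented above should survive a pack/unpack cycle unchanged. */
void shadow_tile_pack_roundtrip_example()
{
  ShadowTileData tile;
  tile.page = uvec2(3u, 7u); /* 6 bits each (SHADOW_PAGE_PER_ROW = 64). */
  tile.lod = 2u;             /* 3 bits (SHADOW_TILEMAP_LOD < 8). */
  tile.cache_index = 100u;   /* 12 bits (SHADOW_MAX_TILEMAP = 4096). */
  tile.is_used = true;
  tile.do_update = false;
  tile.is_allocated = true;
  tile.is_rendered = false;
  tile.is_cached = false;
  ShadowTileData out_tile = shadow_tile_unpack(shadow_tile_pack(tile));
  /* Expect: out_tile.page == uvec2(3u, 7u), out_tile.lod == 2u,
   * out_tile.cache_index == 100u, and all five flags identical. */
}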
/** \} */
/* -------------------------------------------------------------------- */
@@ -938,13 +761,6 @@ using LightDataBuf = draw::StorageArrayBuffer<LightData, LIGHT_CHUNK>;
using MotionBlurDataBuf = draw::UniformBuffer<MotionBlurData>;
using MotionBlurTileIndirectionBuf = draw::StorageBuffer<MotionBlurTileIndirection, true>;
using SamplingDataBuf = draw::StorageBuffer<SamplingData>;
using ShadowStatisticsBuf = draw::StorageBuffer<ShadowStatistics>;
using ShadowPagesInfoDataBuf = draw::StorageBuffer<ShadowPagesInfoData>;
using ShadowPageHeapBuf = draw::StorageVectorBuffer<uint, SHADOW_MAX_PAGE>;
using ShadowPageCacheBuf = draw::StorageArrayBuffer<uint2, SHADOW_MAX_PAGE, true>;
using ShadowTileMapDataBuf = draw::StorageVectorBuffer<ShadowTileMapData, SHADOW_MAX_TILEMAP>;
using ShadowTileMapClipBuf = draw::StorageArrayBuffer<ShadowTileMapClip, SHADOW_MAX_TILEMAP, true>;
using ShadowTileDataBuf = draw::StorageArrayBuffer<ShadowTileDataPacked, SHADOW_MAX_TILE, true>;
using VelocityGeometryBuf = draw::StorageArrayBuffer<float4, 16, true>;
using VelocityIndexBuf = draw::StorageArrayBuffer<VelocityIndex, 16>;
using VelocityObjectBuf = draw::StorageArrayBuffer<float4x4, 16>;

File diff suppressed because it is too large

View File

@@ -1,449 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0-or-later
* Copyright 2022 Blender Foundation.
*/
/** \file
* \ingroup eevee
*
* The shadow module manages shadow update tagging & shadow rendering.
*/
#pragma once
#include "BLI_pool.hh"
#include "BLI_vector.hh"
#include "GPU_batch.h"
#include "eevee_material.hh"
#include "eevee_shader.hh"
#include "eevee_shader_shared.hh"
namespace blender::eevee {
class Instance;
class ShadowModule;
class ShadowPipeline;
struct Light;
enum eCubeFace {
/* Ordering by culling order. If cone aperture is shallow, we cull the later view. */
Z_NEG = 0,
X_POS,
X_NEG,
Y_POS,
Y_NEG,
Z_POS,
};
/* To be applied after view matrix. Follow same order as eCubeFace. */
constexpr static const float shadow_face_mat[6][4][4] = {
{{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}, /* Z_NEG */
{{0, 0, -1, 0}, {-1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 0, 1}}, /* X_POS */
{{0, 0, 1, 0}, {1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 0, 1}}, /* X_NEG */
{{1, 0, 0, 0}, {0, 0, -1, 0}, {0, 1, 0, 0}, {0, 0, 0, 1}}, /* Y_POS */
{{-1, 0, 0, 0}, {0, 0, 1, 0}, {0, 1, 0, 0}, {0, 0, 0, 1}}, /* Y_NEG */
{{1, 0, 0, 0}, {0, -1, 0, 0}, {0, 0, -1, 0}, {0, 0, 0, 1}}, /* Z_POS */
};
/* Converts to [-SHADOW_TILEMAP_RES / 2..SHADOW_TILEMAP_RES / 2] for XY and [0..1] for Z. */
constexpr static const float shadow_clipmap_scale_mat[4][4] = {{SHADOW_TILEMAP_RES / 2, 0, 0, 0},
{0, SHADOW_TILEMAP_RES / 2, 0, 0},
{0, 0, 0.5, 0},
{0, 0, 0.5, 1}};
/* -------------------------------------------------------------------- */
/** \name Tile-Map
*
* Stores indirection table and states of each tile of a virtual shadow-map.
* One tile-map has an effective resolution of `pagesize * tile_map_resolution`.
* Each tile-map's overhead is quite small if it does not have any pages allocated.
*
* \{ */
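/* For illustration: with a hypothetical 256 px page size and a 16x16 tile-map, one tile-map
 * would provide 256 * 16 = 4096 px of effective shadow resolution per axis. */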
struct ShadowTileMap : public ShadowTileMapData {
static constexpr int64_t tile_map_resolution = SHADOW_TILEMAP_RES;
static constexpr int64_t tiles_count = tile_map_resolution * tile_map_resolution;
/** Level of detail for clipmap. */
int level = INT_MAX;
/** Cube face index. */
eCubeFace cubeface = Z_NEG;
/** Cached, used for detecting updates. */
float4x4 object_mat;
/** Near and far clip distances. For clip-map, computed on the GPU using casters BBoxes. */
float near, far;
public:
ShadowTileMap(int tiles_index_)
{
tiles_index = tiles_index_;
/* For now just the same index. */
clip_data_index = tiles_index_ / SHADOW_TILEDATA_PER_TILEMAP;
this->set_dirty();
}
void sync_orthographic(const float4x4 &object_mat_,
int2 origin_offset,
int clipmap_level,
float lod_bias_,
eShadowProjectionType projection_type_);
void sync_cubeface(
const float4x4 &object_mat, float near, float far, eCubeFace face, float lod_bias_);
void debug_draw() const;
void set_dirty()
{
grid_shift = int2(SHADOW_TILEMAP_RES);
}
void set_updated()
{
grid_shift = int2(0);
}
};
/**
* The tile-maps are managed on CPU and associated with each light shadow object.
*
* The number of tile-maps & tiles is unbounded (to the limit of SSBOs), but the actual number
* used for rendering is capped to 4096. This is to simplify tile-map management on CPU.
*
* At sync end, all tile-maps are grouped by light inside the ShadowTileMapDataBuf so that each
* light has a contiguous range of tile-maps to refer to.
*/
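/* For illustration: a punctual light needing all 6 cube faces would own 6 consecutive entries
 * in that range, addressable as `light.tilemap_index + face` (see eCubeFace order above). */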
struct ShadowTileMapPool {
public:
/** Limit the width of the texture. */
static constexpr int64_t maps_per_row = SHADOW_TILEMAP_PER_ROW;
/** Vector containing available offsets to tile ranges in the ShadowTileDataBuf. */
Vector<uint> free_indices;
/** Pool containing the shadow tile-map structures on CPU. */
Pool<ShadowTileMap> tilemap_pool;
/** Sorted descriptions for each tilemap in the pool. Updated each frame. */
ShadowTileMapDataBuf tilemaps_data = {"tilemaps_data"};
/** Previously used tile-maps that need to release their tiles/pages. Updated each frame. */
ShadowTileMapDataBuf tilemaps_unused = {"tilemaps_unused"};
/** All possible tiles. A range of tiles is referenced by each tile-map. */
ShadowTileDataBuf tiles_data = {"tiles_data"};
/** Clip range for directional shadows. Updated on GPU. Persistent. */
ShadowTileMapClipBuf tilemaps_clip = {"tilemaps_clip"};
/** Texture equivalent of ShadowTileDataBuf but grouped by light. */
Texture tilemap_tx = {"tilemap_tx"};
/** Number of free tile-maps at the end of the previous sync. */
int64_t last_free_len = 0;
public:
ShadowTileMapPool();
ShadowTileMap *acquire();
/**
* Push the given list of ShadowTileMap onto the free stack. Their pages will be freed.
*/
void release(Span<ShadowTileMap *> free_list);
void end_sync(ShadowModule &module);
};
/** \} */
/* -------------------------------------------------------------------- */
/** \name Shadow Casters & Receivers
*
* \{ */
/* Can be either a shadow caster or a shadow receiver. */
struct ShadowObject {
ResourceHandle resource_handle = {0};
bool used = true;
};
/** \} */
/* -------------------------------------------------------------------- */
/** \name ShadowModule
*
* Manages shadow atlas and shadow region data.
* \{ */
class ShadowModule {
friend ShadowPunctual;
friend ShadowDirectional;
friend ShadowPipeline;
friend ShadowTileMapPool;
public:
/** Need to be first because of destructor order. */
ShadowTileMapPool tilemap_pool;
Pool<ShadowPunctual> punctual_pool;
Pool<ShadowDirectional> directional_pool;
private:
Instance &inst_;
/** Map of shadow casters to track deletion & update of intersected shadows. */
Map<ObjectKey, ShadowObject> objects_;
/* -------------------------------------------------------------------- */
/** \name Tilemap Management
* \{ */
PassSimple tilemap_setup_ps_ = {"TilemapSetup"};
PassMain tilemap_usage_ps_ = {"TagUsage"};
PassSimple tilemap_update_ps_ = {"TilemapUpdate"};
PassMain::Sub *tilemap_usage_transparent_ps_ = nullptr;
GPUBatch *box_batch_ = nullptr;
Framebuffer usage_tag_fb;
/** List of Resource IDs (to get bounds) for tagging passes. */
StorageVectorBuffer<uint, 128> past_casters_updated_ = {"PastCastersUpdated"};
StorageVectorBuffer<uint, 128> curr_casters_updated_ = {"CurrCastersUpdated"};
/** List of Resource IDs (to get bounds) for getting minimum clip-maps bounds. */
StorageVectorBuffer<uint, 128> curr_casters_ = {"CurrCasters"};
/** Indirect arguments for page clearing. */
StorageBuffer<DispatchCommand> clear_dispatch_buf_;
/** Pages to clear. */
StorageArrayBuffer<uint, SHADOW_MAX_PAGE> clear_page_buf_ = {"clear_page_buf"};
int3 dispatch_depth_scan_size_;
/* Ratio between tilemap pixel world "radius" and film pixel world "radius". */
float tilemap_projection_ratio_;
/* Statistics that are read back to CPU after a few frames (to avoid stalls). */
SwapChain<ShadowStatisticsBuf, 5> statistics_buf_;
/** \} */
/* -------------------------------------------------------------------- */
/** \name Page Management
* \{ */
static constexpr eGPUTextureFormat atlas_type = GPU_R32UI;
/** Atlas containing all physical pages. */
Texture atlas_tx_ = {"shadow_atlas_tx_"};
/** Pool of unallocated pages waiting to be assigned to specific tiles in the tilemap atlas. */
ShadowPageHeapBuf pages_free_data_ = {"PagesFreeBuf"};
/** Pool of cached tiles waiting to be reused. */
ShadowPageCacheBuf pages_cached_data_ = {"PagesCachedBuf"};
/** Info for bookkeeping and debugging. */
ShadowPagesInfoDataBuf pages_infos_data_ = {"PagesInfosBuf"};
int3 copy_dispatch_size_;
int3 scan_dispatch_size_;
int rendering_tilemap_;
int rendering_lod_;
bool do_full_update = true;
/** \} */
/* -------------------------------------------------------------------- */
/** \name Rendering
* \{ */
/** Multi-View containing a maximum of 64 views to be rendered with the shadow pipeline. */
View shadow_multi_view_ = {"ShadowMultiView", SHADOW_VIEW_MAX, true};
/** Tile to physical page mapping. This is an array texture with one layer per view. */
Texture render_map_tx_ = {"ShadowRenderMap",
GPU_R32UI,
GPU_TEXTURE_USAGE_SHADER_READ | GPU_TEXTURE_USAGE_SHADER_WRITE,
int2(SHADOW_TILEMAP_RES),
64,
nullptr,
SHADOW_TILEMAP_LOD + 1};
/** An empty frame-buffer (no attachment) the size of a whole tilemap. */
Framebuffer render_fb_;
/** \} */
/* -------------------------------------------------------------------- */
/** \name Debugging
* \{ */
/** Display information about the virtual shadows. */
PassSimple debug_draw_ps_ = {"Shadow.Debug"};
/** \} */
/** Scene immutable parameters. */
/** For now, needs to be hardcoded. */
int shadow_page_size_ = SHADOW_PAGE_RES;
/** Amount of bias to apply to the LOD computed at the tile usage tagging stage. */
float lod_bias_ = 0.0f;
/** Maximum number of allocated pages. Maximum value is SHADOW_MAX_TILEMAP. */
int shadow_page_len_ = SHADOW_MAX_TILEMAP;
public:
ShadowModule(Instance &inst);
~ShadowModule(){};
void init();
void begin_sync();
/** Register a shadow caster or receiver. */
void sync_object(const ObjectHandle &handle,
const ResourceHandle &resource_handle,
bool is_shadow_caster,
bool is_alpha_blend);
void end_sync();
void set_lights_data();
void set_view(View &view);
void debug_end_sync();
void debug_draw(View &view, GPUFrameBuffer *view_fb);
template<typename T> void bind_resources(draw::detail::PassBase<T> *pass)
{
pass->bind_texture(SHADOW_ATLAS_TEX_SLOT, &atlas_tx_);
pass->bind_texture(SHADOW_TILEMAPS_TEX_SLOT, &tilemap_pool.tilemap_tx);
}
private:
void remove_unused();
void debug_page_map_call(DRWPass *pass);
/** Compute approximate screen pixel space radius. */
float screen_pixel_radius(const View &view, const int2 &extent);
/** Compute approximate punctual shadow pixel world space radius, 1 unit away from the light. */
float tilemap_pixel_radius();
};
/** \} */
/* -------------------------------------------------------------------- */
/** \name Shadow
*
* A shadow component is associated to a `eevee::Light` and manages its associated Tile-maps.
* \{ */
class ShadowPunctual : public NonCopyable, NonMovable {
private:
ShadowModule &shadows_;
/** Tile-map for each cube-face needed (in eCubeFace order). */
Vector<ShadowTileMap *> tilemaps_;
/** Area light size. */
float size_x_, size_y_;
/** Shape type. */
eLightType light_type_;
/** Random position on the light. In world space. */
float3 random_offset_;
/** Light position. */
float3 position_;
/** Near and far clip distances. */
float far_, near_;
/** Number of tile-maps needed to cover the light angular extents. */
int tilemaps_needed_;
/** Visibility cone angle from the light source. */
int cone_aperture_;
public:
ShadowPunctual(ShadowModule &module) : shadows_(module){};
ShadowPunctual(ShadowPunctual &&other)
: shadows_(other.shadows_), tilemaps_(std::move(other.tilemaps_)){};
~ShadowPunctual()
{
shadows_.tilemap_pool.release(tilemaps_);
}
/**
* Sync shadow parameters but do not allocate any shadow tile-maps.
*/
void sync(eLightType light_type,
const float4x4 &object_mat,
float cone_aperture,
float near_clip,
float far_clip);
/**
* Release the tile-maps that will not be used in the current frame.
*/
void release_excess_tilemaps();
/**
* Allocate shadow tile-maps and setup views for rendering.
*/
void end_sync(Light &light, float lod_bias);
};
class ShadowDirectional : public NonCopyable, NonMovable {
private:
ShadowModule &shadows_;
/** Tile-map for each clip-map level. */
Vector<ShadowTileMap *> tilemaps_;
/** User minimum resolution. */
float min_resolution_;
/** Copy of object matrix. Normalized. */
float4x4 object_mat_;
/** Current range of clip-map / cascade levels covered by this shadow. */
IndexRange levels_range;
public:
ShadowDirectional(ShadowModule &module) : shadows_(module){};
ShadowDirectional(ShadowDirectional &&other)
: shadows_(other.shadows_), tilemaps_(std::move(other.tilemaps_)){};
~ShadowDirectional()
{
shadows_.tilemap_pool.release(tilemaps_);
}
/**
* Sync shadow parameters but do not allocate any shadow tile-maps.
*/
void sync(const float4x4 &object_mat, float min_resolution);
/**
* Release the tile-maps that will not be used in the current frame.
*/
void release_excess_tilemaps(const Camera &camera, float lod_bias);
/**
* Allocate shadow tile-maps and setup views for rendering.
*/
void end_sync(Light &light, const Camera &camera, float lod_bias);
/* Return coverage of the whole tilemap in world units. */
static float coverage_get(int lvl)
{
/* This function should be kept in sync with shadow_directional_level(). */
/* \note: If we were to introduce a global scaling option, it would be here. */
return exp2(lvl);
}
/* Return coverage of a single tile for a tilemap of this LOD in world units. */
static float tile_size_get(int lvl)
{
return coverage_get(lvl) / SHADOW_TILEMAP_RES;
}
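/* Illustrative example: at lvl = 2 a tile-map covers exp2(2) = 4 world units, so with a
 * hypothetical SHADOW_TILEMAP_RES of 16 each tile covers 4.0 / 16 = 0.25 world units. */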
private:
IndexRange clipmap_level_range(const Camera &camera);
IndexRange cascade_level_range(const Camera &camera, float lod_bias);
void cascade_tilemaps_distribution(Light &light, const Camera &camera);
void clipmap_tilemaps_distribution(Light &light, const Camera &camera, float lod_bias);
void cascade_tilemaps_distribution_near_far_points(const Camera &camera,
float3 &near_point,
float3 &far_point);
/* Choose between clipmap and cascade distribution of shadowmap precision depending on the camera
* projection type and bounds. */
static eShadowProjectionType directional_distribution_type_get(const Camera &camera);
};
/** \} */
} // namespace blender::eevee

View File

@@ -41,7 +41,6 @@ ObjectHandle &SyncModule::sync_object(Object *ob)
ObjectHandle &eevee_dd = *reinterpret_cast<ObjectHandle *>(dd);
if (eevee_dd.object_key.ob == nullptr) {
ob = DEG_get_original_object(ob);
eevee_dd.object_key = ObjectKey(ob);
}
@@ -49,6 +48,7 @@ ObjectHandle &SyncModule::sync_object(Object *ob)
ID_RECALC_GEOMETRY;
if ((eevee_dd.recalc & recalc_flags) != 0) {
inst_.sampling.reset();
UNUSED_VARS(inst_);
}
return eevee_dd;
@@ -127,13 +127,13 @@ void SyncModule::sync_mesh(Object *ob,
if (geom == nullptr) {
continue;
}
Material &material = material_array.materials[i];
geometry_call(material.shading.sub_pass, geom, res_handle);
geometry_call(material.prepass.sub_pass, geom, res_handle);
geometry_call(material.shadow.sub_pass, geom, res_handle);
Material *material = material_array.materials[i];
geometry_call(material->shading.sub_pass, geom, res_handle);
geometry_call(material->prepass.sub_pass, geom, res_handle);
geometry_call(material->shadow.sub_pass, geom, res_handle);
is_shadow_caster = is_shadow_caster || material.shadow.sub_pass != nullptr;
is_alpha_blend = is_alpha_blend || material.is_alpha_blend_transparent;
is_shadow_caster = is_shadow_caster || material->shadow.sub_pass != nullptr;
is_alpha_blend = is_alpha_blend || material->is_alpha_blend_transparent;
GPUMaterial *gpu_material = material_array.gpu_materials[i];
::Material *mat = GPU_material_get_material(gpu_material);
@@ -141,9 +141,8 @@ void SyncModule::sync_mesh(Object *ob,
}
inst_.manager->extract_object_attributes(res_handle, ob_ref, material_array.gpu_materials);
inst_.shadows.sync_object(ob_handle, res_handle, is_shadow_caster, is_alpha_blend);
inst_.cryptomatte.sync_object(ob, res_handle);
// shadows.sync_object(ob, ob_handle, is_shadow_caster, is_alpha_blend);
}
/** \} */
@@ -237,7 +236,7 @@ static void gpencil_stroke_sync(bGPDlayer * /*gpl*/,
{
gpIterData &iter = *(gpIterData *)thunk;
Material *material = &iter.material_array.materials[gps->mat_nr];
Material *material = iter.material_array.materials[gps->mat_nr];
MaterialGPencilStyle *gp_style = BKE_gpencil_material_settings(iter.ob, gps->mat_nr + 1);
bool hide_material = (gp_style->flag & GP_MATERIAL_HIDE) != 0;
@@ -281,9 +280,9 @@ void SyncModule::sync_gpencil(Object *ob, ObjectHandle &ob_handle, ResourceHandl
gpencil_drawcall_flush(iter);
bool is_caster = true; /* TODO material.shadow.sub_pass. */
bool is_alpha_blend = true; /* TODO material.is_alpha_blend. */
inst_.shadows.sync_object(ob_handle, res_handle, is_caster, is_alpha_blend);
// bool is_caster = true; /* TODO material.shadow.sub_pass. */
// bool is_alpha_blend = true; /* TODO material.is_alpha_blend. */
// shadows.sync_object(ob, ob_handle, is_caster, is_alpha_blend);
}
/** \} */
@@ -348,9 +347,9 @@ void SyncModule::sync_curves(Object *ob,
/* TODO(fclem) Hair velocity. */
// shading_passes.velocity.gpencil_add(ob, ob_handle);
bool is_caster = material.shadow.sub_pass != nullptr;
bool is_alpha_blend = material.is_alpha_blend_transparent;
inst_.shadows.sync_object(ob_handle, res_handle, is_caster, is_alpha_blend);
// bool is_caster = material.shadow.sub_pass != nullptr;
// bool is_alpha_blend = material.is_alpha_blend_transparent;
// shadows.sync_object(ob, ob_handle, is_caster, is_alpha_blend);
}
/** \} */

View File

@@ -259,7 +259,7 @@ void VelocityModule::end_sync()
{
Vector<ObjectKey, 0> deleted_obj;
uint32_t max_resource_id_ = 1u;
uint32_t max_resource_id_ = 0u;
for (Map<ObjectKey, VelocityObjectData>::Item item : velocity_map.items()) {
if (item.value.obj.resource_id == uint32_t(-1)) {

View File

@@ -134,12 +134,13 @@ void ShadingView::render()
inst_.lights.debug_draw(render_view_new_, combined_fb_);
inst_.hiz_buffer.debug_draw(render_view_new_, combined_fb_);
inst_.shadows.debug_draw(render_view_new_, combined_fb_);
GPUTexture *combined_final_tx = render_postfx(rbufs.combined_tx);
inst_.film.accumulate(sub_view_, combined_final_tx);
// inst_.shadows.debug_draw();
rbufs.release();
postfx_tx_.release();

View File

@@ -16,7 +16,6 @@
# ifdef OBINFO_LIB
vec3 attr_load_orco(vec4 orco)
{
# ifdef GPU_VERTEX_SHADER
/* We know there is no orco layer when orco.w is 1.0 because it uses the generic vertex
* attribute (which is [0,0,0,1]). */
if (orco.w == 1.0) {
@@ -24,7 +23,6 @@ vec3 attr_load_orco(vec4 orco)
* using the orco_madd factors. */
return OrcoTexCoFactors[0].xyz + pos * OrcoTexCoFactors[1].xyz;
}
# endif
return orco.xyz * 0.5 + 0.5;
}
# endif

View File

@@ -9,11 +9,6 @@
void main()
{
DRW_VIEW_FROM_RESOURCE_ID;
#ifdef MAT_SHADOW
shadow_interp.view_id = drw_view_id;
#endif
init_interface();
vec3 T;

View File

@@ -7,11 +7,6 @@
void main()
{
DRW_VIEW_FROM_RESOURCE_ID;
#ifdef MAT_SHADOW
shadow_interp.view_id = drw_view_id;
#endif
init_interface();
/* TODO(fclem): Expose through a node? */

View File

@@ -7,11 +7,6 @@
void main()
{
DRW_VIEW_FROM_RESOURCE_ID;
#ifdef MAT_SHADOW
shadow_interp.view_id = drw_view_id;
#endif
init_interface();
interp.P = point_object_to_world(pos);

View File

@@ -22,7 +22,7 @@ void main()
}
/* Sun lights are packed at the end of the array. Perform early copy. */
if (is_sun_light(light.type)) {
if (light.type == LIGHT_SUN) {
/* NOTE: We know the index because sun lights are packed at the start of the input buffer. */
out_light_buf[light_cull_buf.local_lights_len + l_idx] = light;
return;

View File

@@ -12,7 +12,6 @@
*/
#pragma BLENDER_REQUIRE(eevee_light_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_shadow_lib.glsl)
#pragma BLENDER_REQUIRE(gpu_shader_codegen_lib.glsl)
/* TODO(fclem): We could reduce register pressure by only having static branches for sun lights. */
@@ -20,7 +19,6 @@ void light_eval_ex(ClosureDiffuse diffuse,
ClosureReflection reflection,
const bool is_directional,
vec3 P,
vec3 Ng,
vec3 V,
float vP_z,
float thickness,
@@ -36,17 +34,17 @@ void light_eval_ex(ClosureDiffuse diffuse,
float visibility = light_attenuation(light, L, dist);
if (light.tilemap_index != LIGHT_NO_SHADOW && (visibility > 0.0)) {
#if 0 /* TODO(fclem): Shadows */
if ((light.shadow_id != LIGHT_NO_SHADOW) && (visibility > 0.0)) {
vec3 lL = light_world_to_local(light, -L) * dist;
vec3 lNg = light_world_to_local(light, Ng);
ShadowSample samp = shadow_sample(
is_directional, shadow_atlas_tx, shadow_tilemaps_tx, light, lL, lNg, P);
float shadow_delta = shadow_delta_get(
shadow_atlas_tx, shadow_tilemaps_tx, light, light.shadow_data, lL, dist, P);
#ifdef SSS_TRANSMITTANCE
/* Transmittance evaluation first to use initial visibility without shadow. */
# ifdef SSS_TRANSMITTANCE
/* Transmittance evaluation first to use initial visibility. */
if (diffuse.sss_id != 0u && light.diffuse_power > 0.0) {
float delta = max(thickness, samp.occluder_delta + samp.bias);
float delta = max(thickness, shadow_delta);
vec3 intensity = visibility * light.transmit_power *
light_translucent(sss_transmittance_tx,
@@ -59,9 +57,11 @@ void light_eval_ex(ClosureDiffuse diffuse,
delta);
out_diffuse += light.color * intensity;
}
#endif
visibility *= float(samp.occluder_delta + samp.bias >= 0.0);
# endif
visibility *= float(shadow_delta - light.shadow_data.bias <= 0.0);
}
#endif
if (visibility < 1e-6) {
return;
@@ -84,7 +84,6 @@ void light_eval_ex(ClosureDiffuse diffuse,
void light_eval(ClosureDiffuse diffuse,
ClosureReflection reflection,
vec3 P,
vec3 Ng,
vec3 V,
float vP_z,
float thickness,
@@ -101,7 +100,6 @@ void light_eval(ClosureDiffuse diffuse,
reflection,
true,
P,
Ng,
V,
vP_z,
thickness,
@@ -119,7 +117,6 @@ void light_eval(ClosureDiffuse diffuse,
reflection,
false,
P,
Ng,
V,
vP_z,
thickness,

View File

@@ -9,8 +9,7 @@
void light_vector_get(LightData ld, vec3 P, out vec3 L, out float dist)
{
/* TODO(fclem): Static branching. */
if (is_sun_light(ld.type)) {
if (ld.type == LIGHT_SUN) {
L = ld._back;
dist = 1.0;
}
@@ -58,13 +57,10 @@ float light_attenuation(LightData ld, vec3 L, float dist)
if (ld.type == LIGHT_SPOT) {
vis *= light_spot_attenuation(ld, L);
}
if (ld.type >= LIGHT_SPOT) {
vis *= step(0.0, -dot(L, -ld._back));
}
/* TODO(fclem): Static branching. */
if (!is_sun_light(ld.type)) {
if (ld.type != LIGHT_SUN) {
#ifdef VOLUME_LIGHTING
vis *= light_influence_attenuation(dist, ld.influence_radius_invsqr_volume);
#else

View File

@@ -1,196 +0,0 @@
/**
* Debug drawing for virtual shadowmaps.
* See eShadowDebug for more information.
*/
#pragma BLENDER_REQUIRE(common_debug_print_lib.glsl)
#pragma BLENDER_REQUIRE(common_view_lib.glsl)
#pragma BLENDER_REQUIRE(common_math_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_light_iter_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_light_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_shadow_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_sampling_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_shadow_tilemap_lib.glsl)
/** Control the scaling of the tilemap splat. */
const float pixel_scale = 4.0;
vec3 debug_random_color(ivec2 v)
{
float r = interlieved_gradient_noise(vec2(v), 0.0, 0.0);
return hue_gradient(r);
}
vec3 debug_random_color(int v)
{
return debug_random_color(ivec2(v, 0));
}
void debug_tile_print(ShadowTileData tile, ivec4 tile_coord)
{
drw_print("Tile (", tile_coord.x, ",", tile_coord.y, ") in Tilemap ", tile_coord.z, " : ");
drw_print(tile.lod);
drw_print(tile.page);
drw_print(tile.cache_index);
}
vec3 debug_tile_state_color(ShadowTileData tile)
{
if (tile.lod > 0) {
/* Uses data from another LOD. */
return neon_gradient(float(tile.lod) / float(SHADOW_TILEMAP_LOD));
}
if (tile.do_update && tile.is_used) {
/* Updated. */
return vec3(0.5, 1, 0);
}
if (tile.is_used) {
/* Used but was cached. */
return vec3(0, 1, 0);
}
vec3 col = vec3(0);
if (tile.is_cached) {
col += vec3(0.2, 0, 0.5);
if (tile.do_update) {
col += vec3(0.8, 0, 0);
}
}
return col;
}
ShadowSample debug_tile_get(vec3 P, LightData light)
{
vec3 lNg = vec3(1.0, 0.0, 0.0);
if (is_sun_light(light.type)) {
return shadow_directional_sample_get(shadow_atlas_tx, shadow_tilemaps_tx, light, P, lNg);
}
else {
vec3 lL = light_world_to_local(light, P - light._position);
return shadow_punctual_sample_get(shadow_atlas_tx, shadow_tilemaps_tx, light, lL, lNg);
}
}
LightData debug_light_get()
{
LIGHT_FOREACH_BEGIN_LOCAL_NO_CULL(light_cull_buf, l_idx)
{
LightData light = light_buf[l_idx];
if (light.tilemap_index == debug_tilemap_index) {
return light;
}
}
LIGHT_FOREACH_END
LIGHT_FOREACH_BEGIN_DIRECTIONAL(light_cull_buf, l_idx)
{
LightData light = light_buf[l_idx];
if (light.tilemap_index == debug_tilemap_index) {
return light;
}
}
LIGHT_FOREACH_END
}
/** Return true if a pixel was written. */
bool debug_tilemaps(vec3 P, LightData light)
{
const int debug_tile_size_px = 4;
ivec2 px = ivec2(gl_FragCoord.xy) / debug_tile_size_px;
int tilemap = px.x / SHADOW_TILEMAP_RES;
int tilemap_index = light.tilemap_index + tilemap;
if ((px.y < SHADOW_TILEMAP_RES) && (tilemap_index <= light_tilemap_max_get(light))) {
/* Debug actual values in the tilemap buffer. */
ShadowTileMapData tilemap = tilemaps_buf[tilemap_index];
int tile_index = shadow_tile_offset(px % SHADOW_TILEMAP_RES, tilemap.tiles_index, 0);
ShadowTileData tile = shadow_tile_unpack(tiles_buf[tile_index]);
/* Leave 1 px border between tilemaps. */
if (!any(
equal(ivec2(gl_FragCoord.xy) % (SHADOW_TILEMAP_RES * debug_tile_size_px), ivec2(0)))) {
gl_FragDepth = 0.0;
out_color_add = vec4(debug_tile_state_color(tile), 0.0);
out_color_mul = vec4(0.0);
if (ivec2(gl_FragCoord.xy) == ivec2(0)) {
drw_print(light.object_mat);
}
return true;
}
}
return false;
}
void debug_tile_state(vec3 P, LightData light)
{
ShadowSample samp = debug_tile_get(P, light);
out_color_add = vec4(debug_tile_state_color(samp.tile), 0) * 0.5;
out_color_mul = vec4(0.5);
}
void debug_atlas_values(vec3 P, LightData light)
{
ShadowSample samp = debug_tile_get(P, light);
out_color_add = vec4(vec3(samp.occluder_dist), 0);
out_color_mul = vec4(0.0);
}
void debug_random_tile_color(vec3 P, LightData light)
{
ShadowSample samp = debug_tile_get(P, light);
out_color_add = vec4(debug_random_color(ivec2(samp.tile.page)), 0) * 0.5;
out_color_mul = vec4(0.5);
}
void debug_random_tilemap_color(vec3 P, LightData light)
{
ShadowCoordinates coord;
if (is_sun_light(light.type)) {
vec3 lP = shadow_world_to_local(light, P);
coord = shadow_directional_coordinates(light, lP);
}
else {
vec3 lP = light_world_to_local(light, P - light._position);
int face_id = shadow_punctual_face_index_get(lP);
lP = shadow_punctual_local_position_to_face_local(face_id, lP);
coord = shadow_punctual_coordinates(light, lP, face_id);
}
out_color_add = vec4(debug_random_color(ivec2(coord.tilemap_index)), 0) * 0.5;
out_color_mul = vec4(0.5);
}
void main()
{
/* Default to no output. */
gl_FragDepth = 1.0;
out_color_add = vec4(0.0);
out_color_mul = vec4(1.0);
float depth = texelFetch(hiz_tx, ivec2(gl_FragCoord.xy), 0).r;
vec3 P = get_world_space_from_depth(uvcoordsvar.xy, depth);
/* Make it pass the depth test. */
gl_FragDepth = depth - 1e-6;
LightData light = debug_light_get();
if (debug_tilemaps(P, light)) {
return;
}
if (depth != 1.0) {
switch (eDebugMode(debug_mode)) {
case DEBUG_SHADOW_TILEMAPS:
debug_tile_state(P, light);
break;
case DEBUG_SHADOW_VALUES:
debug_atlas_values(P, light);
break;
case DEBUG_SHADOW_TILE_RANDOM_COLOR:
debug_random_tile_color(P, light);
break;
case DEBUG_SHADOW_TILEMAP_RANDOM_COLOR:
debug_random_tilemap_color(P, light);
break;
}
}
}

View File

@@ -1,220 +0,0 @@
#pragma BLENDER_REQUIRE(eevee_shadow_tilemap_lib.glsl)
/** \a unormalized_uv holds the uv coordinates for the whole tilemap [0..SHADOW_TILEMAP_RES]. */
vec2 shadow_page_uv_transform(
vec2 atlas_size, uvec2 page, uint lod, vec2 unormalized_uv, ivec2 tile_lod0_coord)
{
/* Bias uv sample for LODs since custom raster aligns LOD pixels instead of centering them. */
if (lod != 0) {
unormalized_uv += 0.5 / float(SHADOW_PAGE_RES * SHADOW_TILEMAP_RES);
}
float lod_scaling = exp2(-float(lod));
vec2 target_tile = vec2(tile_lod0_coord >> lod);
vec2 page_uv = unormalized_uv * lod_scaling - target_tile;
/* Assumes atlas is squared. */
vec2 atlas_uv = (vec2(page) + min(page_uv, 0.99999)) * float(SHADOW_PAGE_RES) / atlas_size;
return atlas_uv;
}
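/* Illustrative example: for lod = 1, lod_scaling = 0.5 and a tile_lod0_coord of (6, 3) gives
 * target_tile = (3, 1); the four LOD-0 tiles (6,2), (7,2), (6,3) and (7,3) thus resolve to the
 * same LOD-1 page before being remapped into the atlas. */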
/* Rotate vector to the light's local space. Used for directional shadows. */
vec3 shadow_world_to_local(LightData ld, vec3 L)
{
/* Avoid relying on compiler to optimize this.
* vec3 lL = transpose(mat3(ld.object_mat)) * L; */
vec3 lL;
lL.x = dot(ld.object_mat[0].xyz, L);
lL.y = dot(ld.object_mat[1].xyz, L);
lL.z = dot(ld.object_mat[2].xyz, L);
return lL;
}
/* TODO(fclem) use utildef version. */
float shadow_orderedIntBitsToFloat(int int_value)
{
return intBitsToFloat((int_value < 0) ? (int_value ^ 0x7FFFFFFF) : int_value);
}
/* ---------------------------------------------------------------------- */
/** \name Shadow Sampling Functions
* \{ */
/* Turns local light coordinate into shadow region index. Matches eCubeFace order.
* \note lL does not need to be normalized. */
int shadow_punctual_face_index_get(vec3 lL)
{
vec3 aP = abs(lL);
if (all(greaterThan(aP.xx, aP.yz))) {
return (lL.x > 0.0) ? 1 : 2;
}
else if (all(greaterThan(aP.yy, aP.xz))) {
return (lL.y > 0.0) ? 3 : 4;
}
else {
return (lL.z > 0.0) ? 5 : 0;
}
}
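/* Illustrative check: lL = vec3(0.2, -0.9, 0.1) gives abs(lL) = (0.2, 0.9, 0.1); Y dominates
 * and lL.y < 0.0, so this returns 4, i.e. Y_NEG in the eCubeFace order -Z, +X, -X, +Y, -Y, +Z. */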
mat4x4 shadow_load_normal_matrix(LightData light)
{
if (!is_sun_light(light.type)) {
/* FIXME: Why? */
float scale = 0.5;
return mat4x4(vec4(scale, 0.0, 0.0, 0.0),
vec4(0.0, scale, 0.0, 0.0),
vec4(0.0, 0.0, 0.0, -1.0),
vec4(0.0, 0.0, light.normal_mat_packed.x, light.normal_mat_packed.y));
}
else {
float near = shadow_orderedIntBitsToFloat(light.clip_near);
float far = shadow_orderedIntBitsToFloat(light.clip_far);
/* Could be stored precomputed inside the light struct. Just have to find out how to update it. */
float z_scale = (far - near) * 0.5;
return mat4x4(vec4(light.normal_mat_packed.x, 0.0, 0.0, 0.0),
vec4(0.0, light.normal_mat_packed.x, 0.0, 0.0),
vec4(0.0, 0.0, z_scale, 0.0),
vec4(0.0, 0.0, 0.0, 1.0));
}
}
/* Returns the minimum bias (in world space units) needed for a given geometry normal and a
* shadowmap page to avoid self-shadowing artifacts. Note that this can return a negative bias
* to better match the surface. */
float shadow_slope_bias_get(vec2 atlas_size, LightData light, vec3 lNg, vec3 lP, vec2 uv, uint lod)
{
/* Compute coordinate inside the pixel we are sampling. */
vec2 uv_subpixel_coord = fract(uv * atlas_size);
/* Bias uv sample for LODs since custom raster aligns LOD pixels instead of centering them. */
uv_subpixel_coord += (lod > 0) ? -exp2(-1.0 - float(lod)) : 0.0;
/* Compute delta to the texel center (where the sample is). */
vec2 ndc_texel_center_delta = uv_subpixel_coord * 2.0 - 1.0;
/* Create a normal plane equation and go through the normal projection matrix. */
vec4 lNg_plane = vec4(lNg, -dot(lNg, lP));
vec4 ndc_Ng = shadow_load_normal_matrix(light) * lNg_plane;
/* Get slope from normal vector. Note that this is signed. */
vec2 ndc_slope = ndc_Ng.xy / abs(ndc_Ng.z);
/* Clamp out to avoid the bias going to infinity. Remember this is in NDC space. */
ndc_slope = clamp(ndc_slope, -100.0, 100.0);
/* Compute slope to where the receiver should be by extending the plane to the texel center. */
float bias = dot(ndc_slope, ndc_texel_center_delta);
/* Bias for 1 pixel of the sampled LOD. */
bias /= ((SHADOW_TILEMAP_RES * SHADOW_PAGE_RES) >> lod);
return bias;
}
struct ShadowSample {
/* Signed delta in world units from the shading point to the occluder. Negative if occluded. */
float occluder_delta;
/* Tile coordinate inside the tilemap [0..SHADOW_TILEMAP_RES). */
ivec2 tile_coord;
/* UV coordinate inside the tilemap [0..SHADOW_TILEMAP_RES). */
vec2 uv;
/* Minimum slope bias to apply during comparison. */
float bias;
/* Distance from near clip plane in world space units. */
float occluder_dist;
/* Tile data loaded for page indirection. */
ShadowTileData tile;
};
float shadow_tile_depth_get(usampler2D atlas_tx, ShadowTileData tile, vec2 atlas_uv)
{
if (!tile.is_allocated) {
/* Far plane distance but with a bias to make sure there will be no shadowing.
* But also not FLT_MAX since it can cause issues with projection. */
return 1.1;
}
return uintBitsToFloat(texture(atlas_tx, atlas_uv).r);
}
vec2 shadow_punctual_linear_depth(vec2 z, float near, float far)
{
vec2 d = z * 2.0 - 1.0;
float z_delta = far - near;
/* Can we simplify? */
return ((-2.0 * near * far) / z_delta) / (d + (-(far + near) / z_delta));
}
float shadow_directional_linear_depth(float z, float near, float far)
{
return z * (near - far) - near;
}
ShadowSample shadow_punctual_sample_get(
usampler2D atlas_tx, usampler2D tilemaps_tx, LightData light, vec3 lP, vec3 lNg)
{
int face_id = shadow_punctual_face_index_get(lP);
lNg = shadow_punctual_local_position_to_face_local(face_id, lNg);
lP = shadow_punctual_local_position_to_face_local(face_id, lP);
ShadowCoordinates coord = shadow_punctual_coordinates(light, lP, face_id);
vec2 atlas_size = vec2(textureSize(atlas_tx, 0).xy);
ShadowSample samp;
samp.tile = shadow_tile_load(tilemaps_tx, coord.tile_coord, coord.tilemap_index);
samp.uv = shadow_page_uv_transform(
atlas_size, samp.tile.page, samp.tile.lod, coord.uv, coord.tile_coord);
samp.bias = shadow_slope_bias_get(atlas_size, light, lNg, lP, samp.uv, samp.tile.lod);
float occluder_ndc = shadow_tile_depth_get(atlas_tx, samp.tile, samp.uv);
/* NOTE: Given to be both positive, so can use intBitsToFloat instead of orderedInt version. */
float near = intBitsToFloat(light.clip_near);
float far = intBitsToFloat(light.clip_far);
/* Shadow is stored as gl_FragCoord.z. Convert to radial distance along with the bias. */
vec2 occluder = vec2(occluder_ndc, saturate(occluder_ndc + samp.bias));
vec2 occluder_z = shadow_punctual_linear_depth(occluder, near, far);
float receiver_dist = length(lP);
float radius_divisor = receiver_dist / abs(lP.z);
samp.occluder_dist = occluder_z.x * radius_divisor;
samp.bias = (occluder_z.y - occluder_z.x) * radius_divisor;
samp.occluder_delta = samp.occluder_dist - receiver_dist;
return samp;
}
ShadowSample shadow_directional_sample_get(
usampler2D atlas_tx, usampler2D tilemaps_tx, LightData light, vec3 P, vec3 lNg)
{
vec3 lP = shadow_world_to_local(light, P);
ShadowCoordinates coord = shadow_directional_coordinates(light, lP);
vec2 atlas_size = vec2(textureSize(atlas_tx, 0).xy);
ShadowSample samp;
samp.tile = shadow_tile_load(tilemaps_tx, coord.tile_coord, coord.tilemap_index);
samp.uv = shadow_page_uv_transform(
atlas_size, samp.tile.page, samp.tile.lod, coord.uv, coord.tile_coord);
samp.bias = shadow_slope_bias_get(atlas_size, light, lNg, lP, samp.uv, samp.tile.lod);
samp.bias *= exp2(float(coord.lod_relative));
float occluder_ndc = shadow_tile_depth_get(atlas_tx, samp.tile, samp.uv);
float near = shadow_orderedIntBitsToFloat(light.clip_near);
float far = shadow_orderedIntBitsToFloat(light.clip_far);
samp.occluder_dist = shadow_directional_linear_depth(occluder_ndc, near, far);
/* Receiver distance needs to also be increasing.
* Negate since Z distance follows blender camera convention of -Z as forward. */
float receiver_dist = -lP.z;
samp.bias *= near - far;
samp.occluder_delta = samp.occluder_dist - receiver_dist;
return samp;
}
ShadowSample shadow_sample(const bool is_directional,
usampler2D atlas_tx,
usampler2D tilemaps_tx,
LightData light,
vec3 lL,
vec3 lNg,
vec3 P)
{
if (is_directional) {
return shadow_directional_sample_get(atlas_tx, tilemaps_tx, light, P, lNg);
}
else {
return shadow_punctual_sample_get(atlas_tx, tilemaps_tx, light, lL, lNg);
}
}
/** \} */

View File

@@ -1,41 +0,0 @@
/**
* Virtual shadowmapping: Allocation.
*
* Allocates pages to tiles needing them.
* Note that allocation can fail; in that case the tile is left with no page.
*/
#pragma BLENDER_REQUIRE(eevee_shadow_page_ops_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_shadow_tilemap_lib.glsl)
void main()
{
ShadowTileMapData tilemap_data = tilemaps_buf[gl_GlobalInvocationID.z];
int tile_start = tilemap_data.tiles_index;
for (int lod = 0; lod <= SHADOW_TILEMAP_LOD; lod++) {
int lod_len = SHADOW_TILEMAP_LOD0_LEN >> (lod * 2);
int local_tile = int(gl_LocalInvocationID.x);
if (local_tile < lod_len) {
int tile_index = tile_start + local_tile;
ShadowTileData tile = shadow_tile_unpack(tiles_buf[tile_index]);
if (tile.is_used && !tile.is_allocated) {
shadow_page_alloc(tile);
tiles_buf[tile_index] = shadow_tile_pack(tile);
}
if (tile.is_used) {
atomicAdd(statistics_buf.page_used_count, 1);
}
if (tile.is_used && tile.do_update) {
atomicAdd(statistics_buf.page_update_count, 1);
}
if (tile.is_allocated) {
atomicAdd(statistics_buf.page_allocated_count, 1);
}
}
tile_start += lod_len;
}
}
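The loop above walks a packed tile array where each LOD level holds a quarter of the tiles of the previous one (half the resolution in X and Y). A small C sketch of that indexing; the constant values are placeholders, not the real build-time values:

#include <stdio.h>

/* Placeholder values for illustration only; the real constants come from the
 * shader compilation environment. */
#define SHADOW_TILEMAP_LOD 5
#define SHADOW_TILEMAP_LOD0_LEN (32 * 32)

int main(void)
{
  int tile_start = 0;
  for (int lod = 0; lod <= SHADOW_TILEMAP_LOD; lod++) {
    /* Each LOD level halves the resolution in X and Y, so it holds a
     * quarter of the tiles of the previous level. */
    int lod_len = SHADOW_TILEMAP_LOD0_LEN >> (lod * 2);
    printf("lod %d: %4d tiles at offset %d\n", lod, lod_len, tile_start);
    tile_start += lod_len;
  }
  return 0;
}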

View File

@@ -1,16 +0,0 @@
/**
* Virtual shadowmapping: Page Clear.
*
* Equivalent to a framebuffer depth clear but only for pages pushed to the clear_page_buf.
*/
#pragma BLENDER_REQUIRE(common_math_lib.glsl)
void main()
{
uvec2 page_co = unpackUvec2x16(clear_page_buf[gl_GlobalInvocationID.z]);
uvec2 page_texel = page_co * pages_infos_buf.page_size + gl_GlobalInvocationID.xy;
imageStore(atlas_img, ivec2(page_texel), uvec4(floatBitsToUint(1.0)));
}
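The clear pass addresses the atlas through a packed page coordinate. A C sketch of the assumed pack/unpack scheme (two 16-bit coordinates in one 32-bit word, mirroring packUvec2x16 / unpackUvec2x16) and the texel addressing, with an assumed page size:

#include <stdint.h>
#include <stdio.h>

/* Assumed low/high 16-bit split, mirroring packUvec2x16 / unpackUvec2x16. */
static uint32_t pack_uvec2x16(uint32_t x, uint32_t y)
{
  return (x & 0xFFFFu) | (y << 16);
}

static void unpack_uvec2x16(uint32_t packed, uint32_t *x, uint32_t *y)
{
  *x = packed & 0xFFFFu;
  *y = packed >> 16;
}

int main(void)
{
  const uint32_t page_size = 256u; /* Assumed page resolution. */
  uint32_t px, py;
  unpack_uvec2x16(pack_uvec2x16(3u, 5u), &px, &py);
  /* First texel of the page in the atlas; each thread then adds its
   * local invocation id, as in the shader above. */
  printf("page (%u, %u) starts at atlas texel (%u, %u)\n",
         (unsigned)px, (unsigned)py, (unsigned)(px * page_size), (unsigned)(py * page_size));
  return 0;
}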

View File

@@ -1,129 +0,0 @@
/**
* Virtual shadowmapping: Defrag.
*
* Defragment the cached page buffer, making one contiguous array.
*
* Also pop_front the cached pages if there are not enough free pages for the needed allocations.
* Here is an example of the behavior of this buffer during one update cycle:
*
* Initial state: 5 cached pages. Buffer starts at index 2 and ends at 6.
* [--xxxxx---------]
* After page free step: 2 cached pages were removed (r), 3 pages were inserted in the cache (i).
* [--xrxrxiii------]
* After page defrag step: The buffer is compressed into only 6 pages.
* [----xxxxxx------]
*/
#pragma BLENDER_REQUIRE(eevee_shadow_page_ops_lib.glsl)
const uint max_page = SHADOW_MAX_PAGE;
void find_first_valid(inout uint src, uint dst)
{
for (; src < dst; src++) {
if (pages_cached_buf[src % max_page].x != uint(-1)) {
return;
}
}
}
void page_cached_free(uint page_index)
{
uint tile_index = pages_cached_buf[page_index].y;
ShadowTileData tile = shadow_tile_unpack(tiles_buf[tile_index]);
shadow_page_cache_remove(tile);
shadow_page_free(tile);
tiles_buf[tile_index] = shadow_tile_pack(tile);
}
#if 0 /* Can be used to debug heap and invalid pages inside the free buffer. */
# define check_heap_integrity(heap, start, size, invalid_val, result) \
result = true; \
for (int i = 0; i < max_page; i++) { \
if ((i >= start) && (i < (start + size))) { \
result = result && (heap[i].x != invalid_val); \
} \
else { \
result = result && (heap[i].x == invalid_val); \
} \
}
#else
# define check_heap_integrity(heap, start, size, invalid_val, result)
#endif
void main()
{
/* Pages we need to get off the cache for the allocation pass. */
int additional_pages = pages_infos_buf.page_alloc_count - pages_infos_buf.page_free_count;
uint src = pages_infos_buf.page_cached_start;
uint end = pages_infos_buf.page_cached_end;
find_first_valid(src, end);
bool valid_pre;
check_heap_integrity(pages_free_buf, 0, pages_infos_buf.page_free_count, uint(-1), valid_pre);
/* First free as many pages as needed from the front of the cached range (the oldest entries)
 * to fulfill the allocation. This avoids defragmenting pages only to free them afterwards. */
for (; additional_pages > 0 && src < end; additional_pages--) {
page_cached_free(src % max_page);
find_first_valid(src, end);
}
/* Defrag pages in the "old" range. */
bool is_empty = (src == end);
if (!is_empty) {
/* `page_cached_end` refers to the next empty slot.
* Decrement by one to refer to the first slot we can defrag. */
for (uint dst = end - 1; dst > src; dst--) {
/* Find hole. */
if (pages_cached_buf[dst % max_page].x != uint(-1)) {
continue;
}
/* Update corresponding reference in tile. */
shadow_page_cache_update_page_ref(src % max_page, dst % max_page);
/* Move page. */
pages_cached_buf[dst % max_page] = pages_cached_buf[src % max_page];
pages_cached_buf[src % max_page] = uvec2(-1);
find_first_valid(src, dst);
}
}
end = pages_infos_buf.page_cached_next;
/* Free pages in the "new" range (these are compact). */
for (; additional_pages > 0 && src < end; additional_pages--, src++) {
page_cached_free(src % max_page);
}
bool valid_post;
check_heap_integrity(pages_free_buf, 0, pages_infos_buf.page_free_count, uint(-1), valid_post);
pages_infos_buf.page_cached_start = src;
pages_infos_buf.page_cached_end = end;
pages_infos_buf.page_alloc_count = 0;
pages_infos_buf.view_count = 0;
/* Stats. */
statistics_buf.page_used_count = 0;
statistics_buf.page_update_count = 0;
statistics_buf.page_allocated_count = 0;
statistics_buf.page_rendered_count = 0;
/* Wrap the cursor to avoid unsigned overflow. We do not do modulo arithmetic because it would
* produce a 0 length buffer if the buffer is full. */
if (pages_infos_buf.page_cached_start > max_page) {
pages_infos_buf.page_cached_next -= max_page;
pages_infos_buf.page_cached_start -= max_page;
pages_infos_buf.page_cached_end -= max_page;
}
/* Reset clear command indirect buffer. */
clear_dispatch_buf.num_groups_x = pages_infos_buf.page_size / SHADOW_PAGE_CLEAR_GROUP_SIZE;
clear_dispatch_buf.num_groups_y = pages_infos_buf.page_size / SHADOW_PAGE_CLEAR_GROUP_SIZE;
clear_dispatch_buf.num_groups_z = 0;
}
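A tiny C demonstration of the cursor-wrapping note near the end of main() above, with an assumed ring size: when the ring is full, start and end differ by exactly max_page, so reducing both cursors modulo max_page would collapse the range to zero length, while subtracting max_page from every cursor preserves it.

#include <stdio.h>

#define MAX_PAGE 16u /* Assumed ring size. */

int main(void)
{
  /* Full ring: 16 cached pages; both cursors have advanced past MAX_PAGE. */
  unsigned int start = 18u, end = 34u;
  printf("length with subtraction: %u\n", (end - MAX_PAGE) - (start - MAX_PAGE)); /* 16 */
  printf("length with modulo:      %u\n", (end % MAX_PAGE) - (start % MAX_PAGE)); /* 0 */
  return 0;
}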

View File

@@ -1,54 +0,0 @@
/**
* Virtual shadowmapping: Tile page freeing.
*
* Releases the allocated pages held by tilemaps that have become unused.
* Also reclaims cached pages if the tiles need them.
* Note that we also count the number of new page allocations needed.
*/
#pragma BLENDER_REQUIRE(eevee_shadow_page_ops_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_shadow_tilemap_lib.glsl)
void main()
{
ShadowTileMapData tilemap_data = tilemaps_buf[gl_GlobalInvocationID.z];
int tile_start = tilemap_data.tiles_index;
for (int lod = 0; lod <= SHADOW_TILEMAP_LOD; lod++) {
int lod_len = SHADOW_TILEMAP_LOD0_LEN >> (lod * 2);
int local_tile = int(gl_LocalInvocationID.x);
if (local_tile < lod_len) {
int tile_index = tile_start + local_tile;
ShadowTileData tile = shadow_tile_unpack(tiles_buf[tile_index]);
bool is_orphaned = !tile.is_used && tile.do_update;
if (is_orphaned) {
if (tile.is_cached) {
shadow_page_cache_remove(tile);
}
if (tile.is_allocated) {
shadow_page_free(tile);
}
}
if (tile.is_used) {
if (tile.is_cached) {
shadow_page_cache_remove(tile);
}
if (!tile.is_allocated) {
atomicAdd(pages_infos_buf.page_alloc_count, 1);
}
}
else {
if (tile.is_allocated) {
shadow_page_cache_append(tile, tile_index);
}
}
tiles_buf[tile_index] = shadow_tile_pack(tile);
}
tile_start += lod_len;
}
}

View File

@@ -1,55 +0,0 @@
/**
* Virtual shadowmapping: Usage un-tagging
*
* Remove used tag from masked tiles (LOD overlap).
*/
#pragma BLENDER_REQUIRE(eevee_shadow_tilemap_lib.glsl)
shared uint usage_grid[SHADOW_TILEMAP_RES / 2][SHADOW_TILEMAP_RES / 2];
void main()
{
uint tilemap_index = gl_GlobalInvocationID.z;
ShadowTileMapData tilemap = tilemaps_buf[tilemap_index];
if (tilemap.projection_type == SHADOW_PROJECTION_CUBEFACE) {
/* For each level, collect the number of used (or masked) tiles covering the tile from
 * the level underneath. If this adds up to 4, the underlying tile is flagged as unused
 * since its data is not needed for rendering.
 *
 * This is needed because 2 receivers can tag the same area of the shadowmap as used but
 * with different LODs. */
bool is_used = false;
ivec2 tile_co = ivec2(gl_GlobalInvocationID.xy);
uint lod_size = uint(SHADOW_TILEMAP_RES);
for (int lod = 0; lod <= SHADOW_TILEMAP_LOD; lod++, lod_size >>= 1u) {
bool thread_active = all(lessThan(tile_co, ivec2(lod_size)));
barrier();
ShadowTileData tile;
if (thread_active) {
int tile_offset = shadow_tile_offset(tile_co, tilemap.tiles_index, lod);
tile = shadow_tile_unpack(tiles_buf[tile_offset]);
if (lod > 0 && usage_grid[tile_co.y][tile_co.x] == 4u) {
/* Remove the usage flag as this tile is completely covered by higher LOD tiles. */
tiles_buf[tile_offset] &= ~SHADOW_IS_USED;
/* Consider this tile occluding lower levels. */
tile.is_used = true;
}
/* Reset count for next level. */
usage_grid[tile_co.y][tile_co.x] = 0u;
}
barrier();
if (thread_active) {
if (tile.is_used) {
atomicAdd(usage_grid[tile_co.y / 2][tile_co.x / 2], 1u);
}
}
}
}
}
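A single-threaded C model of the coverage rule described above, on an assumed 4x4 LOD0 grid: a coarser tile can be untagged once all 4 of its finer children are tagged as used.

#include <stdbool.h>
#include <stdio.h>

#define RES 4 /* Assumed tiny tilemap resolution for illustration. */

int main(void)
{
  /* LOD0 usage: 1 = tile tagged used by some receiver. */
  bool used_lod0[RES][RES] = {
      {1, 1, 0, 0},
      {1, 1, 0, 1},
      {0, 0, 0, 0},
      {0, 0, 0, 0},
  };
  /* Count, for each LOD1 tile, how many of its 4 LOD0 children are used.
   * A count of 4 means the LOD1 tile is fully covered by finer tiles and
   * its own usage tag can be removed. */
  for (int y = 0; y < RES / 2; y++) {
    for (int x = 0; x < RES / 2; x++) {
      int count = used_lod0[y * 2][x * 2] + used_lod0[y * 2][x * 2 + 1] +
                  used_lod0[y * 2 + 1][x * 2] + used_lod0[y * 2 + 1][x * 2 + 1];
      printf("LOD1 tile (%d,%d): %d children used%s\n", x, y, count,
             (count == 4) ? " -> untag LOD1 tile" : "");
    }
  }
  return 0;
}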

View File

@@ -1,108 +0,0 @@
/**
* Operations to move virtual shadow map pages between heaps and tiles.
* We reuse the blender::vector class naming.
*
* The needed resources for this lib are:
* - tiles_buf
* - pages_free_buf
* - pages_cached_buf
* - pages_infos_buf
*
* A page can be in 3 states (free, cached, acquired). Each one corresponds to a different owner.
*
* - The pages_free_buf works as a regular stack containing only the page coordinates.
*
* - The pages_cached_buf is a ring buffer where newly cached pages get added at the end and the
*   old cached pages get defragmented at the start of the used portion.
*
* - The tiles_buf only owns a page if it is used. If the page is cached, the tile contains a
* reference index inside the pages_cached_buf.
*
* IMPORTANT: Do not forget to manually store the tile data after doing operations on them.
*/
#pragma BLENDER_REQUIRE(eevee_shadow_tilemap_lib.glsl)
/* TODO(@fclem): Implement. */
#define assert(check)
/* Remove page ownership from the tile and append the page to the free heap. */
void shadow_page_free(inout ShadowTileData tile)
{
assert(tile.is_allocated);
int index = atomicAdd(pages_infos_buf.page_free_count, 1);
assert(index < SHADOW_MAX_PAGE);
/* Insert in heap. */
pages_free_buf[index] = packUvec2x16(tile.page);
/* Remove from tile. */
tile.page = uvec2(-1);
tile.is_cached = false;
tile.is_allocated = false;
}
/* Remove last page from the free heap and give ownership to the tile. */
void shadow_page_alloc(inout ShadowTileData tile)
{
assert(!tile.is_allocated);
int index = atomicAdd(pages_infos_buf.page_free_count, -1) - 1;
/* This can easily happen in really big scenes. */
if (index < 0) {
return;
}
/* Insert in tile. */
tile.page = unpackUvec2x16(pages_free_buf[index]);
tile.is_allocated = true;
tile.do_update = true;
/* Remove from heap. */
pages_free_buf[index] = uint(-1);
}
/* Remove page ownership from the tile and append the page to the cache. */
void shadow_page_cache_append(inout ShadowTileData tile, uint tile_index)
{
assert(tile.is_allocated);
/* The page_cached_next is also wrapped in the defrag phase to avoid unsigned overflow. */
uint index = atomicAdd(pages_infos_buf.page_cached_next, 1u) % uint(SHADOW_MAX_PAGE);
/* Insert in heap. */
pages_cached_buf[index] = uvec2(packUvec2x16(tile.page), tile_index);
/* Remove from tile. */
tile.page = uvec2(-1);
tile.cache_index = index;
tile.is_cached = true;
tile.is_allocated = false;
}
/* Remove page from cache and give ownership to the tile. */
void shadow_page_cache_remove(inout ShadowTileData tile)
{
assert(!tile.is_allocated);
assert(tile.is_cached);
uint index = tile.cache_index;
/* Insert in tile. */
tile.page = unpackUvec2x16(pages_cached_buf[index].x);
tile.cache_index = uint(-1);
tile.is_cached = false;
tile.is_allocated = true;
/* Remove from heap. Leaves hole in the buffer. This is handled by the defrag phase. */
pages_cached_buf[index] = uvec2(-1);
}
/* Update cached page reference when a cached page moves inside the cached page buffer. */
void shadow_page_cache_update_page_ref(uint page_index, uint new_page_index)
{
uint tile_index = pages_cached_buf[page_index].y;
ShadowTileData tile = shadow_tile_unpack(tiles_buf[tile_index]);
tile.cache_index = new_page_index;
tiles_buf[tile_index] = shadow_tile_pack(tile);
}
/* Update cached page reference when a tile referencing a cached page moves inside the tilemap. */
void shadow_page_cache_update_tile_ref(uint page_index, uint new_tile_index)
{
pages_cached_buf[page_index].y = new_tile_index;
}
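A single-threaded C model of the free-stack discipline described above; the GLSL relies on atomics on page_free_count, which this sketch elides, and the pool size is an assumed placeholder.

#include <stdint.h>
#include <stdio.h>

#define MAX_PAGE 4 /* Assumed tiny pool. */

static uint32_t pages_free_buf[MAX_PAGE];
static int page_free_count = 0;

/* Push a page back onto the free stack (cf. shadow_page_free). */
static void page_free(uint32_t page)
{
  pages_free_buf[page_free_count++] = page;
}

/* Pop the last page from the free stack (cf. shadow_page_alloc). Returns
 * UINT32_MAX when the pool is exhausted, mirroring the "allocation can
 * fail" note in the allocation shader. */
static uint32_t page_alloc(void)
{
  if (page_free_count <= 0) {
    return UINT32_MAX;
  }
  uint32_t page = pages_free_buf[--page_free_count];
  pages_free_buf[page_free_count] = UINT32_MAX;
  return page;
}

int main(void)
{
  for (uint32_t p = 0; p < MAX_PAGE; p++) {
    page_free(p);
  }
  for (int i = 0; i < MAX_PAGE + 1; i++) {
    printf("alloc -> %u\n", (unsigned)page_alloc()); /* The last one fails. */
  }
  return 0;
}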

View File

@@ -1,94 +0,0 @@
/**
* Virtual shadowmapping: Update tagging
*
* Any updated shadow caster needs to tag the shadow map tiles it was in and is now in.
* This is done in 2 passes of this same shader: one for the past object bounds and one for the
* new object bounds. The bounding boxes are roughly software rasterized (just a plain rect) in
* order to tag the appropriate tiles.
*/
#pragma BLENDER_REQUIRE(common_intersect_lib.glsl)
#pragma BLENDER_REQUIRE(common_view_lib.glsl)
#pragma BLENDER_REQUIRE(common_aabb_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_shadow_tilemap_lib.glsl)
vec3 safe_project(mat4 winmat, mat4 viewmat, inout int clipped, vec3 v)
{
vec4 tmp = winmat * (viewmat * vec4(v, 1.0));
/* Detect case when point is behind the camera. */
clipped += int(tmp.w < 0.0);
return tmp.xyz / tmp.w;
}
void main()
{
ShadowTileMapData tilemap = tilemaps_buf[gl_GlobalInvocationID.z];
IsectPyramid frustum;
if (tilemap.projection_type == SHADOW_PROJECTION_CUBEFACE) {
Pyramid pyramid = shadow_tilemap_cubeface_bounds(tilemap, ivec2(0), ivec2(SHADOW_TILEMAP_RES));
frustum = isect_data_setup(pyramid);
}
uint resource_id = resource_ids_buf[gl_GlobalInvocationID.x];
IsectBox box = isect_data_setup(bounds_buf[resource_id].bounding_corners[0].xyz,
bounds_buf[resource_id].bounding_corners[1].xyz,
bounds_buf[resource_id].bounding_corners[2].xyz,
bounds_buf[resource_id].bounding_corners[3].xyz);
int clipped = 0;
/* NDC space post projection [-1..1] (unclamped). */
AABB aabb_ndc = aabb_init_min_max();
for (int v = 0; v < 8; v++) {
aabb_merge(aabb_ndc, safe_project(tilemap.winmat, tilemap.viewmat, clipped, box.corners[v]));
}
if (tilemap.projection_type == SHADOW_PROJECTION_CUBEFACE) {
if (clipped == 8) {
/* All verts are behind the camera. */
return;
}
else if (clipped > 0) {
/* Some but not all verts are behind the camera. */
if (intersect(frustum, box)) {
/* We cannot correctly handle this case, so fall back to covering the whole view. */
aabb_ndc.max = vec3(1.0);
aabb_ndc.min = vec3(-1.0);
}
else {
/* Still out of the frustum. Ignore. */
return;
}
}
}
AABB aabb_tag;
AABB aabb_map = AABB(vec3(-0.99999), vec3(0.99999));
/* Directional winmat has no correct near/far in the Z dimension at this point.
 * Do not clip in this dimension. */
if (tilemap.projection_type != SHADOW_PROJECTION_CUBEFACE) {
aabb_map.min.z = -FLT_MAX;
aabb_map.max.z = FLT_MAX;
}
if (!aabb_clip(aabb_map, aabb_ndc, aabb_tag)) {
return;
}
/* Rasterize the bounding rectangle of the box projection. */
const float tilemap_half_res = float(SHADOW_TILEMAP_RES / 2);
ivec2 box_min = ivec2(aabb_tag.min.xy * tilemap_half_res + tilemap_half_res);
ivec2 box_max = ivec2(aabb_tag.max.xy * tilemap_half_res + tilemap_half_res);
for (int lod = 0; lod <= SHADOW_TILEMAP_LOD; lod++, box_min >>= 1, box_max >>= 1) {
for (int y = box_min.y; y <= box_max.y; y++) {
for (int x = box_min.x; x <= box_max.x; x++) {
int tile_index = shadow_tile_offset(ivec2(x, y), tilemap.tiles_index, lod);
atomicOr(tiles_buf[tile_index], SHADOW_DO_UPDATE);
}
}
}
}
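A short C sketch of the NDC-to-tile mapping used by the software rasterization above, with an assumed tilemap resolution: NDC [-1..1] maps to tile coordinates [0..RES], and one right-shift per LOD keeps the rectangle over the same region of the halved grid.

#include <stdio.h>

#define SHADOW_TILEMAP_RES 16 /* Assumed resolution for illustration. */

int main(void)
{
  /* NDC bounding rectangle in [-1..1] (already clipped). */
  float min_x = -0.30f, max_x = 0.55f;
  const float half_res = (float)(SHADOW_TILEMAP_RES / 2);
  /* Map [-1..1] to [0..RES] tile coordinates. */
  int box_min = (int)(min_x * half_res + half_res);
  int box_max = (int)(max_x * half_res + half_res);
  for (int lod = 0; lod <= 2; lod++, box_min >>= 1, box_max >>= 1) {
    /* Each LOD halves the tile grid, so one right-shift per level keeps
     * the rectangle over the same region. */
    printf("lod %d: tag tiles x in [%d..%d]\n", lod, box_min, box_max);
  }
  return 0;
}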

View File

@@ -1,32 +0,0 @@
/**
* Virtual shadowmapping: Usage tagging
*
* Shadow pages are only allocated if they are visible.
* This pass scans the depth buffer and tags all tiles that are needed for light shadowing.
*/
#pragma BLENDER_REQUIRE(eevee_shadow_tag_usage_lib.glsl)
void main()
{
ivec2 texel = ivec2(gl_GlobalInvocationID.xy);
ivec2 tex_size = textureSize(depth_tx, 0).xy;
if (!in_range_inclusive(texel, ivec2(0), ivec2(tex_size - 1))) {
return;
}
float depth = texelFetch(depth_tx, texel, 0).r;
if (depth == 1.0) {
return;
}
vec2 uv = vec2(texel) / vec2(tex_size);
vec3 vP = get_view_space_from_depth(uv, depth);
vec3 P = transform_point(ViewMatrixInverse, vP);
vec2 pixel = vec2(gl_GlobalInvocationID.xy);
shadow_tag_usage(vP, P, pixel);
}

View File

@@ -1,15 +0,0 @@
/**
* Virtual shadowmapping: Usage tagging
*
* Shadow pages are only allocated if they are visible.
* This pass scans the depth buffer and tags all tiles that are needed for light shadowing.
*/
#pragma BLENDER_REQUIRE(eevee_shadow_tag_usage_lib.glsl)
void main()
{
shadow_tag_usage(interp.vP, interp.P, gl_FragCoord.xy);
}

View File

@@ -1,106 +0,0 @@
/**
* Virtual shadowmapping: Usage tagging
*
* Shadow pages are only allocated if they are visible.
* This pass scans the depth buffer and tags all tiles that are needed for light shadowing.
*/
#pragma BLENDER_REQUIRE(common_intersect_lib.glsl)
#pragma BLENDER_REQUIRE(common_math_geom_lib.glsl)
#pragma BLENDER_REQUIRE(common_view_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_light_iter_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_light_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_shadow_lib.glsl)
void shadow_tag_usage_tilemap(uint l_idx, vec3 P, float dist_to_cam, const bool is_directional)
{
LightData light = light_buf[l_idx];
if (light.tilemap_index == LIGHT_NO_SHADOW) {
return;
}
int lod = 0;
ivec2 tile_co;
int tilemap_index = light.tilemap_index;
if (is_directional) {
vec3 lP = shadow_world_to_local(light, P);
ShadowCoordinates coord = shadow_directional_coordinates(light, lP);
tile_co = coord.tile_coord;
tilemap_index = coord.tilemap_index;
}
else {
vec3 lP = light_world_to_local(light, P - light._position);
float dist_to_light = length(lP);
if (dist_to_light > light.influence_radius_max) {
return;
}
if (light.type == LIGHT_SPOT) {
/* Early out if out of cone. */
float angle_tan = length(lP.xy / dist_to_light);
if (angle_tan > light.spot_tan) {
return;
}
}
else if (is_area_light(light.type)) {
/* Early out if on the wrong side. */
if (lP.z > 0.0) {
return;
}
}
/* How much a shadow map pixel covers a final image pixel.
* We project a shadow map pixel (as a sphere for simplicity) to the receiver plane.
* We then reproject this sphere onto the camera screen and compare it to the film pixel size.
* This gives a good approximation of what LOD to select to get a somewhat uniform shadow map
* resolution in screen space. */
float footprint_ratio = dist_to_light;
/* Project the radius to the screen, 1 unit away from the camera, the same way
 * pixel_world_radius_inv was computed. Not needed in orthographic mode. */
bool is_persp = (ProjectionMatrix[3][3] == 0.0);
if (is_persp) {
footprint_ratio /= dist_to_cam;
}
/* Apply resolution ratio. */
footprint_ratio *= tilemap_projection_ratio;
int face_id = shadow_punctual_face_index_get(lP);
lP = shadow_punctual_local_position_to_face_local(face_id, lP);
ShadowCoordinates coord = shadow_punctual_coordinates(light, lP, face_id);
tile_co = coord.tile_coord;
tilemap_index = coord.tilemap_index;
lod = int(ceil(-log2(footprint_ratio) + tilemaps_buf[tilemap_index].lod_bias));
lod = clamp(lod, 0, SHADOW_TILEMAP_LOD);
}
tile_co >>= lod;
if (tilemap_index > light_tilemap_max_get(light)) {
return;
}
int tile_index = shadow_tile_offset(tile_co, tilemaps_buf[tilemap_index].tiles_index, lod);
atomicOr(tiles_buf[tile_index], SHADOW_IS_USED);
}
void shadow_tag_usage(vec3 vP, vec3 P, vec2 pixel)
{
float dist_to_cam = length(vP);
LIGHT_FOREACH_BEGIN_DIRECTIONAL(light_cull_buf, l_idx)
{
shadow_tag_usage_tilemap(l_idx, P, dist_to_cam, true);
}
LIGHT_FOREACH_END
LIGHT_FOREACH_BEGIN_LOCAL(light_cull_buf, light_zbin_buf, light_tile_buf, pixel, vP.z, l_idx)
{
shadow_tag_usage_tilemap(l_idx, P, dist_to_cam, false);
}
LIGHT_FOREACH_END
}
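A worked C sketch of the LOD selection math above, with assumed constants: halving the footprint ratio raises the selected LOD by one, and the result is clamped to the valid range.

#include <math.h>
#include <stdio.h>

#define SHADOW_TILEMAP_LOD 5 /* Assumed maximum LOD. */

static int clampi(int v, int lo, int hi)
{
  return v < lo ? lo : (v > hi ? hi : v);
}

int main(void)
{
  float lod_bias = 0.0f;
  /* Small footprint ratios mean a shadow texel is already small relative
   * to a film pixel, so a coarser (higher) LOD suffices. */
  for (float ratio = 1.0f; ratio >= 1.0f / 64.0f; ratio *= 0.5f) {
    int lod = clampi((int)ceilf(-log2f(ratio) + lod_bias), 0, SHADOW_TILEMAP_LOD);
    printf("footprint_ratio %.5f -> lod %d\n", ratio, lod);
  }
  return 0;
}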

View File

@@ -1,22 +0,0 @@
/**
* Virtual shadowmapping: Usage tagging
*
* Shadow pages are only allocated if they are visible.
* This renders bounding boxes for transparent objects in order to tag the correct shadows.
*/
#pragma BLENDER_REQUIRE(common_view_lib.glsl)
void main()
{
ObjectBounds bounds = bounds_buf[drw_ResourceID];
interp.P = bounds.bounding_corners[0].xyz;
interp.P += bounds.bounding_corners[1].xyz * pos.x;
interp.P += bounds.bounding_corners[2].xyz * pos.y;
interp.P += bounds.bounding_corners[3].xyz * pos.z;
interp.vP = point_world_to_view(interp.P);
gl_Position = point_world_to_ndc(interp.P);
}
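The vertex shader above expands a unit-cube vertex position into a world-space box corner by weighting three vectors. A hedged C sketch of the same weighting, assuming bounding_corners stores one base corner followed by three edge vectors (which is what the weighting implies):

#include <stdio.h>

typedef struct {
  float x, y, z;
} Vec3;

static Vec3 madd(Vec3 a, Vec3 b, float t)
{
  Vec3 r = {a.x + b.x * t, a.y + b.y * t, a.z + b.z * t};
  return r;
}

int main(void)
{
  /* corners[0] is the box origin; corners[1..3] are the edge vectors. */
  Vec3 corners[4] = {{1, 2, 3}, {2, 0, 0}, {0, 4, 0}, {0, 0, 1}};
  /* Unit-cube positions (pos in the shader) expand to the 8 box corners. */
  for (int i = 0; i < 8; i++) {
    float px = (float)(i & 1), py = (float)((i >> 1) & 1), pz = (float)((i >> 2) & 1);
    Vec3 P = corners[0];
    P = madd(P, corners[1], px);
    P = madd(P, corners[2], py);
    P = madd(P, corners[3], pz);
    printf("corner %d: (%g, %g, %g)\n", i, P.x, P.y, P.z);
  }
  return 0;
}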

View File

@@ -1,398 +0,0 @@
/* Directive for resetting the line numbering so the failing test lines can be printed.
 * This conflicts with the shader compiler's error logging scheme.
 * Comment it out to get correct compilation error lines. */
#line 5
#pragma BLENDER_REQUIRE(gpu_shader_utildefines_lib.glsl)
#pragma BLENDER_REQUIRE(gpu_shader_math_matrix_lib.glsl)
#pragma BLENDER_REQUIRE(gpu_shader_math_vector_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_shadow_lib.glsl)
#pragma BLENDER_REQUIRE(gpu_shader_test_lib.glsl)
#define TEST(a, b) if (true)
void main()
{
TEST(eevee_shadow, DirectionalClipmapLevel)
{
LightData light;
light.type = LIGHT_SUN;
light.clipmap_lod_min = -5;
light.clipmap_lod_max = 8;
light._clipmap_lod_bias = 0.0;
float fac = float(SHADOW_TILEMAP_RES - 1) / float(SHADOW_TILEMAP_RES);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 0.0)), light.clipmap_lod_min);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 0.49)), 1);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 0.5)), 1);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 0.51)), 1);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 0.99)), 2);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 1.0)), 2);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 1.01)), 2);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 12.5)), 6);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 12.51)), 6);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 15.9999)), 6);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 16.0)), 6);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 16.00001)), 6);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 5000.0)), light.clipmap_lod_max);
/* Produces NaN / Inf, Undefined behavior. */
// EXPECT_EQ(shadow_directional_level(light, vec3(FLT_MAX)), light.clipmap_lod_max);
}
TEST(eevee_shadow, DirectionalCascadeLevel)
{
LightData light;
light.type = LIGHT_SUN_ORTHO;
light.clipmap_lod_min = 2;
light.clipmap_lod_max = 8;
float half_size = exp2(float(light.clipmap_lod_min - 1));
light._clipmap_lod_bias = light.clipmap_lod_min - 1;
float fac = float(SHADOW_TILEMAP_RES - 1) / float(SHADOW_TILEMAP_RES);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * half_size * 0.0, 0.0, 0.0)), 2);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * half_size * 0.5, 0.0, 0.0)), 2);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * half_size * 1.0, 0.0, 0.0)), 3);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * half_size * 1.5, 0.0, 0.0)), 3);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * half_size * 2.0, 0.0, 0.0)), 4);
EXPECT_EQ(shadow_directional_level(light, vec3(fac * 5000.0)), light.clipmap_lod_max);
/* Produces NaN / Inf, Undefined behavior. */
// EXPECT_EQ(shadow_directional_level(light, vec3(FLT_MAX)), light.clipmap_lod_max);
}
TEST(eevee_shadow, DirectionalClipmapCoordinates)
{
ShadowCoordinates coords;
vec3 lP, camera_lP;
LightData light;
light.type = LIGHT_SUN;
light.clipmap_lod_min = 0; /* Range [-0.5..0.5]. */
light.clipmap_lod_max = 2; /* Range [-2..2]. */
light.tilemap_index = light.clipmap_lod_min;
light._position = vec3(0.0);
float lod_min_tile_size = exp2(float(light.clipmap_lod_min)) / float(SHADOW_TILEMAP_RES);
float lod_max_half_size = exp2(float(light.clipmap_lod_max)) / 2.0;
camera_lP = vec3(0.0, 0.0, 0.0);
/* Follows ShadowDirectional::end_sync(). */
light.clipmap_base_offset = ivec2(round(camera_lP.xy / lod_min_tile_size));
EXPECT_EQ(light.clipmap_base_offset, ivec2(0));
/* Test UVs and tile mapping. */
lP = vec3(1e-5, 1e-5, 0.0);
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tilemap_index, 0);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2));
EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES / 2), 1e-3);
lP = vec3(-1e-5, -1e-5, 0.0);
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tilemap_index, 0);
EXPECT_EQ(coords.tile_coord, ivec2((SHADOW_TILEMAP_RES / 2) - 1));
EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES / 2), 1e-3);
lP = vec3(-0.5, -0.5, 0.0); /* Min of first LOD. */
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tilemap_index, 0);
EXPECT_EQ(coords.tile_coord, ivec2(0));
EXPECT_NEAR(coords.uv, vec2(0), 1e-3);
lP = vec3(0.5, 0.5, 0.0); /* Max of first LOD. */
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tilemap_index, 0);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES - 1));
EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES), 1e-3);
/* Test clipmap level selection. */
camera_lP = vec3(2.0, 2.0, 0.0);
/* Follows ShadowDirectional::end_sync(). */
light.clipmap_base_offset = ivec2(round(camera_lP.xy / lod_min_tile_size));
EXPECT_EQ(light.clipmap_base_offset, ivec2(32));
lP = vec3(2.00001, 2.00001, 0.0);
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tilemap_index, 0);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2));
EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES / 2), 1e-3);
lP = vec3(1.50001, 1.50001, 0.0);
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tilemap_index, 1);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 4));
EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES / 4), 1e-3);
lP = vec3(1.00001, 1.00001, 0.0);
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tilemap_index, 2);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 4));
EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES / 4), 1e-3);
lP = vec3(-0.0001, -0.0001, 0.0); /* Out of bounds. */
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tilemap_index, 2);
EXPECT_EQ(coords.tile_coord, ivec2(0));
EXPECT_NEAR(coords.uv, vec2(0), 1e-3);
/* Test clipmap offset. */
light.clipmap_base_offset = ivec2(31, 1);
lP = vec3(2.0001, 0.0001, 0.0);
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2) + ivec2(1, -1));
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2) + ivec2(1, 0));
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2) + ivec2(1, 0));
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2) + ivec2(1, 0));
/* Test clipmap negative offsets. */
light.clipmap_base_offset = ivec2(-31, -1);
lP = vec3(-2.0001, -0.0001, 0.0);
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2 - 1) + ivec2(-1, 1));
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2 - 1) + ivec2(-1, 0));
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2 - 1) + ivec2(-1, 0));
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2 - 1) + ivec2(-1, 0));
}
TEST(eevee_shadow, DirectionalCascadeCoordinates)
{
ShadowCoordinates coords;
vec3 lP, camera_lP;
LightData light;
light.type = LIGHT_SUN_ORTHO;
light.clipmap_lod_min = 0; /* Range [-0.5..0.5]. */
light.clipmap_lod_max = 2; /* 3 tilemaps. */
light.tilemap_index = 1;
light._position = vec3(0.0);
light._clipmap_lod_bias = light.clipmap_lod_min - 1;
light._clipmap_origin_x = 0.0;
light._clipmap_origin_y = 0.0;
float lod_tile_size = exp2(float(light.clipmap_lod_min)) / float(SHADOW_TILEMAP_RES);
float lod_half_size = exp2(float(light.clipmap_lod_min)) / 2.0;
float narrowing = float(SHADOW_TILEMAP_RES - 1) / float(SHADOW_TILEMAP_RES);
camera_lP = vec3(0.0, 0.0, 0.0);
int level_range_size = light.clipmap_lod_max - light.clipmap_lod_min + 1;
vec2 farthest_tilemap_center = vec2(lod_half_size * float(level_range_size - 1), 0.0);
light.clipmap_base_offset = floatBitsToInt(
vec2(lod_half_size / float(level_range_size - 1), 0.0));
/* Test UVs and tile mapping. */
lP = vec3(1e-8, 1e-8, 0.0);
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tilemap_index, 1);
EXPECT_EQ(coords.lod_relative, 0);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2));
EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES / 2), 1e-3);
lP = vec3(lod_half_size * narrowing - 1e-5, 1e-8, 0.0);
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tilemap_index, 1);
EXPECT_EQ(coords.lod_relative, 0);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES - 1, SHADOW_TILEMAP_RES / 2));
EXPECT_NEAR(coords.uv, vec2(float(SHADOW_TILEMAP_RES) - 0.5, SHADOW_TILEMAP_RES / 2), 1e-3);
lP = vec3(lod_half_size + 1e-5, 1e-5, 0.0);
coords = shadow_directional_coordinates(light, lP);
EXPECT_EQ(coords.tilemap_index, 2);
EXPECT_EQ(coords.lod_relative, 0);
EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES - 1, SHADOW_TILEMAP_RES / 2));
EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES, SHADOW_TILEMAP_RES / 2), 1e-3);
// lP = vec3(-0.5, -0.5, 0.0); /* Min of first LOD. */
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tilemap_index, 0);
// EXPECT_EQ(coords.tile_coord, ivec2(0));
// EXPECT_NEAR(coords.uv, vec2(0), 1e-3);
// lP = vec3(0.5, 0.5, 0.0); /* Max of first LOD. */
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tilemap_index, 0);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES - 1));
// EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES), 1e-3);
/* Test clipmap level selection. */
// camera_lP = vec3(2.0, 2.0, 0.0);
/* Follows ShadowDirectional::end_sync(). */
// light.clipmap_base_offset = ivec2(round(camera_lP.xy / lod_min_tile_size));
// EXPECT_EQ(light.clipmap_base_offset, ivec2(32));
// lP = vec3(2.00001, 2.00001, 0.0);
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tilemap_index, 0);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2));
// EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES / 2), 1e-3);
// lP = vec3(1.50001, 1.50001, 0.0);
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tilemap_index, 1);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 4));
// EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES / 4), 1e-3);
// lP = vec3(1.00001, 1.00001, 0.0);
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tilemap_index, 2);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 4));
// EXPECT_NEAR(coords.uv, vec2(SHADOW_TILEMAP_RES / 4), 1e-3);
// lP = vec3(-0.0001, -0.0001, 0.0); /* Out of bounds. */
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tilemap_index, 2);
// EXPECT_EQ(coords.tile_coord, ivec2(0));
// EXPECT_NEAR(coords.uv, vec2(0), 1e-3);
/* Test clipmap offset. */
// light.clipmap_base_offset = ivec2(31, 1);
// lP = vec3(2.0001, 0.0001, 0.0);
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2) + ivec2(1, -1));
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2) + ivec2(1, 0));
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2) + ivec2(1, 0));
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2) + ivec2(1, 0));
/* Test clipmap negative offsets. */
// light.clipmap_base_offset = ivec2(-31, -1);
// lP = vec3(-2.0001, -0.0001, 0.0);
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2 - 1) + ivec2(-1, 1));
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2 - 1) + ivec2(-1, 0));
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2 - 1) + ivec2(-1, 0));
// coords = shadow_directional_coordinates(light, lP);
// EXPECT_EQ(coords.tile_coord, ivec2(SHADOW_TILEMAP_RES / 2 - 1) + ivec2(-1, 0));
}
TEST(eevee_shadow, DirectionalSlopeBias)
{
float near = 0.0, far = 1.0;
LightData light;
light.type = LIGHT_SUN;
light.clip_near = floatBitsToInt(near);
light.clip_far = floatBitsToInt(far);
light.clipmap_lod_min = 0;
/* Position has no effect for directional lights. */
vec3 lP = vec3(0.0);
vec2 atlas_size = vec2(SHADOW_TILEMAP_RES);
{
vec3 lNg = vec3(0.0, 0.0, 1.0);
EXPECT_NEAR(shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 0), 0.0, 3e-7);
EXPECT_NEAR(shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 1), 0.0, 3e-7);
EXPECT_NEAR(shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 2), 0.0, 3e-7);
}
{
vec3 lNg = normalize(vec3(0.0, 1.0, 1.0));
float expect = 1.0 / (SHADOW_TILEMAP_RES * SHADOW_PAGE_RES);
EXPECT_NEAR(shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 0), expect, 3e-7);
EXPECT_NEAR(
shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 1), expect * 2.0, 3e-7);
EXPECT_NEAR(
shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 2), expect * 4.0, 3e-7);
}
{
vec3 lNg = normalize(vec3(1.0, 1.0, 1.0));
float expect = 2.0 / (SHADOW_TILEMAP_RES * SHADOW_PAGE_RES);
EXPECT_NEAR(shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 0), expect, 3e-7);
EXPECT_NEAR(
shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 1), expect * 2.0, 3e-7);
EXPECT_NEAR(
shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 2), expect * 4.0, 3e-7);
}
light.clipmap_lod_min = -1;
{
vec3 lNg = normalize(vec3(1.0, 1.0, 1.0));
float expect = 0.5 * (2.0 / (SHADOW_TILEMAP_RES * SHADOW_PAGE_RES));
EXPECT_NEAR(shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 0), expect, 3e-7);
EXPECT_NEAR(
shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 1), expect * 2.0, 3e-7);
EXPECT_NEAR(
shadow_slope_bias_get(atlas_size, light, lNg, lP, vec2(0.0), 2), expect * 4.0, 3e-7);
}
}
TEST(eevee_shadow, PunctualSlopeBias)
{
float near = 0.5, far = 1.0;
mat4 pers_mat = projection_perspective(-near, near, -near, near, near, far);
mat4 normal_mat = invert(transpose(pers_mat));
LightData light;
light.clip_near = floatBitsToInt(near);
light.clip_far = floatBitsToInt(far);
light.influence_radius_max = far;
light.type = LIGHT_SPOT;
light.normal_mat_packed.x = normal_mat[3][2];
light.normal_mat_packed.y = normal_mat[3][3];
vec2 atlas_size = vec2(SHADOW_TILEMAP_RES);
{
/* Simulate a "2D" plane crossing the frustum diagonally. */
vec3 lP0 = vec3(-1.0, 0.0, -1.0);
vec3 lP1 = vec3(0.5, 0.0, -0.5);
vec3 lTg = normalize(lP1 - lP0);
vec3 lNg = vec3(-lTg.z, 0.0, lTg.x);
float expect = 1.0 / (SHADOW_TILEMAP_RES * SHADOW_PAGE_RES);
EXPECT_NEAR(shadow_slope_bias_get(atlas_size, light, lNg, lP0, vec2(0.0), 0), expect, 1e-4);
EXPECT_NEAR(
shadow_slope_bias_get(atlas_size, light, lNg, lP0, vec2(0.0), 1), expect * 2.0, 1e-4);
EXPECT_NEAR(
shadow_slope_bias_get(atlas_size, light, lNg, lP0, vec2(0.0), 2), expect * 4.0, 1e-4);
}
{
/* Simulate a "2D" plane crossing the near plane at the center diagonally. */
vec3 lP0 = vec3(-1.0, 0.0, -1.0);
vec3 lP1 = vec3(0.0, 0.0, -0.5);
vec3 lTg = normalize(lP1 - lP0);
vec3 lNg = vec3(-lTg.z, 0.0, lTg.x);
float expect = 2.0 / (SHADOW_TILEMAP_RES * SHADOW_PAGE_RES);
EXPECT_NEAR(shadow_slope_bias_get(atlas_size, light, lNg, lP0, vec2(0.0), 0), expect, 1e-4);
EXPECT_NEAR(
shadow_slope_bias_get(atlas_size, light, lNg, lP0, vec2(0.0), 1), expect * 2.0, 1e-4);
EXPECT_NEAR(
shadow_slope_bias_get(atlas_size, light, lNg, lP0, vec2(0.0), 2), expect * 4.0, 1e-4);
}
{
/* Simulate a "2D" plane parallel to the near clip plane. */
vec3 lP0 = vec3(-1.0, 0.0, -0.75);
vec3 lP1 = vec3(0.0, 0.0, -0.75);
vec3 lTg = normalize(lP1 - lP0);
vec3 lNg = vec3(-lTg.z, 0.0, lTg.x);
EXPECT_NEAR(shadow_slope_bias_get(atlas_size, light, lNg, lP0, vec2(0.0), 0), 0.0, 1e-4);
EXPECT_NEAR(shadow_slope_bias_get(atlas_size, light, lNg, lP0, vec2(0.0), 1), 0.0, 1e-4);
EXPECT_NEAR(shadow_slope_bias_get(atlas_size, light, lNg, lP0, vec2(0.0), 2), 0.0, 1e-4);
}
}
}

View File

@@ -1,78 +0,0 @@
/**
* Virtual shadowmapping: Bounds computation for directional shadows.
*
* Iterate through all shadow casters and extract min/max per directional shadow.
* This needs to happen first in the pipeline to allow tagging all relevant tilemaps as dirty if
* their range changes.
*/
#pragma BLENDER_REQUIRE(gpu_shader_utildefines_lib.glsl)
#pragma BLENDER_REQUIRE(common_math_lib.glsl)
#pragma BLENDER_REQUIRE(common_intersect_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_shadow_tilemap_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_light_iter_lib.glsl)
shared int global_min;
shared int global_max;
void main()
{
uint index = gl_GlobalInvocationID.x;
/* Keep uniform control flow. Do not return. */
index = min(index, uint(resource_len) - 1);
uint resource_id = casters_id_buf[index];
ObjectBounds bounds = bounds_buf[resource_id];
IsectBox box = isect_data_setup(bounds.bounding_corners[0].xyz,
bounds.bounding_corners[1].xyz,
bounds.bounding_corners[2].xyz,
bounds.bounding_corners[3].xyz);
LIGHT_FOREACH_BEGIN_DIRECTIONAL(light_cull_buf, l_idx)
{
LightData light = light_buf[l_idx];
float local_min = FLT_MAX;
float local_max = -FLT_MAX;
for (int i = 0; i < 8; i++) {
float z = dot(box.corners[i].xyz, light._back);
local_min = min(local_min, z);
local_max = max(local_max, z);
}
if (gl_LocalInvocationID.x == 0) {
global_min = floatBitsToOrderedInt(FLT_MAX);
global_max = floatBitsToOrderedInt(-FLT_MAX);
}
barrier();
/* Quantization bias. */
local_min -= abs(local_min) * 0.01;
local_max += abs(local_max) * 0.01;
/* Intermediate result. Min/Max of a compute group. */
atomicMin(global_min, floatBitsToOrderedInt(local_min));
atomicMax(global_max, floatBitsToOrderedInt(local_max));
barrier();
if (gl_LocalInvocationID.x == 0) {
/* Final result. Min/Max of the whole dispatch. */
atomicMin(light_buf[l_idx].clip_far, global_min);
atomicMax(light_buf[l_idx].clip_near, global_max);
/* TODO(fclem): This feels unnecessary but we currently have no indexing from
 * tilemap to lights. This is because the lights are selected by the culling phase. */
for (int i = light.tilemap_index; i <= light_tilemap_max_get(light); i++) {
int index = tilemaps_buf[i].clip_data_index;
atomicMin(tilemaps_clip_buf[index].clip_far, global_min);
atomicMax(tilemaps_clip_buf[index].clip_near, global_max);
}
}
/* No need for barrier here since global_min/max is only read by thread 0 before being reset by
* thread 0. */
}
LIGHT_FOREACH_END
}
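The reduction above uses integer atomics on float bit patterns via an ordered-int encoding. A C sketch assuming the common sign-flip mapping (the actual floatBitsToOrderedInt helper may differ in detail):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Negative floats have reversed integer ordering: flipping their value bits
 * makes signed-int comparison match float comparison. The mapping is its
 * own inverse. */
static int32_t float_bits_to_ordered_int(float f)
{
  int32_t i;
  memcpy(&i, &f, sizeof(i));
  return (i < 0) ? (i ^ 0x7FFFFFFF) : i;
}

static float ordered_int_bits_to_float(int32_t i)
{
  int32_t j = (i < 0) ? (i ^ 0x7FFFFFFF) : i;
  float f;
  memcpy(&f, &j, sizeof(f));
  return f;
}

int main(void)
{
  float vals[4] = {-2.0f, -0.5f, 0.25f, 3.0f};
  for (int k = 0; k < 3; k++) {
    /* Ordering of encoded ints must match ordering of the floats. */
    printf("%g < %g : %d\n", vals[k], vals[k + 1],
           float_bits_to_ordered_int(vals[k]) < float_bits_to_ordered_int(vals[k + 1]));
  }
  printf("round-trip: %g\n", ordered_int_bits_to_float(float_bits_to_ordered_int(-2.0f)));
  return 0;
}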

View File

@@ -1,181 +0,0 @@
/**
* Virtual shadowmapping: Tilemap to texture conversion.
*
* For all visible light tilemaps, copy the page coordinates to a texture.
* This avoids one level of indirection when evaluating shadows and allows
* using a sampler instead of an SSBO bind.
*/
#pragma BLENDER_REQUIRE(gpu_shader_utildefines_lib.glsl)
#pragma BLENDER_REQUIRE(common_math_lib.glsl)
#pragma BLENDER_REQUIRE(eevee_shadow_tilemap_lib.glsl)
shared uint tile_updates_count;
shared int view_index;
void page_clear_buf_append(uint page_packed)
{
uint clear_page_index = atomicAdd(clear_dispatch_buf.num_groups_z, 1u);
clear_page_buf[clear_page_index] = page_packed;
}
void page_tag_as_rendered(ivec2 tile_co, int tiles_index, int lod)
{
int tile_index = shadow_tile_offset(tile_co, tiles_index, lod);
tiles_buf[tile_index] |= SHADOW_IS_RENDERED;
atomicAdd(statistics_buf.page_rendered_count, 1);
}
void main()
{
if (all(equal(gl_LocalInvocationID, uvec3(0)))) {
tile_updates_count = uint(0);
}
barrier();
int tilemap_index = int(gl_GlobalInvocationID.z);
ivec2 tile_co = ivec2(gl_GlobalInvocationID.xy);
ivec2 atlas_texel = shadow_tile_coord_in_atlas(tile_co, tilemap_index);
ShadowTileMapData tilemap_data = tilemaps_buf[tilemap_index];
int lod_max = (tilemap_data.projection_type == SHADOW_PROJECTION_CUBEFACE) ? SHADOW_TILEMAP_LOD :
0;
int lod_valid = 0;
/* One bit per lod. */
int do_lod_update = 0;
/* Packed page (packUvec2x16) to render per LOD. */
uint updated_lod_page[SHADOW_TILEMAP_LOD + 1];
uvec2 page_valid;
/* With all threads (LOD0-sized dispatch), load each LOD tile from the highest LOD
 * to the lowest, keeping track of the lowest one allocated, which will be used for shadowing.
 * Also save which pages are to be updated. */
for (int lod = SHADOW_TILEMAP_LOD; lod >= 0; lod--) {
if (lod > lod_max) {
updated_lod_page[lod] = 0xFFFFFFFFu;
continue;
}
int tile_index = shadow_tile_offset(tile_co >> lod, tilemap_data.tiles_index, lod);
ShadowTileData tile = shadow_tile_unpack(tiles_buf[tile_index]);
if (tile.is_used && tile.do_update) {
do_lod_update = 1 << lod;
updated_lod_page[lod] = packUvec2x16(tile.page);
}
else {
updated_lod_page[lod] = 0xFFFFFFFFu;
}
/* Save the lowest used LOD (> 0) for this thread. */
if (tile.is_used && lod > 0) {
/* Reload the page in case there was an allocation in the valid thread. */
page_valid = tile.page;
lod_valid = lod;
}
else if (lod == 0 && lod_valid != 0 && !tile.is_allocated) {
/* If the tile is not used, store the valid LOD level in LOD0. */
tile.page = page_valid;
tile.lod = lod_valid;
/* This is not a real ownership. It is just a tag so that the shadowing is deemed correct. */
tile.is_allocated = true;
}
if (lod == 0) {
imageStore(tilemaps_img, atlas_texel, uvec4(shadow_tile_pack(tile)));
}
}
if (do_lod_update > 0) {
atomicAdd(tile_updates_count, 1u);
}
barrier();
if (all(equal(gl_LocalInvocationID, uvec3(0)))) {
/* No update by default. */
view_index = 64;
if (tile_updates_count > 0) {
view_index = atomicAdd(pages_infos_buf.view_count, 1);
if (view_index < 64) {
view_infos_buf[view_index].viewmat = tilemap_data.viewmat;
view_infos_buf[view_index].viewinv = inverse(tilemap_data.viewmat);
if (tilemap_data.projection_type != SHADOW_PROJECTION_CUBEFACE) {
int clip_index = tilemap_data.clip_data_index;
/* For directional lights, we need to modify the winmat to encompass all casters. */
float clip_far = -tilemaps_clip_buf[clip_index].clip_far_stored;
float clip_near = -tilemaps_clip_buf[clip_index].clip_near_stored;
tilemap_data.winmat[2][2] = -2.0 / (clip_far - clip_near);
tilemap_data.winmat[3][2] = -(clip_far + clip_near) / (clip_far - clip_near);
}
view_infos_buf[view_index].winmat = tilemap_data.winmat;
view_infos_buf[view_index].wininv = inverse(tilemap_data.winmat);
}
}
}
barrier();
if (view_index < 64) {
ivec3 render_map_texel = ivec3(tile_co, view_index);
/* Store page indirection for rendering. Update every texel in the view array level. */
if (true) {
imageStore(render_map_lod0_img, render_map_texel, uvec4(updated_lod_page[0]));
if (updated_lod_page[0] != 0xFFFFFFFFu) {
page_clear_buf_append(updated_lod_page[0]);
page_tag_as_rendered(render_map_texel.xy, tilemap_data.tiles_index, 0);
}
}
render_map_texel.xy >>= 1;
if (all(equal(tile_co, render_map_texel.xy << 1u))) {
imageStore(render_map_lod1_img, render_map_texel, uvec4(updated_lod_page[1]));
if (updated_lod_page[1] != 0xFFFFFFFFu) {
page_clear_buf_append(updated_lod_page[1]);
page_tag_as_rendered(render_map_texel.xy, tilemap_data.tiles_index, 1);
}
}
render_map_texel.xy >>= 1;
if (all(equal(tile_co, render_map_texel.xy << 2u))) {
imageStore(render_map_lod2_img, render_map_texel, uvec4(updated_lod_page[2]));
if (updated_lod_page[2] != 0xFFFFFFFFu) {
page_clear_buf_append(updated_lod_page[2]);
page_tag_as_rendered(render_map_texel.xy, tilemap_data.tiles_index, 2);
}
}
render_map_texel.xy >>= 1;
if (all(equal(tile_co, render_map_texel.xy << 3u))) {
imageStore(render_map_lod3_img, render_map_texel, uvec4(updated_lod_page[3]));
if (updated_lod_page[3] != 0xFFFFFFFFu) {
page_clear_buf_append(updated_lod_page[3]);
page_tag_as_rendered(render_map_texel.xy, tilemap_data.tiles_index, 3);
}
}
render_map_texel.xy >>= 1;
if (all(equal(tile_co, render_map_texel.xy << 4u))) {
imageStore(render_map_lod4_img, render_map_texel, uvec4(updated_lod_page[4]));
if (updated_lod_page[4] != 0xFFFFFFFFu) {
page_clear_buf_append(updated_lod_page[4]);
page_tag_as_rendered(render_map_texel.xy, tilemap_data.tiles_index, 4);
}
}
render_map_texel.xy >>= 1;
if (all(equal(tile_co, render_map_texel.xy << 5u))) {
imageStore(render_map_lod5_img, render_map_texel, uvec4(updated_lod_page[5]));
if (updated_lod_page[5] != 0xFFFFFFFFu) {
page_clear_buf_append(updated_lod_page[5]);
page_tag_as_rendered(render_map_texel.xy, tilemap_data.tiles_index, 5);
}
}
}
if (all(equal(gl_GlobalInvocationID, uvec3(0)))) {
/* Clamp it as it can underflow if there are too many tiles present on screen. */
pages_infos_buf.page_free_count = max(pages_infos_buf.page_free_count, 0);
}
}
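In the per-LOD stores above, only the thread at the top-left corner of each 2^lod tile block writes the coarser render-map texel. A tiny C sketch of that selection test:

#include <stdio.h>

int main(void)
{
  /* A thread writes the LOD-n render map only if its tile coordinate is
   * the top-left corner of its 2^n block, i.e. (co >> n) << n == co. */
  for (int lod = 1; lod <= 2; lod++) {
    printf("lod %d writers: ", lod);
    for (int co = 0; co < 8; co++) {
      if (((co >> lod) << lod) == co) {
        printf("%d ", co);
      }
    }
    printf("\n");
  }
  return 0;
}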

Some files were not shown because too many files have changed in this diff Show More