This repository has been archived on 2023-10-09. You can view files and clone it, but cannot push or open issues or pull requests.
Files
blender-archive/source/blender/gpu/GPU_index_buffer.h

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

132 lines
4.7 KiB
C++
Raw Normal View History

/*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software Foundation,
* Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*
* The Original Code is Copyright (C) 2016 by Mike Erwin.
* All rights reserved.
*/
/** \file
* \ingroup gpu
*
* GPU index buffer
*/
#pragma once
#include "GPU_primitive.h"
#ifdef __cplusplus
extern "C" {
#endif
/** Opaque type hiding blender::gpu::IndexBuf. */
typedef struct GPUIndexBuf GPUIndexBuf;
GPUIndexBuf *GPU_indexbuf_calloc(void);
typedef struct GPUIndexBufBuilder {
uint max_allowed_index;
uint max_index_len;
uint index_len;
uint index_min;
uint index_max;
GPUPrimType prim_type;
uint32_t *data;
} GPUIndexBufBuilder;
/* supports all primitive types. */
void GPU_indexbuf_init_ex(GPUIndexBufBuilder *, GPUPrimType, uint index_len, uint vertex_len);
/* supports only GPU_PRIM_POINTS, GPU_PRIM_LINES and GPU_PRIM_TRIS. */
2018-07-18 23:09:31 +10:00
void GPU_indexbuf_init(GPUIndexBufBuilder *, GPUPrimType, uint prim_len, uint vertex_len);
GPUIndexBuf *GPU_indexbuf_build_on_device(uint index_len);
OpenSubDiv: add support for an OpenGL evaluator This evaluator is used in order to evaluate subdivision at render time, allowing for faster renders of meshes with a subdivision surface modifier placed at the last position in the modifier list. When evaluating the subsurf modifier, we detect whether we can delegate evaluation to the draw code. If so, the subdivision is first evaluated on the GPU using our own custom evaluator (only the coarse data needs to be initially sent to the GPU), then, buffers for the final `MeshBufferCache` are filled on the GPU using a set of compute shaders. However, some buffers are still filled on the CPU side, if doing so on the GPU is impractical (e.g. the line adjacency buffer used for x-ray, whose logic is hardly GPU compatible). This is done at the mesh buffer extraction level so that the result can be readily used in the various OpenGL engines, without having to write custom geometry or tesselation shaders. We use our own subdivision evaluation shaders, instead of OpenSubDiv's vanilla one, in order to control the data layout, and interpolation. For example, we store vertex colors as compressed 16-bit integers, while OpenSubDiv's default evaluator only work for float types. In order to still access the modified geometry on the CPU side, for use in modifiers or transform operators, a dedicated wrapper type is added `MESH_WRAPPER_TYPE_SUBD`. Subdivision will be lazily evaluated via `BKE_object_get_evaluated_mesh` which will create such a wrapper if possible. If the final subdivision surface is not needed on the CPU side, `BKE_object_get_evaluated_mesh_no_subsurf` should be used. Enabling or disabling GPU subdivision can be done through the user preferences (under Viewport -> Subdivision). See patch description for benchmarks. Reviewed By: campbellbarton, jbakker, fclem, brecht, #eevee_viewport Differential Revision: https://developer.blender.org/D12406
2021-12-27 16:34:47 +01:00
void GPU_indexbuf_init_build_on_device(GPUIndexBuf *elem, uint index_len);
GPU: Thread safe index buffer builders. Current index builder is designed to be used in a single thread. This makes all index buffer extractions single threaded. This patch adds a thread safe solution enabling multithreaded building of index buffers. To reduce locking the solution would provide a task/thread local index buffer builder (called sub builder). When a thread is finished this thread local index buffer builder can be joined with the initial index buffer builder. `GPU_indexbuf_subbuilder_init`: Initialized a sub builder. The index list is shared between the parent and sub buffer, but the counters are localized. Ensuring that updating counters would not need any locking. `GPU_indexbuf_subbuilder_finish`: merge the information of the sub builder back to the parent builder. Needs to be invoked outside the worker thread, or when sure that all worker threads have been finished. Internal the function is not thread safe. For testing purposes the extract_points extractor has been migrated to the new API. Herefore changes to the mesh extractor were needed. * When creating tasks, the task number of current task is stored in ExtractTaskData including the total number of tasks. * Adding two functions in `MeshExtract`. ** `task_init` will initialize the task specific userdata. ** `task_finish` should merge back the task specific userdata back. * adding task_id parameter to the iteration functions so they can access the correct task data without any need for locking. There is no noticeable change in end user performance. Reviewed By: mano-wii Differential Revision: https://developer.blender.org/D11499
2021-06-08 16:35:33 +02:00
/*
Refactor: Draw Cache: use 'BLI_task_parallel_range' This is an adaptation of {D11488}. A disadvantage of manually setting the iter ranges per thread is that we don't know how many threads are running in the background and so we don't know how to best distribute the ranges. To solve this limitation we can use `parallel_reduce` and thus let the driver choose the best distribution of ranges among the threads. This proved to be especially beneficial for computers with few cores. **Benchmarking:** Here's the result on an 4-core laptop: ||master:|PATCH: |---|---|---| |large_mesh_editing:|Average: 5.203638 FPS|Average: 5.398925 FPS ||rdata 15ms iter 43ms (frame 193ms)|rdata 14ms iter 36ms (frame 187ms) Here's the result on an 8-core PC: ||master:|PATCH: |---|---|---| |large_mesh_editing:|Average: 15.267482 FPS|Average: 15.906881 FPS ||rdata 9ms iter 28ms (frame 65ms)|rdata 9ms iter 25ms (frame 63ms) |large_mesh_editing_ledge: |Average: 15.145966 FPS|Average: 15.520474 FPS ||rdata 9ms iter 29ms (frame 65ms)|rdata 9ms iter 25ms (frame 64ms) |looptris_test:|Average: 4.001917 FPS|Average: 4.061105 FPS ||rdata 12ms iter 90ms (frame 236ms)|rdata 12ms iter 87ms (frame 230ms) |subdiv_mesh_cage_and_final:|Average: 1.917769 FPS|Average: 1.971790 FPS ||rdata 7ms iter 37ms (frame 261ms)|rdata 7ms iter 31ms (frame 258ms) ||rdata 7ms iter 38ms (frame 252ms)|rdata 7ms iter 33ms (frame 249ms) |subdiv_mesh_final_only:|Average: 6.387240 FPS|Average: 6.591251 FPS ||rdata 3ms iter 25ms (frame 151ms)|rdata 3ms iter 16ms (frame 145ms) |subdiv_mesh_final_only_ledge:|Average: 6.247393 FPS|Average: 6.596024 FPS ||rdata 3ms iter 26ms (frame 158ms)|rdata 3ms iter 16ms (frame 148ms) **Notes:** - The improvement can only be noticed if all extracts are multithreaded. - This patch touches different areas of the code, so it can be split into another patch if the idea is accepted. These screenshots show how threads behave in a quadcore: Master: {F10164664} Patch: {F10164666} Differential Revision: https://developer.blender.org/D11558
2021-06-10 11:01:36 -03:00
* Thread safe.
GPU: Thread safe index buffer builders. Current index builder is designed to be used in a single thread. This makes all index buffer extractions single threaded. This patch adds a thread safe solution enabling multithreaded building of index buffers. To reduce locking the solution would provide a task/thread local index buffer builder (called sub builder). When a thread is finished this thread local index buffer builder can be joined with the initial index buffer builder. `GPU_indexbuf_subbuilder_init`: Initialized a sub builder. The index list is shared between the parent and sub buffer, but the counters are localized. Ensuring that updating counters would not need any locking. `GPU_indexbuf_subbuilder_finish`: merge the information of the sub builder back to the parent builder. Needs to be invoked outside the worker thread, or when sure that all worker threads have been finished. Internal the function is not thread safe. For testing purposes the extract_points extractor has been migrated to the new API. Herefore changes to the mesh extractor were needed. * When creating tasks, the task number of current task is stored in ExtractTaskData including the total number of tasks. * Adding two functions in `MeshExtract`. ** `task_init` will initialize the task specific userdata. ** `task_finish` should merge back the task specific userdata back. * adding task_id parameter to the iteration functions so they can access the correct task data without any need for locking. There is no noticeable change in end user performance. Reviewed By: mano-wii Differential Revision: https://developer.blender.org/D11499
2021-06-08 16:35:33 +02:00
*
2021-10-20 09:19:21 +11:00
* Function inspired by the reduction directives of multi-thread work API's.
GPU: Thread safe index buffer builders. Current index builder is designed to be used in a single thread. This makes all index buffer extractions single threaded. This patch adds a thread safe solution enabling multithreaded building of index buffers. To reduce locking the solution would provide a task/thread local index buffer builder (called sub builder). When a thread is finished this thread local index buffer builder can be joined with the initial index buffer builder. `GPU_indexbuf_subbuilder_init`: Initialized a sub builder. The index list is shared between the parent and sub buffer, but the counters are localized. Ensuring that updating counters would not need any locking. `GPU_indexbuf_subbuilder_finish`: merge the information of the sub builder back to the parent builder. Needs to be invoked outside the worker thread, or when sure that all worker threads have been finished. Internal the function is not thread safe. For testing purposes the extract_points extractor has been migrated to the new API. Herefore changes to the mesh extractor were needed. * When creating tasks, the task number of current task is stored in ExtractTaskData including the total number of tasks. * Adding two functions in `MeshExtract`. ** `task_init` will initialize the task specific userdata. ** `task_finish` should merge back the task specific userdata back. * adding task_id parameter to the iteration functions so they can access the correct task data without any need for locking. There is no noticeable change in end user performance. Reviewed By: mano-wii Differential Revision: https://developer.blender.org/D11499
2021-06-08 16:35:33 +02:00
*/
2021-06-18 14:27:43 +10:00
void GPU_indexbuf_join(GPUIndexBufBuilder *builder, const GPUIndexBufBuilder *builder_from);
GPU: Thread safe index buffer builders. Current index builder is designed to be used in a single thread. This makes all index buffer extractions single threaded. This patch adds a thread safe solution enabling multithreaded building of index buffers. To reduce locking the solution would provide a task/thread local index buffer builder (called sub builder). When a thread is finished this thread local index buffer builder can be joined with the initial index buffer builder. `GPU_indexbuf_subbuilder_init`: Initialized a sub builder. The index list is shared between the parent and sub buffer, but the counters are localized. Ensuring that updating counters would not need any locking. `GPU_indexbuf_subbuilder_finish`: merge the information of the sub builder back to the parent builder. Needs to be invoked outside the worker thread, or when sure that all worker threads have been finished. Internal the function is not thread safe. For testing purposes the extract_points extractor has been migrated to the new API. Herefore changes to the mesh extractor were needed. * When creating tasks, the task number of current task is stored in ExtractTaskData including the total number of tasks. * Adding two functions in `MeshExtract`. ** `task_init` will initialize the task specific userdata. ** `task_finish` should merge back the task specific userdata back. * adding task_id parameter to the iteration functions so they can access the correct task data without any need for locking. There is no noticeable change in end user performance. Reviewed By: mano-wii Differential Revision: https://developer.blender.org/D11499
2021-06-08 16:35:33 +02:00
2018-07-18 23:09:31 +10:00
void GPU_indexbuf_add_generic_vert(GPUIndexBufBuilder *, uint v);
void GPU_indexbuf_add_primitive_restart(GPUIndexBufBuilder *);
2018-07-18 23:09:31 +10:00
void GPU_indexbuf_add_point_vert(GPUIndexBufBuilder *, uint v);
void GPU_indexbuf_add_line_verts(GPUIndexBufBuilder *, uint v1, uint v2);
void GPU_indexbuf_add_tri_verts(GPUIndexBufBuilder *, uint v1, uint v2, uint v3);
void GPU_indexbuf_add_line_adj_verts(GPUIndexBufBuilder *, uint v1, uint v2, uint v3, uint v4);
void GPU_indexbuf_set_point_vert(GPUIndexBufBuilder *builder, uint elem, uint v1);
void GPU_indexbuf_set_line_verts(GPUIndexBufBuilder *builder, uint elem, uint v1, uint v2);
void GPU_indexbuf_set_tri_verts(GPUIndexBufBuilder *builder, uint elem, uint v1, uint v2, uint v3);
/* Skip primitive rendering at the given index. */
void GPU_indexbuf_set_point_restart(GPUIndexBufBuilder *builder, uint elem);
void GPU_indexbuf_set_line_restart(GPUIndexBufBuilder *builder, uint elem);
void GPU_indexbuf_set_tri_restart(GPUIndexBufBuilder *builder, uint elem);
2018-07-18 23:09:31 +10:00
GPUIndexBuf *GPU_indexbuf_build(GPUIndexBufBuilder *);
void GPU_indexbuf_build_in_place(GPUIndexBufBuilder *, GPUIndexBuf *);
void GPU_indexbuf_bind_as_ssbo(GPUIndexBuf *elem, int binding);
OpenSubDiv: add support for an OpenGL evaluator This evaluator is used in order to evaluate subdivision at render time, allowing for faster renders of meshes with a subdivision surface modifier placed at the last position in the modifier list. When evaluating the subsurf modifier, we detect whether we can delegate evaluation to the draw code. If so, the subdivision is first evaluated on the GPU using our own custom evaluator (only the coarse data needs to be initially sent to the GPU), then, buffers for the final `MeshBufferCache` are filled on the GPU using a set of compute shaders. However, some buffers are still filled on the CPU side, if doing so on the GPU is impractical (e.g. the line adjacency buffer used for x-ray, whose logic is hardly GPU compatible). This is done at the mesh buffer extraction level so that the result can be readily used in the various OpenGL engines, without having to write custom geometry or tesselation shaders. We use our own subdivision evaluation shaders, instead of OpenSubDiv's vanilla one, in order to control the data layout, and interpolation. For example, we store vertex colors as compressed 16-bit integers, while OpenSubDiv's default evaluator only work for float types. In order to still access the modified geometry on the CPU side, for use in modifiers or transform operators, a dedicated wrapper type is added `MESH_WRAPPER_TYPE_SUBD`. Subdivision will be lazily evaluated via `BKE_object_get_evaluated_mesh` which will create such a wrapper if possible. If the final subdivision surface is not needed on the CPU side, `BKE_object_get_evaluated_mesh_no_subsurf` should be used. Enabling or disabling GPU subdivision can be done through the user preferences (under Viewport -> Subdivision). See patch description for benchmarks. Reviewed By: campbellbarton, jbakker, fclem, brecht, #eevee_viewport Differential Revision: https://developer.blender.org/D12406
2021-12-27 16:34:47 +01:00
/* Upload data to the GPU (if not built on the device) and bind the buffer to its default target.
*/
void GPU_indexbuf_use(GPUIndexBuf *elem);
/* Partially update the GPUIndexBuf which was already sent to the device, or built directly on the
* device. The data needs to be compatible with potential compression applied to the original
* indices when the index buffer was built, i.e., if the data was compressed to use shorts instead
* of ints, shorts should passed here. */
void GPU_indexbuf_update_sub(GPUIndexBuf *elem, uint start, uint len, const void *data);
2021-02-05 16:23:34 +11:00
/* Create a sub-range of an existing index-buffer. */
GPUIndexBuf *GPU_indexbuf_create_subrange(GPUIndexBuf *elem_src, uint start, uint length);
void GPU_indexbuf_create_subrange_in_place(GPUIndexBuf *elem,
GPUIndexBuf *elem_src,
uint start,
uint length);
/**
* (Download and) return a pointer containing the data of an index buffer.
*
* Note that the returned pointer is still owned by the driver. To get an
* local copy, use `GPU_indexbuf_unmap` after calling `GPU_indexbuf_read`.
*/
const uint32_t *GPU_indexbuf_read(GPUIndexBuf *elem);
2021-05-27 10:46:43 +02:00
uint32_t *GPU_indexbuf_unmap(const GPUIndexBuf *elem, const uint32_t *mapped_buffer);
2020-09-07 13:59:22 +02:00
void GPU_indexbuf_discard(GPUIndexBuf *elem);
2020-09-07 13:59:22 +02:00
bool GPU_indexbuf_is_init(GPUIndexBuf *elem);
int GPU_indexbuf_primitive_len(GPUPrimType prim_type);
/* Macros */
#define GPU_INDEXBUF_DISCARD_SAFE(elem) \
do { \
if (elem != NULL) { \
GPU_indexbuf_discard(elem); \
elem = NULL; \
} \
} while (0)
#ifdef __cplusplus
}
#endif