This is a necessary step for EEVEE's new architecture. It moves more data
to the draw manager, which makes it easier for the render or draw
engines to manage their own data.
This makes more sense and cleans up what the GPUViewport holds.
Also rewrites the Texture pool manager to be in C++.
This also moves the DefaultFramebuffer/TextureList and the engine related
data to a new `DRWViewData` struct. This struct manages the per view
(as in stereo view) engine data.
There is a bit of cleanup in the way the draw manager is set up.
We now use a temporary DRWData instead of creating a dummy viewport.
Development: fclem, jbakker
Differential Revision: https://developer.blender.org/D11966
The goal is to add the length attribute to the Hair Info node, for better control over color gradients or similar effects along the hair.
Reviewed By: #eevee_viewport, brecht
Differential Revision: https://developer.blender.org/D10481
No functional change.
The shader is complicated by itself; having hardcoded values makes it
even more cryptic.
I also renamed the shader because it is not only for the keyframe diamond,
but for all the keyframe shapes.
Differential Revision: https://developer.blender.org/D12615
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.
Some features have also been removed or changed, breaking backwards compatibility.
This includes the removal of the OpenCL backend, for which alternatives are under
development.
Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://wiki.blender.org/wiki/Source/Render/Cycles
Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)
For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.
Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T77185, T69800
This addresses reduced visibility of scenes (as displayed in the VR
headset) that can result from the 8-bit color depth format currently
used for XR swapchain images.
By switching to a swapchain format with higher color depth (RGB10_A2,
RGBA16, RGBA16F) for supported runtimes, visibility in VR should be
noticeably improved.
However, some XR runtimes currently lack support for these higher
color depth formats, especially for OpenGL.
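The fallback this implies can be sketched as picking the first runtime-supported format from a preference list. This is a minimal sketch only: `SwapchainFormat` and `choose_swapchain_format` are hypothetical names, and the preference order is an assumption, not what the patch necessarily does.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical format identifiers; real code would use graphics API enum values.
enum class SwapchainFormat { RGBA8, RGB10_A2, RGBA16, RGBA16F };

// Return the first preferred high bit-depth format that the runtime reports
// as supported. Fall back to 8-bit RGBA when none of them is available.
inline SwapchainFormat choose_swapchain_format(
    const std::vector<SwapchainFormat> &runtime_formats)
{
  // Assumed preference order, taken from the order the formats are listed in.
  const SwapchainFormat preferred[] = {
      SwapchainFormat::RGB10_A2, SwapchainFormat::RGBA16, SwapchainFormat::RGBA16F};
  for (const SwapchainFormat format : preferred) {
    if (std::find(runtime_formats.begin(), runtime_formats.end(), format) !=
        runtime_formats.end()) {
      return format;
    }
  }
  return SwapchainFormat::RGBA8;
}
```

A runtime that only reports the 8-bit format keeps the old behavior, so the change stays safe on runtimes without high color depth support.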
Note that GPU_offscreen_create() now explicitly
takes in the texture format (eGPUTextureFormat) instead of a
"high_bitdepth" boolean.
Reviewed By: Julian Eisel, Clément Foucault
Differential Revision: http://developer.blender.org/D9842
ID data-blocks that could be accessed from Python and weren't freed
using BKE_id_free_ex did not release the Python reference count.
Add BKE_libblock_free_data_py function to clear the Python reference
in this case.
Add asserts to ensure no Python reference is held in situations
when IDs are copied for internal use (not exposed through the RNA API),
to ensure these kinds of leaks don't go unnoticed again.
rBfb87d236edb7 made the values returned by `projmat_dimensions` more
standardized, following the documentation. But the functions in Blender
that called `projmat_dimensions` still followed the convention that these
values corresponded to a clip distance of 1m.
Adjust these functions to follow the new algorithm.
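To illustrate the difference between the two conventions, here is a minimal sketch that recovers frustum extents from a perspective matrix, scaled to the near clip plane rather than to unit distance. It assumes a standard OpenGL-style frustum matrix stored column-major; `Mat4`, `frustum_matrix` and `perspective_dimensions` are hypothetical names and this is not the actual `projmat_dimensions` code.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<std::array<float, 4>, 4>;  // Column-major: m[column][row].

struct FrustumDims {
  float left, right, bottom, top, near_clip, far_clip;
};

// Build an OpenGL-style perspective (frustum) matrix.
inline Mat4 frustum_matrix(float l, float r, float b, float t, float n, float f)
{
  Mat4 m{};  // Zero-initialized.
  m[0][0] = 2.0f * n / (r - l);
  m[1][1] = 2.0f * n / (t - b);
  m[2][0] = (r + l) / (r - l);
  m[2][1] = (t + b) / (t - b);
  m[2][2] = -(f + n) / (f - n);
  m[2][3] = -1.0f;
  m[3][2] = -2.0f * f * n / (f - n);
  return m;
}

// Recover the frustum extents at the near clip plane.
inline FrustumDims perspective_dimensions(const Mat4 &m)
{
  FrustumDims d;
  d.near_clip = m[3][2] / (m[2][2] - 1.0f);
  d.far_clip = m[3][2] / (m[2][2] + 1.0f);
  // Without the `near_clip` factor these would be extents at unit distance.
  d.left = d.near_clip * (m[2][0] - 1.0f) / m[0][0];
  d.right = d.near_clip * (m[2][0] + 1.0f) / m[0][0];
  d.bottom = d.near_clip * (m[2][1] - 1.0f) / m[1][1];
  d.top = d.near_clip * (m[2][1] + 1.0f) / m[1][1];
  return d;
}
```

A caller written for the unit-distance convention would have to divide the extents by the near clip distance (or be rewritten) to keep producing the same results.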
The crash happens because `GPU_offscreen_create` is called with `err_out` `NULL`.
This patch proposes a solution within `GPU_offscreen_create` itself
and raises an error report in the interface if a menu is called with
dimensions beyond what is supported.
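The general shape of such a fix can be sketched as guarding every write to an optional error buffer. This is an illustrative sketch only: `offscreen_create_checked`, its parameters and the error text are hypothetical and do not reproduce the real `GPU_offscreen_create` signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for an offscreen buffer constructor.
// `err_out` may be null, so every write to it is guarded.
inline bool offscreen_create_checked(
    int width, int height, int max_size, char *err_out, std::size_t err_out_size)
{
  const bool can_report = (err_out != nullptr && err_out_size > 0);
  if (width <= 0 || height <= 0 || width > max_size || height > max_size) {
    if (can_report) {
      std::snprintf(err_out,
                    err_out_size,
                    "Unsupported size %dx%d (max %d)",
                    width,
                    height,
                    max_size);
    }
    return false;
  }
  if (can_report) {
    err_out[0] = '\0';  // Clear any stale message on success.
  }
  return true;
}
```

Callers that pass a buffer can forward the message to the interface as an error report, while callers that pass null simply get a failure result.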
Ref T89782
Maniphest Tasks: T89782
Differential Revision: https://developer.blender.org/D11927
- Added functions to check if the cursor is at a number.
- Added function to parse a number.
- Joined skip_separator functions.
- Added function to check if the cursor is at any of a given set of characters.
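The helpers listed above could look roughly like the following. This is a sketch with hypothetical names and signatures (`at_number`, `at_any`, `skip_separators`, `parse_number`); the functions added by the patch may differ.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Check if the cursor is at a decimal digit.
inline bool at_number(const char *pos)
{
  return *pos >= '0' && *pos <= '9';
}

// Check if the cursor is at any character of the given set.
inline bool at_any(const char *pos, const char *chars)
{
  // Guard against '\0': strchr would otherwise match the set's terminator.
  return *pos != '\0' && std::strchr(chars, *pos) != nullptr;
}

// Single skip function, parameterized by the separator set.
inline const char *skip_separators(const char *pos, const char *separators)
{
  while (at_any(pos, separators)) {
    pos++;
  }
  return pos;
}

// Parse a decimal number and return the cursor just past it.
inline const char *parse_number(const char *pos, int *r_value)
{
  char *end = nullptr;
  *r_value = static_cast<int>(std::strtol(pos, &end, 10));
  return end;
}
```

Passing the separator set as an argument is what allows several near-identical skip functions to collapse into one.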
The old implementation has a single parser for many different
formats. With the introduction of Vulkan this would lead
to another parser in the same function. This patch
separates the log parsing using a visitor pattern so the
log parsing can be configured per GPU backend or even
per driver.
With Vulkan we manage the compiler ourselves, so the parsing
will become more straightforward. The OpenGL part depends
on many factors (OS, driver) and perhaps even the GPU.
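The separation described above can be sketched as an abstract parser interface with one implementation per log format. Everything here is an assumption for illustration: the class names, the `LogItem` fields and the two line formats are made up and are not taken from any real driver or from the patch.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <sstream>
#include <string>

struct LogItem {
  int row = -1;
  bool is_error = false;
};

// Each backend or driver provides its own implementation.
class LogParser {
 public:
  virtual ~LogParser() = default;
  virtual const char *parse_line(const char *line, LogItem &item) = 0;
};

inline const char *skip_chars(const char *pos, const char *chars)
{
  while (*pos != '\0' && std::strchr(chars, *pos) != nullptr) {
    pos++;
  }
  return pos;
}

// Shared body: "<source><open><row><close...>error|warning ...".
inline const char *parse_with(const char *line, LogItem &item, const char *open, const char *close)
{
  char *pos = nullptr;
  (void)std::strtol(line, &pos, 10);  // Skip the source index.
  const char *cursor = skip_chars(pos, open);
  item.row = static_cast<int>(std::strtol(cursor, &pos, 10));
  cursor = skip_chars(pos, close);
  item.is_error = std::strncmp(cursor, "error", 5) == 0;
  return cursor;
}

// Made-up format: "0:12: error: message".
class ColonStyleParser : public LogParser {
 public:
  const char *parse_line(const char *line, LogItem &item) override
  {
    return parse_with(line, item, ":", ": ");
  }
};

// Made-up format: "0(12) : error: message".
class ParenStyleParser : public LogParser {
 public:
  const char *parse_line(const char *line, LogItem &item) override
  {
    return parse_with(line, item, "(", ") :");
  }
};

// Generic log walker: it knows nothing about the line format.
inline int count_errors(const std::string &log, LogParser &parser)
{
  std::istringstream stream(log);
  std::string line;
  int errors = 0;
  while (std::getline(stream, line)) {
    LogItem item;
    parser.parse_line(line.c_str(), item);
    errors += item.is_error ? 1 : 0;
  }
  return errors;
}
```

Adding a backend then means adding a parser class, without touching the code that walks the log.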
One drawback to trying to predict the number of threads that will be
used in the `task_graph` is that we are only sure of the number when the
threads are running.
Using `BLI_task_parallel_range` allows the driver to
choose the best thread distribution through `parallel_reduce`.
The benefit is most evident on hardware with fewer cores.
This is the result on a 4-core laptop:
||before:|after:
|---|---|---|
|large_mesh_editing:|Average: 5.203638 FPS|Average: 5.398925 FPS
||rdata 15ms iter 43ms (frame 193ms)|rdata 14ms iter 36ms (frame 187ms)
Differential Revision: https://developer.blender.org/D11558
This is an adaptation of {D11488}.
A disadvantage of manually setting the iter ranges per thread is that
we don't know how many threads are running in the background and so we
don't know how to best distribute the ranges.
To solve this limitation we can use `parallel_reduce` and thus let the
driver choose the best distribution of ranges among the threads.
This proved to be especially beneficial for computers with few cores.
**Benchmarking:**
Here's the result on a 4-core laptop:
||master:|PATCH:
|---|---|---|
|large_mesh_editing:|Average: 5.203638 FPS|Average: 5.398925 FPS
||rdata 15ms iter 43ms (frame 193ms)|rdata 14ms iter 36ms (frame 187ms)
Here's the result on an 8-core PC:
||master:|PATCH:
|---|---|---|
|large_mesh_editing:|Average: 15.267482 FPS|Average: 15.906881 FPS
||rdata 9ms iter 28ms (frame 65ms)|rdata 9ms iter 25ms (frame 63ms)
|large_mesh_editing_ledge: |Average: 15.145966 FPS|Average: 15.520474 FPS
||rdata 9ms iter 29ms (frame 65ms)|rdata 9ms iter 25ms (frame 64ms)
|looptris_test:|Average: 4.001917 FPS|Average: 4.061105 FPS
||rdata 12ms iter 90ms (frame 236ms)|rdata 12ms iter 87ms (frame 230ms)
|subdiv_mesh_cage_and_final:|Average: 1.917769 FPS|Average: 1.971790 FPS
||rdata 7ms iter 37ms (frame 261ms)|rdata 7ms iter 31ms (frame 258ms)
||rdata 7ms iter 38ms (frame 252ms)|rdata 7ms iter 33ms (frame 249ms)
|subdiv_mesh_final_only:|Average: 6.387240 FPS|Average: 6.591251 FPS
||rdata 3ms iter 25ms (frame 151ms)|rdata 3ms iter 16ms (frame 145ms)
|subdiv_mesh_final_only_ledge:|Average: 6.247393 FPS|Average: 6.596024 FPS
||rdata 3ms iter 26ms (frame 158ms)|rdata 3ms iter 16ms (frame 148ms)
**Notes:**
- The improvement can only be noticed if all extracts are multithreaded.
- This patch touches different areas of the code, so it can be split into another patch if the idea is accepted.
These screenshots show how threads behave on a quad-core CPU:
Master:
{F10164664}
Patch:
{F10164666}
Differential Revision: https://developer.blender.org/D11558
The current index builder is designed to be used in a single thread.
This makes all index buffer extractions single threaded.
This patch adds a thread safe solution enabling multithreaded
building of index buffers.
To reduce locking, the solution provides a task/thread local
index buffer builder (called sub builder).
When a thread is finished, this thread local index buffer builder
can be joined with the initial index buffer builder.
`GPU_indexbuf_subbuilder_init`: Initializes a sub builder. The
index list is shared between the parent and sub builder, but the
counters are localized, ensuring that updating counters does
not need any locking.
`GPU_indexbuf_subbuilder_finish`: Merges the information of the
sub builder back to the parent builder. Needs to be invoked outside
the worker thread, or when sure that all worker threads have
finished. Internally the function is not thread safe.
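The init/finish pairing can be sketched as follows. `IndexBuilder` is a simplified, hypothetical stand-in for the real builder struct, and the example runs sequentially; in real use each sub builder would belong to one worker and write only to its own slots.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for an index buffer builder (hypothetical fields).
struct IndexBuilder {
  uint32_t *data = nullptr;         // Index storage, shared with sub builders.
  uint32_t index_len = 0;           // Highest written slot + 1.
  uint32_t index_min = UINT32_MAX;  // Smallest index value written.
  uint32_t index_max = 0;           // Largest index value written.
};

// Copy the parent: the storage pointer is shared, the counters are private.
inline IndexBuilder subbuilder_init(const IndexBuilder &parent)
{
  return parent;
}

// Write one index to a fixed slot. Only this builder's counters are touched,
// so no locking is needed as long as workers write to disjoint slots.
inline void set_point(IndexBuilder &builder, uint32_t slot, uint32_t vert)
{
  builder.data[slot] = vert;
  builder.index_len = std::max(builder.index_len, slot + 1);
  builder.index_min = std::min(builder.index_min, vert);
  builder.index_max = std::max(builder.index_max, vert);
}

// Merge the counters back. Not thread safe: call it from a single thread
// once the worker that owned `sub` has finished.
inline void subbuilder_finish(IndexBuilder &parent, const IndexBuilder &sub)
{
  parent.index_len = std::max(parent.index_len, sub.index_len);
  parent.index_min = std::min(parent.index_min, sub.index_min);
  parent.index_max = std::max(parent.index_max, sub.index_max);
}
```

Because only the counters are duplicated, the merge is a handful of min/max operations per task and the index data itself is never copied.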
For testing purposes the extract_points extractor has been migrated to
the new API. For this, changes to the mesh extractor were needed.
* When creating tasks, the task number of the current task is stored in
ExtractTaskData, along with the total number of tasks.
* Adding two functions in `MeshExtract`.
** `task_init` will initialize the task specific userdata.
** `task_finish` should merge the task specific userdata back.
* Adding a task_id parameter to the iteration functions so they can
access the correct task data without any need for locking.
There is no noticeable change in end user performance.
Reviewed By: mano-wii
Differential Revision: https://developer.blender.org/D11499
Moving the bounds code to the builder can be useful
for future optimizations like multithreaded building.
Reviewed By: fclem, jbakker
Differential Revision: https://developer.blender.org/D11455
When projecting into screen space, the Z value isn't always needed.
Add 2D projection functions and rename the existing ones to avoid accidents
happening again.
- Add GPU_matrix_project_2fv
- Add ED_view3d_project_v2
- Rename ED_view3d_project to ED_view3d_project_v3
- Use the 2D versions of these functions when the Z value isn't used.
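The naming split can be illustrated with a pair of projection helpers where the suffix states how many components are produced. This is a sketch under assumed conventions (column-major matrix, window coordinates from normalized device coordinates); `project_v2` and `project_v3` here are hypothetical and not the `GPU_matrix_project_*` or `ED_view3d_project_*` implementations.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<std::array<float, 4>, 4>;  // Column-major: m[column][row].

// Multiply one row of the matrix with the homogeneous point (co, 1).
inline float transform_row(const Mat4 &m, int row, const std::array<float, 3> &co)
{
  return m[0][row] * co[0] + m[1][row] * co[1] + m[2][row] * co[2] + m[3][row];
}

// 2D projection: only X and Y are computed, Z is skipped entirely.
inline std::array<float, 2> project_v2(const std::array<float, 3> &co,
                                       const Mat4 &persmat,
                                       int win_w,
                                       int win_h)
{
  const float w = transform_row(persmat, 3, co);
  const float x = transform_row(persmat, 0, co) / w;
  const float y = transform_row(persmat, 1, co) / w;
  return {(x * 0.5f + 0.5f) * win_w, (y * 0.5f + 0.5f) * win_h};
}

// 3D projection: same as above plus a depth value in the 0..1 range.
inline std::array<float, 3> project_v3(const std::array<float, 3> &co,
                                       const Mat4 &persmat,
                                       int win_w,
                                       int win_h)
{
  const std::array<float, 2> xy = project_v2(co, persmat, win_w, win_h);
  const float w = transform_row(persmat, 3, co);
  const float z = transform_row(persmat, 2, co) / w;
  return {xy[0], xy[1], z * 0.5f + 0.5f};
}
```

With the component count in the name and in the return type, a caller that only needs screen coordinates cannot accidentally depend on an unused Z.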
This patch will use compute shaders to create the VBO for hair.
The previous implementation uses transform feedback.
Timings before: between 0.000069s and 0.000362s.
Timings after: between 0.000032s and 0.000092s.
The speedup isn't noticeable by end-users. The patch is used to test
the new compute shader pipeline and integrate it with the draw
manager, allowing EEVEE, Workbench and other draw engines to
use compute shaders with the introduction of `DRW_shgroup_call_compute`
and `DRW_shgroup_vertex_buffer`.
Future improvements are possible by generating the index buffer
of hair directly on the GPU.
NOTE: Compute shaders aren't supported on Apple platforms, which still use
the transform feedback workaround.
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D11057
This reverts commit 8f9599d17e.
Mac seems to have an error with this change.
```
ERROR: /Users/blender/git/blender-vdev/blender.git/source/blender/draw/intern/draw_hair.c:115:44: error: use of undeclared identifier 'shader_src'
ERROR: /Users/blender/git/blender-vdev/blender.git/source/blender/draw/intern/draw_hair.c:123:13: error: use of undeclared identifier 'shader_src'
ERROR: make[2]: *** [source/blender/draw/CMakeFiles/bf_draw.dir/intern/draw_hair.c.o] Error 1
ERROR: make[1]: *** [source/blender/draw/CMakeFiles/bf_draw.dir/all] Error 2
ERROR: make: *** [all] Error 2
```
This patch will use compute shaders to create the VBO for hair.
The previous implementation uses transform feedback.
Timings master (transform feedback with GPU_USAGE_STATIC): between 0.000069s and 0.000362s
Timings transform feedback with GPU_USAGE_DEVICE_ONLY: between 0.000057s and 0.000122s
Timings compute shader: between 0.000032s and 0.000092s
Future improvements:
* Generate the hair index buffer using compute shaders: currently done single threaded on the CPU, easy to add as a compute shader.
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D11057
With the compute pipeline, calculations can be offloaded to the GPU.
This patch only adds the framework for compute, so there are no changes for users at
this moment.
NOTE: As this is an OpenGL 4.3 feature it must always have a fallback.
Use `GPU_compute_shader_support` to check if compute pipeline can be used.
Check `gpu_shader_compute*` test cases for usage.
This patch also adds support for shader storage buffer objects and device only
vertex/index buffers.
An alternative that had been discussed was adding this to the `GPUBatch`. This
was eventually not chosen as it would lead to more code when used as part of a
shading group. The idea is that we add an `eDRWCommandType` in the near
future.
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D10913
This patch adds wavelength node support to Eevee, similar to how
the Eevee Blackbody node works, so the result is a little off from Cycles.
Reviewed By: #eevee_viewport, fclem, brecht
Differential Revision: https://developer.blender.org/D11326
This module exposes the platform utils defined in the GPU module in C.
This will be useful for porting existing code with `bgl` to `gpu`.
Reviewed By: fclem, brecht, campbellbarton
Maniphest Tasks: T80730
Part of D11147
This module exposes the capabilities defined in the GPU module in C.
This will be useful for porting existing code in `bgl` to `gpu`.
Reviewed By: fclem, brecht, campbellbarton
Maniphest Tasks: T80730
Part of D11147