This refactor modernise the use of framebuffers.
It also touches a lot of files so breaking down changes we have:
- GPUTexture: Allow textures to be attached to more than one GPUFrameBuffer.
This allows to create and configure more FBO without the need to attach
and detach texture at drawing time.
- GPUFrameBuffer: The wrapper starts to mimic opengl a bit closer. This
allows to configure the framebuffer inside a context other than the one
that will be rendering the framebuffer. We do the actual configuration
when binding the FBO. We also Keep track of config validity and save
drawbuffers state in the FBO. We remove the different bind/unbind
functions. These make little sense now that we have separate contexts.
- DRWFrameBuffer: We replace DRW_framebuffer functions by GPU_framebuffer
ones to avoid another layer of abstraction. We move the DRW convenience
functions to GPUFramebuffer instead and even add new ones. The MACRO
GPU_framebuffer_ensure_config is pretty much all you need to create and
config a GPUFramebuffer.
- DRWTexture: Due to the removal of DRWFrameBuffer, we needed to create
functions to create textures for thoses framebuffers. Pool textures are
now using default texture parameters for the texture type asked.
- DRWManager: Make sure no framebuffer object is bound when doing cache
filling.
- GPUViewport: Add new color_only_fb and depth_only_fb along with FB API
usage update. This let draw engines render to color/depth only target
and without the need to attach/detach textures.
- WM_window: Assert when a framebuffer is bound when changing context.
This balance the fact we are not track ogl context inside GPUFramebuffer.
- Eevee, Clay, Mode engines: Update to new API. This comes with a lot of
code simplification.
This also come with some cleanups in some engine codes.
Previous approach was not clear enough and caused problems.
UBOs were taking slots and not release them after a shading group even
if this UBO was only for this Shading Group (notably the nodetree ubo,
since we now share the same GPUShader for identical trees).
So I choose to have a better defined approach:
- Standard texture and ubo calls are assured to be valid for the shgrp
they are called from.
- (new) Persistent texture and ubo calls are assured to be valid accross
shgrps unless the shader changes.
The standards calls are still valids for the next shgrp but are not assured
to be so if this new shgrp binds a new texture.
This enables some optimisations by not adding redundant texture and ubo
binds.
Calling the rendering operator seems to kill any other WM_job running, leaving
uncompiled materials into a GPU_MAT_QUEUED state. This then made the probe update
looping indefinitely (all_materials_updated remaining to false).
To fix this, we resume compilation for materials that are in this state.
Cancelling Render before all material compilation could make certain material
remain uncompiled. Fortunately, this is not allowed as of now.
This leads to less lookups to the GWNShaderInterface and less uniform upload.
We still keep a legacy path so that Builtin uniforms can still work. We might restrict this path to Builtin shader only in the future.
Instead of creating a new instancing shading group without attrib, we now have instancing calls. The benefits is that they can be culled.
They can be used in conjuction with the standard and generate calls but shader must support it (which is generally not the case).
We store a pointer to the actual count so that the number can be tweaked between redraw.
This will makes multi layer rendering more efficient.
Selection code relies on being able to set the depth functions
however passes have their own depth settings.
Add DRW_state_lock to ignore passes settings for particular flags.
This fixes occlusion queries cycling through objects under the cursor.
Refactor include:
- Removal of DRWInterface. (was useless)
- Split DRWCallHeader into a new struct DRWCallState that will be reused in the future.
- Use BLI_link_utils for APPEND/PREPEND.
- Creation of the new DRWManager struct type. This will enable us to create more than one manager in the future.
- Removal of some dead code.
User notes
----------
Compositing, rendering of multi-layers in Eevee should be fully working now.
Development notes
-----------------
Up until now we were still using the same depsgraph for rendering and viewport
evaluation. And we had to go out of our ways to be sure the depsgraphs were
updated.
Now we iterate over the (to be rendered) view layers and create a depsgraph to
each one, fully evaluated and call the render engines (Cycles, Eevee, ...) with
this viewlayer/depsgraph/evaluation context.
At this time we are not handling data persistency, Depsgraph is created from
scratch prior to rendering each frame. So I got rid of most of the partial
update calls we had during the render pipeline.
Cycles: Brecht Van Lommel did a patch to tackle some of the required Cycles
changes but this commit mark these changes as TODOs. Basically Cycles needs to
render one layer at a time.
Reviewers: sergey, brecht
Differential Revision: https://developer.blender.org/D3073
Technically the original issue is that xof/yof in render result is calculated
for drawing border render. So a simpler patch could be:
```
- rr->xof = re->disprect.xmin;
+ rr->xof = re->disprect.xmin + BLI_rcti_cent_x(&re->disprect) - (re->winx / 2);
```
However everywhere in the code we are getting border directly from re->disprect
which we may as well do here too.
Besides I'm taking this as a chance to get rid of RenderResult in the internal
loop of eevee, to help prepare the code to the upcoming rendering pipeline
changes.
A major bottleneck of current implementation is the call to create_bindings() for basically every drawcalls.
This is due to the VAO being tagged dirty when assigning a new shader to the Batch, defeating the purpose of the Batch (reuse it for drawing).
Since managing hundreds of batches in DrawManager and DrawCache seems not fun enough to me, I prefered rewritting the batches itself.
--- Batch changes ---
For this to happen I needed to change the Instancing to be part of the Batch rather than being another batch supplied at drawtime.
The Gwn_VertBuffers are copied from the batch to be instanciated and a new Gwn_VertBuffer is supplied for instancing attribs.
This mean a VAO can be generated and cached for this instancing case.
A Batch can be rendered with instancing, without instancing attribs and without the need for a new VAO using the GWN_batch_draw_range_ex with the force_instance parameter set to true.
--- Draw manager changes ---
The downside with this approach is that we must track the validity of the instanced batch (the original one). For this the only way (I could think of) is to set a callback for when the batch is getting free.
This means a bit of refactor in the DrawManager with the separation of batching and instancing Batches.
--- VAO cache ---
Each VAO is generated for a given ShaderInterface. This means we can keep it alive as long as the shader interface lives.
If a ShaderInterface is discarded, it needs to destroy every VAO associated to it. Otherwise, a new ShaderInterface with the same adress could be generated and reuse the same VAO with incorrect bindings.
The VAO cache itself is using a mix between a static array of VAO and a dynamic array if the is not enough space in the static.
Using this hybrid approach is a bit more performant than the dynamic array alone.
The array will not resize down but empty entries will be filled up again. It's unlikely we get a buffer overflow from this. Resizing could be done on next allocation if needed.
--- Results ---
Using Cached VAOs means that we are not querying each vertex attrib for each vbo for each drawcall, every redraw!
In a CPU limited test scene (10000 cubes in Clay engine) I get a reduction of CPU drawing time from ~20ms to 13ms.
The only area that is not caching VAOs is the instancing from particles (see comment DRW_shgroup_instance_batch).
We need to move the render result logic outside the render engine code.
It makes no sense for Eevee/Clay/... to have to re-implement the render resilt
creation logic. Beside the original implementation really got it wrong, by
ignoring the different render layers needed for the final render.
Finally, there is no need to re-create the logic for views. So this was also
fixed.
Note 1: This will break still if the depsgraph of the needed view layers is not
updated / created. We need to address this separately. For now if users want
to test this, just show each view layer in the viewport at least once.
Note 2: We are still getting depsgraph from scene and creating if needed.
`BKE_scene_get_depsgraph(scene, view_layer, true);` according to Sergey we need
to move the render depsgraph for the Render struct instead. I will do it
separately as well.
This removes the need of custom attribs for instancing.
Instancing works fully with dynamic batches & Gwn_VertFormat now.
This is in prevision of the VAO manager patch.
This manager allows to distribute existing batches for instancing
attributes. This reduce the number of batches creation.
Querying a batch is done with a vertex format. This format should
be static so that it's pointer never changes (because we are using
this pointer as identifier [we don't want to check the full format
that would be too slow]).
This might make the original Instance Data manager useless but it's currently used by DRW_object_engine_data_ensure().
For simplicity we choose to execute the rendering of Opengl engines in the main thread and block the interface.
This might be addressed in the future at least for video rendering.
A drawmanager wrapper (DRW_render_to_image) is called by the render pipeline to set up the Opengl state and then call the specific draw_engine->render_to_image function.
Main idea is to make specific engine types be a subclass of generic
ObjectEngineData structure.
This required following changes:
- Have extra size argument to engine data allocation function.
Not sure whether there is less error-prone way of doing this.
- Add init() callback to engine data allocation function.
Additionally, added some extra checks to Eevee's engine data getters, so we do
not silently cast lamp data to lightprobe data.
Reviewers: dfelinto, fclem
Differential Revision: https://developer.blender.org/D3027
This optimisation only works if no material in the scene require the AO pass.
For this either set the AO distance to 0 or both Cavity and Edges factors to 0.
This double the performance of scenes with very high triangle count.
This is because certain part of the engine may require a blank framebuffer to bind textures to.
This is the case when using only array textures, unsupported by DRW_framebuffer_init().
This allows a duplicator (as known as dupli parent) to be in a visible
collection so its duplicated objects are visible, however while being
invisible for the final render.
An object that is a particle emitter is also considered a duplicator.
Many thanks for the reviewers for the extense feedback.
Reviewers: sergey, campbellbarton
Differential Revision: https://developer.blender.org/D2966
I had to make Eevee draw its scene in the scene pass (before it was doing it
in the background pass). This is not ideal since reference images require
a separation between scene and background.
But it's the best way to solve it now. Clay is working fine.
This replaces dedicated flag which wasn't clean who sets it and who clears it,
and which was also trying to re-implement existing functionality in a way.
Flushing is not currently very efficient but there are ways to speed this up
a lot, but needs more investigation.
The RenderResult struct still has a listbase of RenderLayer, but that's ok
since this is strictly for rendering.
* Subversion bump (to 2.80.2)
* DNA low level doversion (renames) - only for .blend created since 2.80 started
Note: We can't use DNA_struct_elem_find or get file version in init_structDNA,
so we are manually iterating over the array of the SDNA elements instead.
Note 2: This doversion change with renames can be reverted in a few months. But
so far it's required for 2.8 files created between October 2016 and now.
Reviewers: campbellbarton, sergey
Differential Revision: https://developer.blender.org/D2927