For this we use a new shader that gets it's data from a uniform array.
Vertex shader position the vertices using these data.
Using glUniform is way faster than using imm for that matter.
Like BLF rendering, UI icons are always (as far as I know) non occluded and
displayed above everything else. They also does not overlap with texts so
they can be batched at the same time.
I've made a separate version of the geom shader that works with full
3D modelviewmat.
This commit also includes some fixup inside blf_batching_start().
This means smaller imm buffer usage.
This does not reduce the number of drawcalls.
This uses geometry shader which is slow for the GPU but given we are really
CPU bound on this case, it should not matter.
A perfect implementation would:
- Set the glyph coord in a bufferTexture and just send the glyph ID to the
GPU to read the bufferTexture.
- Use GWN_draw_primitive and draw 2*strllen triangle and just retrieve the
glyph ID and color based on gl_VertexID / 6.
- Stream fixed size buffer that the Driver can discard quickly but this is
the same as improving IMM directly.
Introduce a UI batch cache. For the moment it's only used by widgetbase so
leaving it interface_widgets.c. If it grows, it can have its own file.
Like all preset batches (batches used by UI context), vaos must be refreshed
each time a new window context is binded.
This still does 3 GWN_batch_draw in the worst cases but at least it does
not use the IMM api.
I will continue and batch the 3 calls together since we are really CPU
bound, so shader complexity does not really matters.
I cannot spot any difference on all the widgets I could test. I did not use
any unit tests so I cannot tell if there is really any defects.
This is not a complete rewrite but it adresses the top bottleneck found
after a profilling session.
This refactor modernise the use of framebuffers.
It also touches a lot of files so breaking down changes we have:
- GPUTexture: Allow textures to be attached to more than one GPUFrameBuffer.
This allows to create and configure more FBO without the need to attach
and detach texture at drawing time.
- GPUFrameBuffer: The wrapper starts to mimic opengl a bit closer. This
allows to configure the framebuffer inside a context other than the one
that will be rendering the framebuffer. We do the actual configuration
when binding the FBO. We also Keep track of config validity and save
drawbuffers state in the FBO. We remove the different bind/unbind
functions. These make little sense now that we have separate contexts.
- DRWFrameBuffer: We replace DRW_framebuffer functions by GPU_framebuffer
ones to avoid another layer of abstraction. We move the DRW convenience
functions to GPUFramebuffer instead and even add new ones. The MACRO
GPU_framebuffer_ensure_config is pretty much all you need to create and
config a GPUFramebuffer.
- DRWTexture: Due to the removal of DRWFrameBuffer, we needed to create
functions to create textures for thoses framebuffers. Pool textures are
now using default texture parameters for the texture type asked.
- DRWManager: Make sure no framebuffer object is bound when doing cache
filling.
- GPUViewport: Add new color_only_fb and depth_only_fb along with FB API
usage update. This let draw engines render to color/depth only target
and without the need to attach/detach textures.
- WM_window: Assert when a framebuffer is bound when changing context.
This balance the fact we are not track ogl context inside GPUFramebuffer.
- Eevee, Clay, Mode engines: Update to new API. This comes with a lot of
code simplification.
This also come with some cleanups in some engine codes.
This is a bit useless because gpu lamps are only used by the game engine
and it is planned to be "remove" in some way.
Doing this to clean gpu_framebuffer.c.
This includes a few modification:
- The biggest one is call glActiveTexture before doing any call to
glBindTexture for rendering purpose (uniform value depends on it).
This is also better to know what's going on when rendering UI. So if
there is missing UI elements because of this commit look for this first.
This allows us to have "less calls" to glActiveTexture (I did not
measure the final count) and less checks inside GPU_texture.
- Remove use of GL_TEXTURE0 as a uniform value in a few places.
- Be more strict and use BLI_assert for bad usage of GPU_texture functions.
- Disable filtering for integer and stencil textures (not supported by
OGL specs).
- Replace bools inside GPUTexture by a bitflag supporting more options to
identify texture types.
This module has no use now with the new DrawManager and DrawEngines and it
is using deprecated paths.
Moving gpu_shader_fullscreen_vert.glsl
to draw/modes/shaders/common_fullscreen_vert.glsl
With the API recently added to gawain, it is now possible to update the vbos linked to the batch.
So the batch does not have to be destroyed.
The optimization is more sensitive when sculpt is made on low poly meshs
This is mostly to avoid re-compilation when using undo/redo operators.
This also has the benefit to reuse the same GPUShader for multiple materials using the same nodetree configuration.
The cache stores GPUPasses that already contains the shader code and a hash to test for matches.
We use refcounts to know when a GPUPass is not used anymore.
I had to move the GPUInput list from GPUPass to GPUMaterial because it's containing references to the material nodetree and cannot be reused.
A garbage collection is hardcoded to run every 60 seconds to free every unused GPUPass.
Note that setting `glDepthFunc` isn't important,
since 2.8 branch changes this value it might seem like an error
however it's harmless in this case - so better make note of this.
This separate context allows two things:
- It allows viewports in multi-windows configuration.
- F12 render can use this context in a separate thread and do a non-blocking render.
The downside is that the context cannot be used while rendering so a request to refresh a viewport will lock the UI. This is something that will be adressed in the future.
Under the hood what does that mean:
- Not adding more mess with VAOs management in gawain.
- Doing depth only draw for operators / selection needs to be done in an offscreen buffer.
- The 3D cursor "autodis" operator is still reading the backbuffer so we need to copy the result to it.
- All FBOs needed by the drawmanager must to be created/destroyed with its context active.
- We cannot use batches created for UI in the DRW context and vice-versa. There is a clear separation of resources that enables the use of safe multi-threading.