This is the unification of all overlays into one overlay engine as described in T65347.
I went over all the code, making it more future-proof with fewer hacks and removing old / no longer relevant parts.
Goals / Achievements:
- Remove internal shader usage (only drw shaders)
- Remove viewportSize and viewportSizeInv and put them in the global UBO
- Fixed some drawing issues: missing probe option and missing Alt+B clipping in some shaders
- Remove the dependency on old (legacy) shaders (those not using the view UBO).
- Fewer shader variations (less compilation time at first load and less patching needed for Vulkan)
- Removed some geometry shaders where I could
- Remove static e_data (except shaders storage where it is OK)
- Clear the way to fix some annoying limitations (dithered transparency, background image compositing, etc.)
- Wireframe drawing now uses the same batching capabilities as workbench & eevee (indirect drawing).
- Reduced complexity, removed ~3000 lines of code in draw (also removed a lot of unused shaders in GPU).
- Post AA to avoid complexity and cost of MSAA.
Remaining issues:
- ~~Armature edits, overlay toggles, (... others?) are not refreshing viewport after AA is complete~~
- FXAA is not the best for wires, maybe investigate SMAA
- Maybe do something more temporally stable for AA.
- ~~Paint overlays are not working with AA.~~
- ~~infront objects are difficult to select.~~
- ~~the infront wires sometimes go through their solid counterpart (missing clear maybe?) (toggle overlays on-off when using infront+wireframe overlay in solid shading)~~
Note: I decided to slightly change the appearance of some objects to simplify their drawing, namely the arrow ends of empties (now hollow/wire) and the distance points of cameras/spots, which are now drawn with lines.
Reviewed By: jbakker
Differential Revision: https://developer.blender.org/D6296
BIF_gl.h included hacks like redefining glew functions and a constant.
The named constant `GLA_PIXEL_OFS` has been moved to `GPU_viewport.h`
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D5860
For offscreen rendering a high definition color buffer is needed.
Without it there are banding issues when doing multi-sampling viewport
rendering.
Reviewed By: fclem
Maniphest Tasks: T65287
Differential Revision: https://developer.blender.org/D5009
This will have multiple benefits.
TODO detail benefits (culling, more explicit, handling of clipping planes)
For now the view usage is wrapped to make changes needed more progressive.
If an image buffer is not loaded, Blender attempts to reload it (during
`BKE_image_acquire_ibuf`) over and over for each frame rendered.
When attempting this reload, `image_load_image_file` calls
`BKE_image_free_buffers` and tags the Image for the (GPU) image_free_queue
(because this runs on the rendering thread).
If the main thread decides to redraw the UI and goes through `GPU_free_unused_buffers`, they all get deleted, and if that happens before the rendering thread uses them, the result is a segfault.
If I replace the environment textures with correct ones (the file does not seem to contain them), there is no crash when rendering.
I used a list of GPUTextures from the Blender Image to increment and decrement the
reference counter correctly.
This adds very little memory and computation overhead.
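
The reference-counting idea can be sketched in isolation as below. This is only an illustration, not the code from the commit: `Texture`, `texture_create`, `texture_ref`, `texture_release` and `live_textures` are hypothetical names, and the counter is not atomic, so a real multi-threaded version would need atomics or a lock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a GPU texture; NOT Blender's GPUTexture. */
typedef struct Texture {
  int refcount; /* Number of current users (image, render thread, ...). */
} Texture;

/* Number of textures currently allocated, tracked only for illustration. */
static int live_textures = 0;

/* Create a texture owned by one user (refcount starts at 1). */
Texture *texture_create(void)
{
  Texture *tex = malloc(sizeof(*tex));
  if (tex == NULL) {
    return NULL;
  }
  tex->refcount = 1;
  live_textures++;
  return tex;
}

/* Register one more user, e.g. a render thread about to sample the texture. */
void texture_ref(Texture *tex)
{
  assert(tex->refcount > 0);
  tex->refcount++;
}

/* Drop one user. The memory is freed only when the last user lets go.
 * Returns 1 if the texture was actually freed, 0 otherwise. */
int texture_release(Texture *tex)
{
  assert(tex->refcount > 0);
  if (--tex->refcount > 0) {
    return 0;
  }
  free(tex);
  live_textures--;
  return 1;
}
```

With this scheme, a free request coming from the free queue while another user still holds a reference only decrements the counter; the texture stays valid until that user releases it too.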
BF-admins agree to remove header information that isn't useful,
to reduce noise.
- BEGIN/END license blocks
Developers should add non license comments as separate comment blocks.
No need for separator text.
- Contributors
This is often invalid, outdated or misleading
especially when splitting files.
It's more useful to git-blame to find out who has developed the code.
See P901 for script to perform these edits.
This is in order to make the API more multithread friendly inside the
draw manager.
GPU_shader_get_uniform will only serve to query the shader interface and
not do any GL call, making it threadsafe.
For now it only prints a warning if the uniform was not queried before.
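
A minimal sketch of that idea, assuming the uniform locations are gathered once when the shader is created and only read afterwards. The names `ShaderInterface`, `interface_add` and `interface_get_uniform` are made up for this example and are not the actual GPU module API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_UNIFORMS 16

/* Hypothetical cached uniform entry: name and location gathered up front. */
typedef struct UniformEntry {
  const char *name; /* Not copied: the caller must keep the string alive. */
  int location;
} UniformEntry;

typedef struct ShaderInterface {
  UniformEntry entries[MAX_UNIFORMS];
  int len;
} ShaderInterface;

/* Fill the cache. In a real backend this would happen once, at shader
 * creation, on the thread that owns the GL context. Returns 0 if full. */
int interface_add(ShaderInterface *iface, const char *name, int location)
{
  assert(name != NULL);
  if (iface->len >= MAX_UNIFORMS) {
    return 0;
  }
  iface->entries[iface->len].name = name;
  iface->entries[iface->len].location = location;
  iface->len++;
  return 1;
}

/* Read-only lookup: no GL call and no mutation of the interface, so
 * several threads can query it at the same time once it is filled. */
int interface_get_uniform(const ShaderInterface *iface, const char *name)
{
  for (int i = 0; i < iface->len; i++) {
    if (strcmp(iface->entries[i].name, name) == 0) {
      return iface->entries[i].location;
    }
  }
  fprintf(stderr, "Warning: uniform '%s' was not queried before\n", name);
  return -1;
}
```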
Because:
- Less redundancy.
- Better suffixes.
Also a few modifications to GPU_texture_create_* to simplify the API:
- make the format explicit to the texture creation process.
- remove the component count as it's specified in the GPUTextureFormat.
Not really happy with the fix, but it works. With the new window draw method
we are no longer storing the 3D viewport in 4 buffers, by having the GPU
viewport function directly as the 3rd buffer. This means we need to draw the
action zones into it, and so we need to keep the framebuffer bound a little
longer.
For Blender 2.8 we had to be compatible with very old OpenGL versions, and
triple buffer was designed to work without offscreen rendering, by copying
the backbuffer to a texture right before swapping. This way we could
avoid redrawing unchanged regions by copying them from this texture on the
next redraws. Triple buffer used to suffer from poor performance and driver
bugs on specific cards, so alternative draw methods remained available.
Now that we require newer OpenGL, we can have just a single draw method
that draws each region into an offscreen buffer, and then draws those to
the screen. This has some advantages:
* Poor 3D view performance when using Region Overlap should be solved now,
since we can also cache overlapping regions in offscreen buffers.
* Page flip, anaglyph and interlace stereo drawing can be a little faster
by avoiding a copy to an intermediate texture.
* The new 3D view drawing already writes to an offscreen buffer, which we
can draw from directly instead of duplicating it to another buffer.
* Eventually we will be able to remove depth and stencil buffers from the
window and save memory, though at the moment there are still some tools
using it so it's not possible yet.
* This also fixes a bug with Eevee sampling not progressing with stereo
drawing in the 3D viewport.
Differential Revision: https://developer.blender.org/D3061
This refactor modernises the use of framebuffers.
It also touches a lot of files, so here is a breakdown of the changes:
- GPUTexture: Allow textures to be attached to more than one GPUFrameBuffer.
This allows creating and configuring more FBOs without the need to attach
and detach textures at drawing time.
- GPUFrameBuffer: The wrapper starts to mimic OpenGL a bit more closely. This
allows configuring the framebuffer inside a context other than the one
that will be rendering the framebuffer. We do the actual configuration
when binding the FBO. We also keep track of config validity and save
drawbuffers state in the FBO. We remove the different bind/unbind
functions. These make little sense now that we have separate contexts.
- DRWFrameBuffer: We replace DRW_framebuffer functions by GPU_framebuffer
ones to avoid another layer of abstraction. We move the DRW convenience
functions to GPUFramebuffer instead and even add new ones. The MACRO
GPU_framebuffer_ensure_config is pretty much all you need to create and
config a GPUFramebuffer.
- DRWTexture: Due to the removal of DRWFrameBuffer, we needed to create
functions to create textures for those framebuffers. Pool textures are
now using default texture parameters for the texture type requested.
- DRWManager: Make sure no framebuffer object is bound when doing cache
filling.
- GPUViewport: Add new color_only_fb and depth_only_fb along with FB API
usage update. This lets draw engines render to color-only/depth-only targets
without the need to attach/detach textures.
- WM_window: Assert when a framebuffer is bound when changing context.
This balances the fact that we do not track the OpenGL context inside GPUFramebuffer.
- Eevee, Clay, Mode engines: Update to new API. This comes with a lot of
code simplification.
This also comes with some cleanups in some engine code.
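
The "configure anywhere, apply on bind" behaviour described for GPUFrameBuffer can be sketched as follows. This is a simplified model under assumptions: `FrameBuffer`, `framebuffer_attach` and `framebuffer_bind` are hypothetical names, texture ids are plain integers, and the actual driver calls are replaced by a counter.

```c
#include <assert.h>

#define MAX_ATTACHMENTS 4

/* Hypothetical framebuffer wrapper; NOT the real GPUFrameBuffer. */
typedef struct FrameBuffer {
  int attachments[MAX_ATTACHMENTS]; /* Texture ids, 0 means "none". */
  int dirty;                        /* Config changed since last bind. */
  int apply_count;                  /* Times the config reached the driver. */
} FrameBuffer;

/* Only records the wanted attachment. No graphics API call is made,
 * so this can run in a context other than the rendering one. */
void framebuffer_attach(FrameBuffer *fb, int slot, int tex_id)
{
  assert(slot >= 0 && slot < MAX_ATTACHMENTS);
  if (fb->attachments[slot] != tex_id) {
    fb->attachments[slot] = tex_id;
    fb->dirty = 1;
  }
}

/* Called in the rendering context. The pending configuration is
 * applied here, and only if something changed since the last bind. */
void framebuffer_bind(FrameBuffer *fb)
{
  if (fb->dirty) {
    /* A real implementation would issue the attachment calls here. */
    fb->apply_count++;
    fb->dirty = 0;
  }
}
```

Binding twice in a row, or re-attaching the same texture, does not reapply the configuration in this model.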
This includes a few modifications:
- The biggest one is to call glActiveTexture before any call to
glBindTexture for rendering purposes (the uniform value depends on it).
This also makes it easier to know what's going on when rendering UI. So if
there are missing UI elements because of this commit, look for this first.
This allows us to have "less calls" to glActiveTexture (I did not
measure the final count) and less checks inside GPU_texture.
- Remove use of GL_TEXTURE0 as a uniform value in a few places.
- Be more strict and use BLI_assert for bad usage of GPU_texture functions.
- Disable filtering for integer and stencil textures (not supported by
the OpenGL spec).
- Replace bools inside GPUTexture by a bitflag supporting more options to
identify texture types.
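
As an illustration of the bools-to-bitflag change, here is a small sketch. The flag names, the `Tex` struct and `tex_can_filter` are invented for this example and do not mirror the real GPUTexture fields.

```c
#include <assert.h>

/* Hypothetical texture type flags, one bit each. */
typedef enum TexFlag {
  TEX_DEPTH = (1 << 0),
  TEX_STENCIL = (1 << 1),
  TEX_INTEGER = (1 << 2),
  TEX_ARRAY = (1 << 3),
} TexFlag;

/* Hypothetical texture: a single flag field replaces several bools. */
typedef struct Tex {
  unsigned int flag;
} Tex;

/* Filtering is only allowed when the texture is neither an integer
 * nor a stencil texture. Returns 1 if filtering can be enabled. */
int tex_can_filter(const Tex *tex)
{
  return (tex->flag & (TEX_INTEGER | TEX_STENCIL)) == 0;
}
```

A combined type such as depth + stencil is expressed as `TEX_DEPTH | TEX_STENCIL`, and one mask test covers several types at once.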
This separate context allows two things:
- It allows viewports in multi-windows configuration.
- F12 render can use this context in a separate thread and do a non-blocking render.
The downside is that the context cannot be used while rendering, so a request to refresh a viewport will lock the UI. This is something that will be addressed in the future.
What that means under the hood:
- Not adding more mess with VAOs management in gawain.
- Doing depth only draw for operators / selection needs to be done in an offscreen buffer.
- The 3D cursor "autodis" operator is still reading the backbuffer so we need to copy the result to it.
- All FBOs needed by the drawmanager must be created/destroyed with its context active.
- We cannot use batches created for UI in the DRW context and vice-versa. There is a clear separation of resources that enables the use of safe multi-threading.
This is a special memory manager that keeps memory blocks ready to send as vbo data.
Since we lose track of which memory block was used by each DRWShadingGroup, we need to redistribute them in the same order/size to avoid reallocating each frame.
This is why DRWInstanceDatas are sorted in a list for each different data size.
This gets rid of the bottleneck of allocation / free of thousands of elements every frame.
Cache time (Eevee) (test scene is default file with cube duplicated 3241 times)
pre-patch: 23ms
post-patch: 14ms
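
A rough sketch of such a manager, assuming fixed-capacity blocks kept in one list per data size and handed out in list order. `Pool`, `Block`, `pool_request`, `pool_reset`, `pool_free` and `BLOCK_INSTANCES` are hypothetical names, not the DRWInstanceData API.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_ATTR_SIZE 16   /* Largest supported number of floats per instance. */
#define BLOCK_INSTANCES 32 /* Hypothetical fixed capacity of one block. */

typedef struct Block {
  struct Block *next;
  int used;     /* Handed out during the current frame. */
  float data[]; /* attr_size * BLOCK_INSTANCES floats. */
} Block;

typedef struct Pool {
  Block *buckets[MAX_ATTR_SIZE]; /* One list per data size. */
  int alloc_count;               /* Number of real allocations so far. */
} Pool;

/* Hand out the first unused block of the right size, in list order.
 * A new block is allocated (and appended) only if none is free. */
float *pool_request(Pool *pool, int attr_size)
{
  assert(attr_size > 0 && attr_size <= MAX_ATTR_SIZE);
  Block **link = &pool->buckets[attr_size - 1];
  for (; *link != NULL; link = &(*link)->next) {
    if (!(*link)->used) {
      (*link)->used = 1;
      return (*link)->data;
    }
  }
  size_t len = (size_t)attr_size * BLOCK_INSTANCES;
  Block *block = malloc(sizeof(Block) + sizeof(float) * len);
  if (block == NULL) {
    return NULL;
  }
  block->next = NULL;
  block->used = 1;
  *link = block; /* Append at the tail to keep a stable order. */
  pool->alloc_count++;
  return block->data;
}

/* Start of a new frame: mark every block as available again. */
void pool_reset(Pool *pool)
{
  for (int i = 0; i < MAX_ATTR_SIZE; i++) {
    for (Block *b = pool->buckets[i]; b != NULL; b = b->next) {
      b->used = 0;
    }
  }
}

/* Release all blocks. */
void pool_free(Pool *pool)
{
  for (int i = 0; i < MAX_ATTR_SIZE; i++) {
    Block *b = pool->buckets[i];
    while (b != NULL) {
      Block *next = b->next;
      free(b);
      b = next;
    }
    pool->buckets[i] = NULL;
  }
}
```

If the same requests are made in the same order on the next frame, each request receives the same block as before and no allocation happens.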
You can change the number of samples in the user preferences. You do not need to restart Blender to see the effect in the new viewport.
This adds another Multisample Framebuffer and textures (so even more memory required).
It works by blitting the default_fb to the multisample_fb each time the renderer needs to render one or more "wire" passes.
It is then blitted back to the default_fb so that the rest of the pipeline works as expected.
We COULD lower the GPU memory / bandwidth usage by rendering everything to the same multisample FBO and changing the logic depending on whether MSAA is enabled, but I think it's a bit too much work for now.