Fix #114260: Compositor sometimes produces straight alpha #114305

Merged
Omar Emara merged 6 commits from OmarEmaraDev/blender:fix-114260 into main 2023-11-06 15:15:33 +01:00
Member

The compositor sometimes produces straigh alpha even though
premultiplied alpha is expected. Moreover, there is an inconsitiency
between the CPU and GPU compositors.

For the GPU compositor, this is because GPU textures sometimes store
straight alpha, while the compositor always expects premultiplied alpha,
so we need to premultiply the alpha in those cases.

For the CPU compositor, this is because the image operation didn't
premultiply the alpha of byte textures, so we need to ensure
premultiplied alpha in those cases.

There is a data loss issue in case of byte images, since the IMB module
unpremultiplies premultiplied images then the compositior premultiplies
it again. But this will be handled in a different patch since it require
some design and refactoring first.

The compositor sometimes produces straigh alpha even though premultiplied alpha is expected. Moreover, there is an inconsitiency between the CPU and GPU compositors. For the GPU compositor, this is because GPU textures sometimes store straight alpha, while the compositor always expects premultiplied alpha, so we need to premultiply the alpha in those cases. For the CPU compositor, this is because the image operation didn't premultiply the alpha of byte textures, so we need to ensure premultiplied alpha in those cases. There is a data loss issue in case of byte images, since the IMB module unpremultiplies premultiplied images then the compositior premultiplies it again. But this will be handled in a different patch since it require some design and refactoring first.
Omar Emara added 1 commit 2023-10-31 09:29:40 +01:00
0bae72783d Fix #114260: Alpha handling is different in GPU compositor
Alpha in loaded differently in the GPU compositor relative to CPU
compositors.

This is because GPU textures sometimes store straigh alpha, while the
compositor always expects premultiplied alpha, so we need to premultiply
the alpha in those cases.
Author
Member

@brecht @Sergey

If we have an 8-bit PNG image with emissive transparent color [1, 1, 1, 0] and an alpha mode of premultiplied. One would expect the final output to be [1, 1, 1, 0], but is in fact [0, 0, 0, 0] at the moment.

That's because the alpha handling will be as follows:

  1. The PNG reader will load the color directly as [1, 1, 1, 0].
  2. imb_handle_alpha will unpremultiply it to be [1, 1, 1, 0].
  3. IMB_colormanagement_imbuf_to_float_texture in imb_gpu_get_data will premultiply the alpha to get [0, 0, 0, 0].

So reading such images involves some lossy operations. Any pointers on how we should handle that?

@brecht @Sergey If we have an 8-bit PNG image with emissive transparent color `[1, 1, 1, 0]` and an alpha mode of premultiplied. One would expect the final output to be `[1, 1, 1, 0]`, but is in fact `[0, 0, 0, 0]` at the moment. That's because the alpha handling will be as follows: 1. The PNG reader will load the color directly as `[1, 1, 1, 0]`. 2. `imb_handle_alpha` will unpremultiply it to be `[1, 1, 1, 0]`. 3. `IMB_colormanagement_imbuf_to_float_texture` in `imb_gpu_get_data` will premultiply the alpha to get `[0, 0, 0, 0]`. So reading such images involves some lossy operations. Any pointers on how we should handle that?

Conversion between straight and premultiplied alpha is lossy in both directions. Both have values that the other can't represent. The compositor works with premultiplied alpha, which is the natural format for renders and EXR files. Following the current design, the lossy operation is expected.

Changing the Alpha setting on the image can be used to load the straight alpha unmodified, but it is not very clear and compositing nodes by default will not work correctly with such images.

With that in mind, I'm not sure I understand the question. Are you asking if we should improve support for loading straight alpha without data loss in the compositor, both for the CPU and GPU?

Conversion between straight and premultiplied alpha is lossy in both directions. Both have values that the other can't represent. The compositor works with premultiplied alpha, which is the natural format for renders and EXR files. Following the current design, the lossy operation is expected. Changing the Alpha setting on the image can be used to load the straight alpha unmodified, but it is not very clear and compositing nodes by default will not work correctly with such images. With that in mind, I'm not sure I understand the question. Are you asking if we should improve support for loading straight alpha without data loss in the compositor, both for the CPU and GPU?
Author
Member

With that in mind, I'm not sure I understand the question. Are you asking if we should improve support for loading straight alpha without data loss in the compositor, both for the CPU and GPU?

Yes, but the problem I described above is about premultiplied alpha, not straight one. I realize that the conversion is an intrinsically lossy operation, my point is that the conversion is redundant, and thus the loss can be avoided.

In my example, the PNG image already contains premultiplied alpha, which the user indicated by choosing the premultiplied alpha mode. And since the compositor needs premultiplied alpha anyways, no conversion is needed here. Yet, we do two opposite operations to get the same data, except with data loss.

> With that in mind, I'm not sure I understand the question. Are you asking if we should improve support for loading straight alpha without data loss in the compositor, both for the CPU and GPU? Yes, but the problem I described above is about premultiplied alpha, not straight one. I realize that the conversion is an intrinsically lossy operation, my point is that the conversion is redundant, and thus the loss can be avoided. In my example, the PNG image already contains premultiplied alpha, which the user indicated by choosing the premultiplied alpha mode. And since the compositor needs premultiplied alpha anyways, no conversion is needed here. Yet, we do two opposite operations to get the same data, except with data loss.

Ah yes. The issue is that by all byte buffers in ImBuf are straight alpha. Ideally we would support both premul and straight alpha images in these byte buffers, but that requires many changes throughout the code.

I can't immediately think of a good way to bypass that byte buffer.

Ah yes. The issue is that by all byte buffers in `ImBuf` are straight alpha. Ideally we would support both premul and straight alpha images in these byte buffers, but that requires many changes throughout the code. I can't immediately think of a good way to bypass that byte buffer.

To me it seems to be two separate topics. One of them being aligning the GPU compositor to the CPU within the current limitations of the 8bit image storage in Blender. And this is what the path is about, unless I am missing something? It marked as WIP, so not sure if it is ready for code/functional review?

The other one is the 8bit and alpha handling in Blender. Which is a very valid topic, but needs more clear design and will be a much bigger change in the code side.

To me it seems to be two separate topics. One of them being aligning the GPU compositor to the CPU within the current limitations of the 8bit image storage in Blender. And this is what the path is about, unless I am missing something? It marked as WIP, so not sure if it is ready for code/functional review? The other one is the 8bit and alpha handling in Blender. Which is a very valid topic, but needs more clear design and will be a much bigger change in the code side.
Author
Member

@Sergey The two topics are related in that the GPU texture generation from images causes the aforementioned data loss, while the same data loss doesn't happen for the CPU compositor, considering it doesn't use that GPU code path. I still have this as a WIP because I am still investigating why the CPU compositor produces results with no data loss, and I suspect that's because its output is wrong, and a fix would produce the same data loss. More on this when I finish investigating.

@Sergey The two topics are related in that the GPU texture generation from images causes the aforementioned data loss, while the same data loss doesn't happen for the CPU compositor, considering it doesn't use that GPU code path. I still have this as a WIP because I am still investigating why the CPU compositor produces results with no data loss, and I suspect that's because its output is wrong, and a fix would produce the same data loss. More on this when I finish investigating.

@OmarEmaraDev COM_ImageOperation.cc, sample_image_at_location. It seems the the CPU compositor outputs image as-is, without associating the alpha channel sigh :'(

@OmarEmaraDev `COM_ImageOperation.cc`, `sample_image_at_location`. It seems the the CPU compositor outputs image as-is, without associating the alpha channel *sigh* :'(
Author
Member

I guess that confirms my suspension. :)
So I will probably have to also update the CPU compositor to do the same thing as this patch.
Then we can handle the data loss separately.

I guess that confirms my suspension. :) So I will probably have to also update the CPU compositor to do the same thing as this patch. Then we can handle the data loss separately.
Omar Emara added 2 commits 2023-11-01 14:38:53 +01:00
Omar Emara changed title from WIP: Fix #114260: Alpha handling is different in GPU compositor to Fix #114260: Compositor sometimes produces straight alpha 2023-11-01 14:40:04 +01:00
Omar Emara requested review from Sergey Sharybin 2023-11-01 14:57:15 +01:00
Omar Emara added the
Module
VFX & Video
Interest
Compositing
labels 2023-11-01 14:57:26 +01:00
Brecht Van Lommel requested changes 2023-11-01 17:46:08 +01:00
Brecht Van Lommel left a comment
Owner

One change to make is to support the Channel Packed alpha case. For this case there is not meant to be any premul/straight conversion, it should always be skipped. Channel Packed means the alpha channel might not even contain alpha.

A benefit of supporting this is that it will allow users to get the channels unmodified, if they want to preserve the straight alpha image unmodified as they could before this change.


I also looked into why it was would store a float buffer as straight alpha on the GPU, since this was surprising to me.

It goes back to fb03f50e06, where I changed the image texture to store alpha the same as it is in the image file. This gave better texture filtering compatibility between EEVEE and game engines.

We could change straight alpha float images to be stored as premultiplied, which would be another way of solving #114260. However I think the code that was added in this PR to check how the texture was stored and converting as appropriate is needed anyway for byte buffers, and making it able to handle both cases seems good. So I would not make further changes for that.

One change to make is to support the Channel Packed alpha case. For this case there is not meant to be any premul/straight conversion, it should always be skipped. Channel Packed means the alpha channel might not even contain alpha. A benefit of supporting this is that it will allow users to get the channels unmodified, if they want to preserve the straight alpha image unmodified as they could before this change. --- I also looked into why it was would store a float buffer as straight alpha on the GPU, since this was surprising to me. It goes back to fb03f50e069d66c99391e4796e1b9eaa2b4cc133, where I changed the image texture to store alpha the same as it is in the image file. This gave better texture filtering compatibility between EEVEE and game engines. We could change straight alpha float images to be stored as premultiplied, which would be another way of solving #114260. However I think the code that was added in this PR to check how the texture was stored and converting as appropriate is needed anyway for byte buffers, and making it able to handle both cases seems good. So I would not make further changes for that.
Omar Emara added 2 commits 2023-11-02 13:46:44 +01:00
Sergey Sharybin approved these changes 2023-11-02 15:24:27 +01:00
Sergey Sharybin left a comment
Owner

On a code side it seems to do the proper things now. I've also verified the behavior of the report and it is indeed fixed.

The unfortunate aspect is that this is potentially quiet intrusive change. For example, even our regression tests renders differently with this change and need to be updated.

I don't think it is a stopper for us moving forward with the changes and fixes, but something that we need to clearly communicate. Ideally, also avoid breaking things too often and break this and handle tiled removal in the same release :)

On a code side it seems to do the proper things now. I've also verified the behavior of the report and it is indeed fixed. The unfortunate aspect is that this is potentially quiet intrusive change. For example, even our regression tests renders differently with this change and need to be updated. I don't think it is a stopper for us moving forward with the changes and fixes, but something that we need to clearly communicate. Ideally, also avoid breaking things too often and break this and handle tiled removal in the same release :)
Brecht Van Lommel approved these changes 2023-11-02 15:34:48 +01:00
Omar Emara added 1 commit 2023-11-02 16:19:20 +01:00
Author
Member

@brecht @Sergey There are still inconsistencies between CPU and GPU compositors for sRGB 8-bit premultiplied alpha mode images.

As far as I can tell, the difference is due to the order of premultiply and linear space conversion:

  • For GPU, such type of images is stored in GPU_SRGB8_A8 textures, so premultiplication happen in the host, then linear space conversion happen in the shader.
  • For CPU, linear space conversion happen first, then premultiplication.

So the order is reversed for those cases. 😞


We can't opt to do it the opposite way in the CPU compositor, because the GPU compositor only does this for this specific type of images, while other images follow the linear conversion first order. Unless we opt to do some rather complicated conditionals everytime we do premultiplication and linear space conversion.

@brecht @Sergey There are still inconsistencies between CPU and GPU compositors for sRGB 8-bit premultiplied alpha mode images. As far as I can tell, the difference is due to the order of premultiply and linear space conversion: - For GPU, such type of images is stored in `GPU_SRGB8_A8` textures, so premultiplication happen in the host, then linear space conversion happen in the shader. - For CPU, linear space conversion happen first, then premultiplication. So the order is reversed for those cases. :disappointed: --- We can't opt to do it the opposite way in the CPU compositor, because the GPU compositor only does this for this specific type of images, while other images follow the linear conversion first order. Unless we opt to do some rather complicated conditionals everytime we do premultiplication and linear space conversion.

Ah yes. I remember there being another difference with sRGB texture interpolation, where NVIDIA and AMD do it differently. One does color space conversion before interpolation, and the other after. Though maybe the compositor only reads single pixels without interpolation and is not affected by this.

Another issue is limit texture size in the preferences, which probably should not affect the compositor.

I was working on refactoring image GPU storage, and as part of that planned to make it possible to have distinct GPU textures for 3D rendering and for full precision. The latter would be used for example for image editor display, and could be used for compositing too. Maybe that's the proper solution here? I'm not sure yet when I will have time to complete that.
https://projects.blender.org/brecht/blender/src/branch/unify-tex/source/blender/imbuf/IMB_imbuf_types.h#L203

Ah yes. I remember there being another difference with sRGB texture interpolation, where NVIDIA and AMD do it differently. One does color space conversion before interpolation, and the other after. Though maybe the compositor only reads single pixels without interpolation and is not affected by this. Another issue is limit texture size in the preferences, which probably should not affect the compositor. I was working on refactoring image GPU storage, and as part of that planned to make it possible to have distinct GPU textures for 3D rendering and for full precision. The latter would be used for example for image editor display, and could be used for compositing too. Maybe that's the proper solution here? I'm not sure yet when I will have time to complete that. https://projects.blender.org/brecht/blender/src/branch/unify-tex/source/blender/imbuf/IMB_imbuf_types.h#L203
Author
Member

@brecht I am actually not talking about the order of space conversion and interpolation here. The problem is in the order of "Alpha Premultiplication" and "Linear Space Conversion". In all cases on CPU and GPU, linear space conversion happen first, then alpha premultiplication, as you suggested.
Except when the image is an 8-bit sRGB premultiplied image, in that case, we use GPU_SRGB8_A8 textures and rely on the GPU drivers to do the sRGB to linear conversion when doing the texel fetch in shaders, so we only do the alpha premultiplication ourselves in the host code, so effectively, the order is reversed and alpha premultiplication now happen first.

This change of order can produce significant differences, for instance, using the same file from the report, this is the difference between GPU and CPU:

20231102-203823.png

20231102-203729.png

@brecht I am actually not talking about the order of space conversion and interpolation here. The problem is in the order of "**Alpha Premultiplication**" and "**Linear Space Conversion**". In all cases on CPU and GPU, linear space conversion happen first, then alpha premultiplication, as you suggested. Except when the image is an 8-bit sRGB premultiplied image, in that case, we use `GPU_SRGB8_A8` textures and rely on the GPU drivers to do the sRGB to linear conversion when doing the texel fetch in shaders, so we only do the alpha premultiplication ourselves in the host code, so effectively, the order is reversed and alpha premultiplication now happen first. This change of order can produce significant differences, for instance, using the same file from the report, this is the difference between GPU and CPU: ![20231102-203823.png](/attachments/4ade04a5-603b-4145-879e-1449bebd964f) ![20231102-203729.png](/attachments/7b3f6597-e052-4c9a-bf98-2ec95437ed70)
Author
Member

Apparently, this difference issue also materializes outside of the compositor. Because EEVEE and Cycles both treat 8-bit sRGB premultiplied images similarly. While the image editor treats the images like the CPU compositor.

So the CPU compositor, image editor, and actual raw data represent the image one way; while the GPU compositor, EEVEE, and Cycles represent the image another way. As demonstrated by the screenshot with shoes Image Editor vs EEVEE:

20231103-134329.png

Apparently, this difference issue also materializes outside of the compositor. Because EEVEE and Cycles both treat 8-bit sRGB premultiplied images similarly. While the image editor treats the images like the CPU compositor. So the CPU compositor, image editor, and actual raw data represent the image one way; while the GPU compositor, EEVEE, and Cycles represent the image another way. As demonstrated by the screenshot with shoes Image Editor vs EEVEE: ![20231103-134329.png](/attachments/a6027d39-b166-438a-9213-a9385524649f)

There are definitely inconsistencies in the way how images are displayed (viewport, image editor, backdrop, etc). I think to keep things simple and achievable and reviewable we can focus on making it so that saving file from CPU and GPU compositor will lead to the similar result (kinda ignore the fact that different areas of Blender might display image differently, and look into the drawing aspects separately from compositor).

There are definitely inconsistencies in the way how images are displayed (viewport, image editor, backdrop, etc). I think to keep things simple and achievable and reviewable we can focus on making it so that saving file from CPU and GPU compositor will lead to the similar result (kinda ignore the fact that different areas of Blender might display image differently, and look into the drawing aspects separately from compositor).
Author
Member

This is actually a different in the actual data and saved files, not the display. I just noted the image editor for perspective on how other areas of Blender do this same thing. But I understand you don't want this pull request to drag on, so I will merge it and we can handle the issue later.

This is actually a different in the actual data and saved files, not the display. I just noted the image editor for perspective on how other areas of Blender do this same thing. But I understand you don't want this pull request to drag on, so I will merge it and we can handle the issue later.
Omar Emara merged commit 5258b17ef6 into main 2023-11-06 15:15:33 +01:00
Omar Emara deleted branch fix-114260 2023-11-06 15:15:34 +01:00
Sign in to join this conversation.
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset Browser
Interest
Asset Browser Project Overview
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: blender/blender#114305
No description provided.