Bug Reports: Floating point precision bug handling #60

Iliya Katushenock · 2024-05-25T16:57:01+02:00

Iliya Katushenock commented

2024-05-25 16:57:01 +02:00

This PR adds short answers for floating point value precision reports to close them and a page with explanation about this.


Iliya Katushenock added 1 commit 2024-05-25 16:57:03 +02:00

init 49ac1909f8

Brecht Van Lommel requested review from Philipp Oeser 2024-05-27 11:12:25 +02:00

Brecht Van Lommel requested review from Pratik Borhade 2024-05-27 11:12:25 +02:00

Philipp Oeser commented

2024-05-27 13:51:31 +02:00

Thx for this, initial thoughts:

- For the question of whether all occurrences of such imprecisions are to be dropped as non-bugs, I would like input from the `#User Interface` module tbh.
  -- (even if we treat such issues as non-bugs for now, it might make sense to review the way we deal with those in some cases -- even the ones provided as example reports here).
- I do think we have some inconsistencies wrt. epsilons (#109359 comes to mind).
- Python: We could provide some information on handling epsilons there (e.g. `sys.float_info.epsilon`)?
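
As a minimal sketch of what such Python guidance could look like (an illustration only, not text from the PR; the values `0.1`, `0.2` and `0.3` are arbitrary examples), comparing floats with a tolerance instead of `==`:

```python
import math
import sys

# Machine epsilon for Python floats (64-bit doubles): the gap between 1.0
# and the next representable value.
eps = sys.float_info.epsilon

a = 0.1 + 0.2
b = 0.3

print(a == b)              # False: the two values differ in the last bit
print(abs(a - b) <= eps)   # True: absolute tolerance, only sensible near magnitude 1
print(math.isclose(a, b))  # True: relative tolerance (default rel_tol=1e-09)
```

`math.isclose` is usually the safer choice, because a fixed absolute epsilon stops being meaningful once the values are much larger or much smaller than 1.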

CC @ideasman42
CC @Harley
CC @JulianEisel

If there is consensus that all occurrences of such imprecisions are to be dropped as non-bugs, then we need to work on the wording of this PR (but I would prefer to get consensus on the above first). Examples:

- "Just to do not mess UI by all this, usually not really matter, details" -- not really sure what this means?
- "Fix of floating point error are not trivial" -- complexity of a possible solution is not a reason to reject a report on something we really consider an error; we should have an answer on why we don't do things (and not ask ourselves questions here)
- "Most often the error is caused by the user" -- prefer other wording


Alaska commented

2024-05-27 14:04:48 +02:00

> If there is consensus that all occurrences of such imprecisions are to be dropped as non-bugs

There have been a few floating point precision issues that have been worked on/"fixed". For example, precision issues with Musgrave texture in some situations.

So based on previous reports and how they were handled, it seems we can't close all floating point precision issues without first consulting a developer. But there are still some reports that can be closed "safely" as long as the user is informed of what's causing the issue.


Gerstmann Bradley commented

2024-05-27 16:20:41 +02:00

First-time contributor

Sorry to bother, but I wanted to suggest creating a "known issues" report to track all floating point precision problems.

1. This would help ensure that all relevant issues can be closed or not as duplicates.
2. It would provide a place for people to track these issues, in case the community is interested in optimizing these closed known limitations.
3. It would serve as a resource full of examples, which could help raise awareness of these issues among users, especially those without a background in computer science.

Just some potential thoughts, and I don't mind the decision either way. It's up to you guys to decide.


Campbell Barton reviewed 2024-05-28 03:43:26 +02:00

Campbell Barton left a comment

Documenting float precision issues seems reasonable, although I wonder how much floating-point behavior we should be documenting ourselves.

Some concerns with the PR as it is.

- Some of the statements are technically incorrect, although this can be fixed by re-wording.

  For example:
  - `Blender draw floating point values with fixed number of digits after dot.`

    The fixed number is configurable and we do change it based on bug reports at times, especially for small values such as weld-by-distance.
  - The statements assume 32-bit floating point. Users may use software that uses 64-bit floats, and then consider the resulting differences to be bugs unique to Blender. Also, in some cases Blender does use 64-bit floats.
  - Computers can handle other kinds of floating point with libraries such as https://www.mpfr.org - it's probably getting into unnecessary technical details to mention this.
- Some of the grammar reads strangely; there are so many minor issues that it's probably simplest to do an editing pass instead of pointing out issues individually.
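
To illustrate the 32-bit vs 64-bit point, here is a small sketch in plain Python. It assumes a hypothetical helper `to_float32` that emulates single precision through `struct`; this is not Blender API, only a way to show how the same decimal value is stored with different error at each width.

```python
import struct
from fractions import Fraction


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


x64 = 0.1              # stored as a 64-bit double
x32 = to_float32(0.1)  # the same literal stored as a 32-bit float

# Exact representation error relative to the true value 1/10.
err64 = abs(Fraction(x64) - Fraction(1, 10))
err32 = abs(Fraction(x32) - Fraction(1, 10))

print(f"{x64:.20f}")   # digits beyond roughly the 17th are representation error
print(f"{x32:.20f}")   # digits beyond roughly the 8th are representation error
print(err32 > err64)   # True: single precision is further from 1/10
```

Neither width stores `0.1` exactly, so a value that looks "wrong" in one application and "right" in another may simply be displayed at a different precision.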

Documenting float precision issues seems reasonable, although I wonder how much floating-point behavior we should be documenting ourselves. Some concerns with the PR as it is. - Some of the statements are technically incorrect although can be fixed by re-wording. For example: - `Blender draw floating point values with fixed number of digits after dot.` The fixed number is configurable and we do change it based on bug reports at times, especially for small values such as weld-by-distance. - The statements assume 32bit floating point, users may use software that uses 64bit floats, then consider Blender bugs unique to Blender. Also, in some cases Blender does use 64 bit floats. - Computers _can_ handle other kinds of floating point with libraries such as https://www.mpfr.org - it's probably getting into unnecessary technical details to mention this. - Some of the grammar reads strangely, there are so many minor issues that it's probably simplest to do an editing pass instead of pointing out issues individually.

Julian Eisel commented

2024-05-28 12:56:41 +02:00

First of all, this goes far beyond the UI. It's a general discussion on floating point errors (as they appear in rendering, nodes, tools, ...) and how we deal with them from a product, development and issue tracking perspective. It would be good to get some decisions on this topic.

On the PR itself, I too have some concerns.

- I wouldn't include an introduction to floating point errors in our developer documentation. We could link to another resource that gives a broader introduction. I'm not sure if we have any Blender specific guidance/policies that would be good to know for developers (such as when to use doubles vs floats, how many digits to show in the UI, what epsilons to use, ...), if so we should document them indeed.
- Reading the "For bug triaging" section, I'm afraid it would lead to valid reports being closed as floating point issues. Not everything that has the symptoms of floating point errors is one, it needs investigation first.
- Even issues that are indeed floating point errors might be easy to solve, again it needs investigation first. For example blender/blender#122181 could be solved by directly calculating the total delta instead of accumulating deltas of each increment (perhaps under certain conditions, like only when increment snapping is enabled). Was this investigated before the report got closed?
- Sometimes reports indicate a bigger usability problem. For example I think blender/blender#122046 is a valid papercut, it could be addressed by showing a warning for users if driver expressions contain floating point equality checks. This could be a good task for the animation module to look into, but already got lost at the triaging stage.
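
The difference between accumulating per-increment deltas and calculating the total directly can be sketched generically as follows. This is not Blender's transform code; `step` and `count` are made-up values chosen because the result is easy to check.

```python
step = 0.1   # size of one increment (hypothetical value)
count = 10   # number of increments applied

# Accumulating: every addition rounds, and the rounding errors add up.
accumulated = 0.0
for _ in range(count):
    accumulated += step

# Direct: one multiplication, so only one rounding.
direct = count * step

print(accumulated)  # 0.9999999999999999
print(direct)       # 1.0
```

The direct form is not exact in general either, but it avoids compounding one rounding error per increment.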

I too think it would be good to have a list of known issues caused by floating point inaccuracies somewhere. All three example reports listed above seem like common enough cases to run into. Good to have them "indexed" somewhere for easy referencing.

Maybe other developers have more strict opinions than me though.


Iliya Katushenock commented

2024-05-28 13:22:00 +02:00

> "Fix of floating point error are not trivial" -- complexity of a possible solution is not a reason to reject a report on something we really consider an error, we should have an answer on why we dont do things (and not ask ourselves questions here)

And

> Computers can handle other kinds of floating point with libraries such as https://www.mpfr.org - it's probably getting into unnecessary technical details to mention this.

I do not want to actually start a discussion about this. My point was that this is not just a single-line solution, and (I didn't read much of the library description) I think any solution of this kind will still not work in some corner cases, like `e + 1`.

> "Most often the error is caused by the user" -- prefer other wording

Here I was referring to reports like `Use Convex Hull of mesh as way to sort points. 300+ is not sorted due to floating point precision`.
Instead of spending a lot of time solving this one, which is probably just a case of shooting yourself in the foot, we added the `Sort Element` and `Points to Curve` nodes, which provide this function.

> There have been a few floating point precision issues that have been worked on/"fixed". For example, precision issues with Musgrave texture in some situations.

My main point here is that fixing floating point errors looks like any other kind of feedback that is not suitable for the bug report tracker.
Yes, we can spend a lot of resources, decrease performance, and add branches to check whether a value is large, just to handle this.
But at the same time we still have issues with objects at large distances, and I think in general we just do not position Blender as software that can render at coordinates of 10e^10.
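
The large-distance problem can be sketched in a few lines. This assumes 32-bit float coordinates and a hypothetical `to_float32` helper built on `struct` (not Blender API); the magnitude `1.0e10` is only an example value.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


big = to_float32(1.0e10)

# Near 1e10, neighbouring 32-bit floats are 1024 apart, so moving by one
# unit rounds straight back to the starting coordinate.
moved = to_float32(big + 1.0)
print(moved == big)    # True: the one-unit offset is lost entirely

# Only an offset on the order of the spacing itself changes the value.
moved_far = to_float32(big + 1024.0)
print(moved_far > big) # True
```

The absolute spacing between representable values grows with magnitude, which is why precision problems become visible far from the origin.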

> This would help ensure that all relevant issues can be closed or not as duplicates.

It is possible to fix any of these issues. But are we ready to give up performance, introduce complexity, and in the end start getting reports like `3.333... infinity string instead of a decimal 3/10`?

> I wouldn't include an introduction to floating point errors in our developer documentation. We could link to another resource that gives a broader introduction. I'm not sure if we have any Blender specific guidance/policies that would be good to know for developers (such as when to use doubles vs floats, how many digits to show in the UI, what epsilons to use, ...), if so we should document them indeed.

I didn't want to describe these cases for developers. Sometimes users do not believe that this is just a floating point error.

> Reading the "For bug triaging" section, I'm afraid it would lead to valid reports being closed as floating point issues. Not everything that has the symptoms of floating point errors is one, it needs investigation first.

The initial purpose of this whole page is to describe all these things for users, not for devs.
I should probably clarify this in the header of the page.

> Even issues that are indeed floating point errors might be easy to solve, again it needs investigation first. For example blender/blender#122181 could be solved by directly calculating the total delta instead of accumulating deltas of each increment (perhaps under certain conditions, like only when increment snapping is enabled). Was this investigated before the report got closed?

As I said above, this kind of thing is heuristic, has a large cost, or is not really so bad to have (does 0.00000003 really affect anything except the UI?).

> Sometimes reports indicate a bigger usability problem. For example I think blender/blender#122046 is a valid papercut, it could be addressed by showing a warning for users if driver expressions contain floating point equality checks. This could be a good task for the animation module to look into, but already got lost at the triaging stage.

So this looks like an accidental feature request. If the UI module is ready to accept this, then yes. But thinking about it in general, this means that any UI report marked as `accidental feature request` has to be reviewed by the UI module.


Philipp Oeser requested changes 2024-05-29 09:06:23 +02:00

Philipp Oeser left a comment

I think the existing comments here already show that there is no consensus to close all precision issues as a general rule for triaging. Nothing against documenting these, but closing is not the right action in all cases.

In addition, I will reopen blender/blender#122181 & blender/blender#122046 for the respective module to decide upon.


Philipp Oeser referenced this pull request from blender/blender

2024-05-29 09:08:58 +02:00

Driver expressions only seem to work with integers #122046

Philipp Oeser referenced this pull request from blender/blender

2024-05-29 09:10:06 +02:00

Rotating in 5 degree increments returns a non-integer amount #122181

Campbell Barton closed this pull request

2024-10-07 14:05:07 +02:00

Campbell Barton commented

2024-10-07 14:05:30 +02:00

This was left open, closing based on reasons given by @lichtwerk.