Fix: operators can have invalid utf-8 strings #129209

Closed
Jacques Lucke wants to merge 1 commit from JacquesLucke/blender:operator-utf8 into blender-v4.3-release

Member

#129167 showed an issue where code failed because an operator name was unexpectedly not valid utf-8. This patch adds new checks to prevent this from happening when the operator is registered.

  • When the idname is not valid utf-8, the registration will fail with an error.
  • If the name, description or translation_context is not valid utf-8, it is fixed automatically and the registration succeeds, but a warning is issued. Failing operator registration in this case could break existing add-ons, so it is better to handle the situation more gracefully.
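
The two rules above can be sketched in plain Python. This is only an illustration of the policy, not the code from the patch: the function name, the error type, and the choice to repair bad bytes by replacing them with U+FFFD are all assumptions made for the example.

```python
import warnings


class OperatorRegistrationError(ValueError):
    """Hypothetical error type for a rejected registration."""


def check_operator_strings(idname: bytes, **other_fields: bytes) -> dict[str, str]:
    """Apply the policy above to raw byte strings and return decoded text."""
    try:
        # Rule 1: an invalid idname makes the registration fail.
        result = {"idname": idname.decode("utf-8")}
    except UnicodeDecodeError as exc:
        raise OperatorRegistrationError("idname is not valid utf-8") from exc

    for field, raw in other_fields.items():
        try:
            result[field] = raw.decode("utf-8")
        except UnicodeDecodeError:
            # Rule 2: repair and warn instead of failing.
            # Assumption: "fixed" means replacing bad bytes with U+FFFD.
            result[field] = raw.decode("utf-8", errors="replace")
            warnings.warn(
                f"{result['idname']}: {field} was not valid utf-8 and has been repaired"
            )
    return result
```

For example, `check_operator_strings(b"wm.example", name=b"ok\xe4")` succeeds with a warning, while an idname ending in a stray `\xe4` byte raises the error.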
Jacques Lucke added 1 commit 2024-10-18 11:40:38 +02:00
ensure utf8
All checks were successful
buildbot/vexp-code-patch-lint Build done.
buildbot/vexp-code-patch-linux-x86_64 Build done.
buildbot/vexp-code-patch-darwin-arm64 Build done.
buildbot/vexp-code-patch-darwin-x86_64 Build done.
buildbot/vexp-code-patch-windows-amd64 Build done.
buildbot/vexp-code-patch-coordinator Build done.
8c621709de
Author
Member

@blender-bot build
Jacques Lucke requested review from Bastien Montagne 2024-10-18 11:41:35 +02:00
Bastien Montagne approved these changes 2024-10-18 11:58:58 +02:00
Bastien Montagne left a comment
Owner

LGTM. I will add @ideasman42 as an FYI, and in case he has better ideas for solving this.

I also wonder whether other types of registerable data (like Panels, UILists...) currently have the same issue?

Bastien Montagne requested review from Campbell Barton 2024-10-18 11:59:08 +02:00
Campbell Barton requested changes 2024-10-18 12:31:03 +02:00
Campbell Barton left a comment
Owner

Is there a simple test case to validate this?

If I try to reproduce the error using the input from the report:

diff --git a/scripts/startup/bl_operators/file.py b/scripts/startup/bl_operators/file.py
index 08c2cde1146..e7e7b32f5f6 100644
--- a/scripts/startup/bl_operators/file.py
+++ b/scripts/startup/bl_operators/file.py
@@ -19,7 +19,13 @@ from bpy.app.translations import pgettext_rpt as rpt_
 
 class WM_OT_previews_batch_generate(Operator):
     """Generate selected .blend file's previews"""
-    bl_idname = "wm.previews_batch_generate"
+    bl_idname = (
+        "\xe7\x84\xb6\xe5\x90\x8e\xe5\x9c"
+        "\xa8\xe6\xa8\xa1\xe6\x80\x81\xe6"
+        "\xa8\xa1\xe5\xbc\x8f\xe4\xb8\x8b"
+        "\xe7\x9b\xb4\xe6\x8e\xa5\xe6\x9b"
+        "\xb4\xe6\x94\xb9\xe4"
+    )
     bl_label = "Batch-Generate Previews"
     bl_options = {'REGISTER'}

Blender fails to register the class with the following error:

Traceback (most recent call last):                                     
  File "/src/cmake_debug/bin/4.4/scripts/modules/bpy/utils/__init__.py", line 208, in _register_module_call                                    
    register()                                                         
    ~~~~~~~~^^                                                         
  File "/src/cmake_debug/bin/4.4/scripts/startup/bl_operators/__init__.py", line 64, in register                                               
    register_class(cls)                                                
    ~~~~~~~~~~~~~~^^^^^                                                
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc2 in position 149: invalid continuation byte  

Requesting changes because it would be good to have a reproducible test case, so it's possible to know if the same problem occurs elsewhere.

Contributor

It is really strange: if I print the translated content directly with Python's print, it is OK, but in the search function it is an invalid string.
The operator's idname is scatter5.add_psy_modal.

Author
Member

This reproduces the issue for me. I don't quite get what is special about this string yet. For Python it seems to be all fine. I can encode as utf8 and decode without problem and the before/after are equal. Maybe it's also an issue in our utf8 processing?

diff --git a/scripts/startup/bl_operators/file.py b/scripts/startup/bl_operators/file.py
index 08c2cde1146..6203646d094 100644
--- a/scripts/startup/bl_operators/file.py
+++ b/scripts/startup/bl_operators/file.py
@@ -20,7 +20,7 @@ from bpy.app.translations import pgettext_rpt as rpt_
 class WM_OT_previews_batch_generate(Operator):
     """Generate selected .blend file's previews"""
     bl_idname = "wm.previews_batch_generate"
-    bl_label = "Batch-Generate Previews"
+    bl_label = "快速散佈所選物件, 然後在模態模式下直接更改一些設定。"
     bl_options = {'REGISTER'}
 
     # -----------
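
The round-trip check mentioned above can be sketched as follows. The helper name is made up for this example. It only shows that the label is fine at the Python level; it says nothing about what happens once the string is handled as raw bytes elsewhere.

```python
def round_trips_as_utf8(text: str) -> bool:
    """Return True if encoding to utf-8 and decoding again gives back the same text."""
    return text.encode("utf-8").decode("utf-8") == text


label = "快速散佈所選物件, 然後在模態模式下直接更改一些設定。"
encoded = label.encode("utf-8")

print(round_trips_as_utf8(label))
# The byte length is larger than the character count, which matters
# for any code that measures or cuts the string in bytes.
print(len(label), len(encoded))
```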

Contributor

Yes, you are right. I think strings in Blender have a maximum length, and the hard-cut logic truncates the utf-8 string so that it ends with an incomplete character. As a result window_end is larger than full_end, because window_end is calculated from the first byte of the character. I am not sure whether this is the problem.
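
The suspected failure mode can be shown with a short Python sketch. It is not Blender's code and does not confirm the hypothesis: the function names and the 4-byte limit are chosen only for illustration. It shows how a fixed byte-length cut can leave an incomplete character at the end, and how backing up to a character boundary avoids that.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as utf-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def truncate_naive(text: str, max_bytes: int) -> bytes:
    """Cut at a fixed byte length, possibly in the middle of a character."""
    return text.encode("utf-8")[:max_bytes]


def truncate_on_char_boundary(text: str, max_bytes: int) -> bytes:
    """Cut at or before max_bytes without splitting a multi-byte character."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return data
    end = max_bytes
    # Continuation bytes look like 0b10xxxxxx. If the first excluded byte
    # is one, the cut falls inside a character, so move the cut back.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


text = "然後在"  # three characters, three bytes each
print(is_valid_utf8(truncate_naive(text, 4)))
print(is_valid_utf8(truncate_on_char_boundary(text, 4)))
```

The naive cut keeps the first byte of the second character, which is exactly the kind of dangling lead byte that makes later decoding fail.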


Committed alternate fix 1d6add574d7547cc76217040b49075bad8c7f9e7, closing.
Campbell Barton closed this pull request 2024-10-25 05:40:54 +02:00
