UI: Text Object International Case Change #106581
Reference: blender/blender#106581
Allow Text Object operator FONT_OT_case_set to correctly transform the case
of strings written in almost all scripts that differentiate letter case.
While editing a text object there is a "Text" menu that contains "To Uppercase" and "To Lowercase", which only operate on the lower ASCII characters. This PR makes them work with almost all bicameral scripts. This includes all languages using Latin, Cyrillic, Greek, Coptic, Armenian, and other alphabets. Obviously this assumes that the loaded font properly supports the language.
There are some caveats though. These are only one-to-one mappings so can't always be correct for uppercase Σ since it has two lowercase forms depending on word position. Similarly lowercase ß won't become "SS".
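To illustrate why a one-to-one mapping cannot handle ß, here is a minimal sketch of a whole-string conversion in which the output can be longer than the input. The function name, the SIZE_MAX error convention, and the ASCII-only fallback are assumptions made for illustration; none of this is code from the PR.

```c
#include <stddef.h>
#include <stdint.h>

#define SHARP_S 0x00DFu /* LATIN SMALL LETTER SHARP S. */

/* Hypothetical helper: uppercase a UTF-32 buffer, expanding sharp s to "SS".
 * Only ASCII letters are converted otherwise, to keep the sketch short.
 * Returns the number of code points written, or SIZE_MAX if `dst` is too small. */
size_t sketch_utf32_str_to_upper(const uint32_t *src,
                                 size_t src_len,
                                 uint32_t *dst,
                                 size_t dst_cap)
{
  size_t written = 0;
  for (size_t i = 0; i < src_len; i++) {
    const uint32_t c = src[i];
    const size_t needed = (c == SHARP_S) ? 2 : 1;
    if (dst_cap - written < needed) {
      return SIZE_MAX;
    }
    if (c == SHARP_S) {
      /* One input code point becomes two output code points. */
      dst[written++] = 'S';
      dst[written++] = 'S';
    }
    else if (c >= 'a' && c <= 'z') {
      dst[written++] = c - 32;
    }
    else {
      dst[written++] = c;
    }
  }
  return written;
}
```

A per-character function that returns a single code point has no way to express this expansion, which is why the PR leaves ß unchanged.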
I'd rather avoid setlocale, both because temporarily setting it could back-fire and because the exact result depends on the operating system.
Attached a patch which uses a bad-level call into Python, which isn't great - but on balance I'd prefer to use this for consistency.
NOTE: resolving bad level calls is a fairly simple project which is worth looking into - but that can be done separately from this patch.
Requested changes in reply.
2241b7c81a to 4209c4990d
Makes sense.
Looking into this further, the upper/lower mappings do not actually differ depending on locale. It is just that towupper and towlower don't work correctly unless you set a "*.utf8" locale.
So instead created BLI_utf32_upper and BLI_utf32_lower using data from these pages
The above is just data. The ICU itself is also GPL-Compatible (X license).
This PR highlights the need for a unicode library in Blender, so far we've been getting by without one but it's limiting.
One of the more important uses of unicode-case conversion is case-insensitive-search which would be nice to support.
My concern with this PR is that it's adding unicode conversions that are slow (using a binary search, unlike Python's which indexes into arrays), and inlines unicode data which isn't updated based on changes to the unicode spec.
Performance won't be an issue with this operator, but using these for case-insensitive search could be an issue.
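For comparison with a binary search, the sketch below shows a generic two-stage table lookup, where the code point itself is used as an index so each conversion costs two array reads. This is a toy covering only ASCII and part of the Cyrillic block; the table layout, names, and lazy initialization are assumptions for illustration and do not reproduce Python's actual tables.

```c
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SHIFT 7
#define BLOCK_SIZE (1u << BLOCK_SHIFT)
#define BLOCK_MASK (BLOCK_SIZE - 1u)
#define TOY_LIMIT 0x0500u /* The toy tables only cover U+0000..U+04FF. */
#define NUM_BLOCKS (TOY_LIMIT >> BLOCK_SHIFT)

/* Stage 1: block number -> row in stage 2 (row 0 means "no case change"). */
static uint8_t stage1[NUM_BLOCKS];
/* Stage 2: per-row offsets to add to reach the uppercase code point. */
static int16_t stage2[3][BLOCK_SIZE];
static bool tables_ready = false;

static void toy_tables_init(void)
{
  /* Row 1: ASCII a-z. */
  stage1[0x0000u >> BLOCK_SHIFT] = 1;
  for (uint32_t c = 'a'; c <= 'z'; c++) {
    stage2[1][c & BLOCK_MASK] = -32;
  }
  /* Row 2: Cyrillic U+0430..U+044F (offset -32) and U+0450..U+045F (offset -80). */
  stage1[0x0400u >> BLOCK_SHIFT] = 2;
  for (uint32_t c = 0x0430u; c <= 0x044Fu; c++) {
    stage2[2][c & BLOCK_MASK] = -32;
  }
  for (uint32_t c = 0x0450u; c <= 0x045Fu; c++) {
    stage2[2][c & BLOCK_MASK] = -80;
  }
  tables_ready = true;
}

/* Hypothetical name: constant-time uppercase lookup within the toy range. */
uint32_t toy_upper_indexed(uint32_t c)
{
  if (c >= TOY_LIMIT) {
    return c;
  }
  if (!tables_ready) {
    toy_tables_init();
  }
  const uint8_t row = stage1[c >> BLOCK_SHIFT];
  return (uint32_t)((int32_t)c + stage2[row][c & BLOCK_MASK]);
}
```

Blocks with no cased letters all share row 0, which is what keeps this kind of table compact. A production table would normally be generated offline from the Unicode data files rather than filled in at runtime.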
Python's unicode utilities are very close to what we need, as they support case-conversion and categories such as alpha/decimal/digit/space/printable ... information, so we could extract this into our own library, along with its script to automate updates from the unicode consortium.
The whole-string conversion also supports lowercase ß properly, noted as a TODO in this PR.
Personally I think my patch is OK, although not ideal - as it's just postponing us using a more general unicode library, listing some possible alternatives.
intern/ library.
BLI_utf32_upper / BLI_utf32_lower compared to Python's functions, a binary search may have acceptable performance, even for interactive search.
4209c4990d to 5c2c4b6735
I hadn't anticipated any uses that would require fast performance, but have updated this patch to maximize performance.
It now directly calculates upper/lower offset for the ranges where this can be done (lower Latin, parts of extended Latin, Armenian, Georgian, Enclosed letterforms, and Fullwidth letterforms).
It now also only does the binary search of the character arrays if we are in three specific ranges where direct calculation is not possible. It early exits as much as possible.
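The approach described above can be sketched roughly as follows: fixed offsets for contiguous ranges, an early exit when the code point is outside the span covered by the lookup table, and a binary search only as the last step. The function name, the struct, and the five-entry table are hypothetical; the real patch uses much larger tables and more ranges, and uint32_t stands in for char32_t to keep the sketch portable.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct CasePair {
  uint32_t lower;
  uint32_t upper;
} CasePair;

/* Hypothetical excerpt of irregular mappings, sorted by `lower`. */
static const CasePair irregular_pairs[] = {
    {0x00FFu, 0x0178u}, /* Latin small y with diaeresis. */
    {0x0131u, 0x0049u}, /* Latin small dotless i. */
    {0x017Fu, 0x0053u}, /* Latin small long s. */
    {0x0180u, 0x0243u}, /* Latin small b with stroke. */
    {0x01BFu, 0x01F7u}, /* Latin letter wynn. */
};
static const size_t irregular_count = sizeof(irregular_pairs) / sizeof(irregular_pairs[0]);

uint32_t sketch_utf32_char_to_upper(uint32_t wc)
{
  /* Ranges where uppercase is a fixed offset away. */
  if (wc < 0x80u) {
    return (wc >= 'a' && wc <= 'z') ? wc - 32 : wc;
  }
  if (wc >= 0x0561u && wc <= 0x0586u) { /* Armenian. */
    return wc - 48;
  }
  if (wc >= 0x24D0u && wc <= 0x24E9u) { /* Enclosed (circled) a - z. */
    return wc - 26;
  }
  if (wc >= 0xFF41u && wc <= 0xFF5Au) { /* Fullwidth a - z. */
    return wc - 32;
  }
  /* Early exit: outside the span covered by the table. */
  if (wc < irregular_pairs[0].lower || wc > irregular_pairs[irregular_count - 1].lower) {
    return wc;
  }
  /* Binary search for the first entry whose `lower` is not less than `wc`. */
  size_t lo = 0, hi = irregular_count;
  while (lo < hi) {
    const size_t mid = lo + (hi - lo) / 2;
    if (irregular_pairs[mid].lower < wc) {
      lo = mid + 1;
    }
    else {
      hi = mid;
    }
  }
  if (lo < irregular_count && irregular_pairs[lo].lower == wc) {
    return irregular_pairs[lo].upper;
  }
  return wc;
}
```

Most text falls into the offset ranges or exits early, so the search only runs for the comparatively rare code points inside the table's span.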
5c2c4b6735 to 6f7aeb4070
@ -174,0 +177,4 @@
* mappings so this doesn't work correctly for uppercase Σ (two lowercase forms) and lowercase ß
* won't become "SS".
*/
char32_t BLI_utf32_upper(char32_t wc);
Prefer BLI_str_utf32_char_to_upper / BLI_str_utf32_char_to_lower - which is in keeping with BLI_str_* API.
We could also add a BLI_string_utf32.h, or consider renaming BLI_string_utf8.h to BLI_string_unicode.h since it doesn't make sense to add utf32 functions to a utf8 named header.
Suggest to do this as part of a separate commit though.
@ -402,0 +420,4 @@
if (wc <= U'\x24E9' && wc >= U'\x24D0') { /* Enclosed ⓐ - ⓩ */
return wc - 26;
}
if (wc <= U'\xFF5A' && wc >= U'\xFF41') { /* Fullwidth a - z */
Avoid unicode in comments, a and z are fine here.
@ -402,0 +417,4 @@
/* Armenian & Georgian */
return wc - 48;
}
if (wc <= U'\x24E9' && wc >= U'\x24D0') { /* Enclosed ⓐ - ⓩ */
Avoid unicode comments, (a) and (z) are fine here (or use their ID as well if that helps).
6f7aeb4070 to 8456d396bc
@blender-bot build