I18n: translate new text object body #115370

Merged
Bastien Montagne merged 4 commits from pioverfour/blender:dp_translate_new_text_body into main 2023-11-27 20:44:59 +01:00
Member

Since VFont was updated to use the BLF API in 604ee2d036, it can
use the fallback font stack and display text in many scripts.

This change means the default text for font objects can now be
translated, instead of always being the English word 'Text'.
Translation will only occur if the user has enabled translation of new
data in the preferences.


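I’m not sure the memory allocations are correct. The previous code allocated and copied arrays of length 12 for a text of length 4 ("Text"). In contrast, the code in [rna_curve.cc](https://projects.blender.org/blender/blender/src/branch/main/source/blender/makesrna/intern/rna_curve.cc#L595-L599) allocates strings of size `len_bytes + sizeof(char32_t)`, while [editfont.cc](https://projects.blender.org/blender/blender/src/branch/main/source/blender/editors/curve/editfont.cc#L699-L700) directly uses `len_bytes + 4`. I mostly copied the implementation from rna_curve.cc.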

The change can be tested by creating texts for all available languages using the following script:

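import bpy
from bl_i18n_utils import settings

# Row index of the next text object; also used as a sortable name prefix.
offset = 0
for _, lang_name, lang_uid in settings.LANGUAGES:
    try:
        # Switch the UI language, then add a text object so that its
        # default body is created in that language.
        bpy.context.preferences.view.language = lang_uid
        bpy.ops.object.text_add(enter_editmode=False, align='WORLD',
                                location=(0, -offset, 0))
        bpy.context.active_object.name = f"{offset:03}.Text.{lang_name}"
        offset += 1
    except TypeError:
        # Presumably raised for languages this build does not offer; skip them.
        pass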

image

Damien Picard added the Module: User Interface, Interest: Modeling, and Interest: Translations labels 2023-11-24 18:11:55 +01:00
Damien Picard added 1 commit 2023-11-24 18:12:07 +01:00
2ece92596e I18n: translate new text object body
Damien Picard requested review from Bastien Montagne 2023-11-24 18:17:31 +01:00
Bastien Montagne requested changes 2023-11-27 11:18:27 +01:00
Bastien Montagne left a comment
Owner

The change looks good, but there are indeed several points to fix in the memory allocation.

Admittedly, the memory handling of the 'text Curve' is about the worst, most confusing one I have ever seen, with a mix of legacy issues and confusion between string lengths, UTF-8 strings, UTF-32 strings, and so on. It could use a serious cleanup.

I will also summon @Harley here: IIRC he has been working in this area recently, so he might have a better understanding of these things.

@ -372,0 +368,4 @@
const char *str = DATA_("Text");
size_t len_bytes;
size_t len_chars = BLI_strlen_utf8_ex(str, &len_bytes);

I would rather match the name in the `Curve` struct for that one: `len_char32` (even though it is not actually a proper name; `len_utf8_charpoint` or so would be more correct).

pioverfour marked this conversation as resolved
@ -372,0 +370,4 @@
size_t len_bytes;
size_t len_chars = BLI_strlen_utf8_ex(str, &len_bytes);
cu->str = static_cast<char *>(MEM_mallocN(len_bytes + sizeof(char32_t), "str"));

There is no reason to stop using the `_array` malloc here (same below, by the way).

And there is no reason to add an extra `char32_t` either; you only need one extra char for the NULL terminator.

Author
Member

My understanding was that UTF-8 characters don’t necessarily have the same number of bytes, but the `_array` malloc could only allocate same-sized arrays. So for Vietnamese, `len_bytes == 10` but `len_char32 == 7`.
If that assumption is wrong, I don’t know how to write this allocation properly…


`MEM_malloc_arrayN(len_bytes + 1, sizeof(char), "str")` should do the trick?

`cu->str` is nothing more than an array of chars. The fact that it is UTF-8 encoded, and that some Unicode code points may require more than one byte, is only relevant for data using `len_char32`, `strinfo`, and so on.

Author
Member

It does look like it does the trick, thanks for the help :)
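
For readers following this thread, the difference between the two lengths can be sketched in plain Python, outside Blender. This is only an illustration: the Vietnamese string below is an assumed stand-in chosen because it matches the 10 bytes and 7 code points quoted above (the actual translation shipped with Blender may differ), and `utf8_lengths` is a hypothetical helper that mimics the two values returned by `BLI_strlen_utf8_ex` as it is used in the diff.

```python
# Assumed stand-in for the translated default body (not taken from Blender's
# translation files). Written with escapes so the code points are unambiguous.
vietnamese = "V\u0103n b\u1ea3n"


def utf8_lengths(body: str) -> tuple:
    """Return (code point count, UTF-8 byte count) for `body`.

    Hypothetical helper: the first value plays the role of len_char32,
    the second the role of len_bytes in the diff above.
    """
    encoded = body.encode("utf-8")
    return len(body), len(encoded)


# ASCII-only text: one byte per code point, so both lengths agree.
ascii_char32, ascii_bytes = utf8_lengths("Text")

# Non-ASCII text: some code points take 2 or 3 bytes, so the lengths differ.
len_char32, len_bytes = utf8_lengths(vietnamese)

print(f"'Text': {ascii_char32} code points, {ascii_bytes} bytes")
print(f"Vietnamese: {len_char32} code points, {len_bytes} bytes")
```

Because the string buffer is sized in bytes, only `len_bytes` matters for it; the code point count is relevant to the per-character data.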

@ -372,0 +371,4 @@
size_t len_chars = BLI_strlen_utf8_ex(str, &len_bytes);
cu->str = static_cast<char *>(MEM_mallocN(len_bytes + sizeof(char32_t), "str"));
memcpy(cu->str, str, len_bytes + 1);

Use BLI_strncpy

pioverfour marked this conversation as resolved
@ -372,0 +377,4 @@
cu->len_char32 = cu->pos = len_chars;
cu->strinfo = static_cast<CharInfo *>(
MEM_callocN((len_chars + 4) * sizeof(CharInfo), "strinfo new"));

I do not see any reason to allocate more than `len_chars + 1` items here. That is at least what gets written to blendfiles (although I do not even see the need for the `+ 1`, but since it is written... ;) ).

pioverfour marked this conversation as resolved
Bastien Montagne requested review from Harley Acheson 2023-11-27 11:18:40 +01:00
Damien Picard added 2 commits 2023-11-27 14:13:57 +01:00
8a802b0fd9 Improve allocations and naming
- Rename `len_chars` to `len_char32`;
- Only allocate `len_bytes + 1` instead of
  `len_bytes + sizeof(char32_t)` for the string;
- Only allocate `len_char32 + 1` instead of
  `len_char32 + 4` for the CharInfo;
- Use `BLI_strncpy()` instead of `memcpy()`;
- Use `MEM_calloc_arrayN` instead of `MEM_callocN()` for the CharInfo.
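
The effect of these changes on allocation sizes can be tabulated with a small Python sketch. It is not Blender code: `alloc_sizes` is a hypothetical helper, the "before" values follow the first revision of this PR as quoted in the review, and `sizeof(char32_t)` is assumed to be 4.

```python
SIZEOF_CHAR32 = 4  # assumed value of sizeof(char32_t)


def alloc_sizes(body: str) -> dict:
    """Item counts requested before and after the review fixes.

    Hypothetical helper. "str_*" entries are byte counts for the UTF-8
    buffer; "strinfo_*" entries are numbers of CharInfo items.
    """
    len_bytes = len(body.encode("utf-8"))  # UTF-8 byte length
    len_char32 = len(body)                 # number of code points
    return {
        "str_before": len_bytes + SIZEOF_CHAR32,
        "str_after": len_bytes + 1,        # bytes plus one terminator
        "strinfo_before": len_char32 + 4,
        "strinfo_after": len_char32 + 1,
    }


sizes = alloc_sizes("Text")
for name, count in sizes.items():
    print(f"{name}: {count}")
```

For the untranslated body "Text", both allocations shrink from 8 items to 5 under these assumptions.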
Damien Picard added 1 commit 2023-11-27 14:34:44 +01:00
Bastien Montagne approved these changes 2023-11-27 14:44:18 +01:00
Bastien Montagne left a comment
Owner

LGTM now, thanks.


@blender-bot build

Harley Acheson approved these changes 2023-11-27 18:35:03 +01:00
Harley Acheson left a comment
Member

This is awesome. And works great!

Bastien Montagne merged commit 5e38f7faf0 into main 2023-11-27 20:44:58 +01:00
Bastien Montagne deleted branch dp_translate_new_text_body 2023-11-27 20:45:00 +01:00
Reference: blender/blender#115370