Commit Graph

103 Commits

Author SHA1 Message Date
2f7258d618 Fix BLI_strncpy_wchar_from_utf8 result on Windows
This function was documented to return the length but returned an
error value for WIN32. While this doesn't cause any bugs at the moment,
it could cause problems in the future.

Oversight in 5496d8cd36.
2021-08-29 12:11:09 +10:00
457302b67b BLI_string_utf8: add buffer size arg to BLI_str_utf8_from_unicode
Besides helping to avoid buffer overflow errors this reduces complexity
of BLI_str_utf32_as_utf8 which needed a special loop for the last 6
characters to avoid writing past the buffer bounds.

Also add BLI_str_utf8_from_unicode_len which only returns the length.
2021-08-28 22:50:52 +10:00
89dae554f9 Cleanup: utf8 stepping functions
Various changes to reduce risk of out of bounds errors in utf8 seeking.

- Remove BLI_str_prev_char_utf8
  This function could potentially scan past the beginning of a string.
  Use BLI_str_find_prev_char_utf8 instead which takes a limiting
  string start argument.

- Swap arguments for BLI_str_find_prev_char_utf8 so the stepping
  argument is first and the limiting argument is last.
  This matches BLI_str_find_next_char_utf8.

- Change behavior of these functions to return it the start or end
  pointers instead of NULL, which complicated use of these functions
  to calculate offsets.

  Callers that need to check if the limits were reached can compare
  the return value with the start/end pointers.

- Return 'const char *' from these functions
  so they don't remove const from the input arguments.
2021-08-27 17:02:53 +10:00
da17692a3d Cleanup: use BLI_UTF8_MAX define 2021-08-26 20:41:02 +10:00
820d50d3cb Correct error in 38630711a0 2021-08-25 15:36:40 +10:00
38630711a0 BLI_string_utf8: remove unnecessary utf8 decoding functions
Remove BLI_str_utf8_as_unicode_and_size and
BLI_str_utf8_as_unicode_and_size_safe.

Use BLI_str_utf8_as_unicode_step instead since it takes
a buffer bounds argument to prevent buffer over-reading.
2021-08-25 15:28:59 +10:00
be906f44c6 BLI_string_utf8: simplify utf8 stepping logic
There were multiple utf8 functions which treated
errors slightly differently.

Split BLI_str_utf8_as_unicode_step into two functions.

- BLI_str_utf8_as_unicode_step_or_error returns error value
  when decoding fails and doesn't step.

- BLI_str_utf8_as_unicode_step always steps forward at least one
  returning the byte value without decoding
  (needed to display some latin1 file-paths).

Font drawing uses BLI_str_utf8_as_unicode_step and no longer
check for error values.
2021-08-25 15:27:18 +10:00
8b55cda048 Fix BLI_str_utf8_as_unicode_step reading past intended bounds
Add a string length argument to BLI_str_utf8_as_unicode_step to prevent
reading past the buffer bounds or the intended range since some callers
of this function take a string length to operate on part of the string.

Font drawing for example didn't respect the length argument,
potentially causing a buffer over-read with multi-byte characters
that could read past the end of the string.

The following command would read 5 bytes past the end of the input.

`BLF_draw(font_id, (char[]){252}, 1);`

In practice strings are typically null terminated so this didn't crash
reading past buffer bounds.

Nevertheless, this wasn't correct and could cause bugs in the future.

Clamping by the length now has the same behavior as a null byte.

Add test to ensure this is working as intended.
2021-08-24 14:25:15 +10:00
aa067bef5e Cleanup: use BLI_str_utf8 prefix
Rename:

- BLI_str_utf8_invalid_byte  (was BLI_utf8_invalid_byte)
- BLI_str_utf8_invalid_strip (was BLI_utf8_invalid_strip)
2021-08-23 15:02:13 +10:00
9b89de2571 Cleanup: consistent use of tags: NOTE/TODO/FIXME/XXX
Also use doxy style function reference `#` prefix chars when
referencing identifiers.
2021-07-04 00:43:40 +10:00
4b9ff3cd42 Cleanup: comment blocks, trailing space in comments 2021-06-24 15:59:34 +10:00
17e1e2bfd8 Cleanup: correct spelling in comments 2021-02-05 16:23:34 +11:00
5496d8cd36 Windows: Fix wchar_t truncation
BLI_strncpy_wchar_from_utf8 made the assumption that
wchar_t is UTF-32 bit regardless of environment, while
this holds true on both mac and linux, on windows
wchar_t is actually actually UTF-16.

This resulted in the upper 16 bits being dropped from
from some string conversions and prevented blender
from starting when installed in a path with unicode
code-points over 0xffff.

There was also a fair bit of code duplication between
BLI_strncpy_wchar_from_utf8 and BLI_str_utf8_as_unicode_and_size

this change essentially removes all logic from
BLI_strncpy_wchar_from_utf8 and calls the right function
for the right environment.

Reviewed By: brecht , Robert Guetzkow

Differential Revision: https://developer.blender.org/D9822
2021-01-26 14:56:39 -07:00
a29686eeb3 Cleanup: Blenlib, Clang-Tidy else-after-return fixes (incomplete)
This addresses warnings from Clang-Tidy's `readability-else-after-return`
rule in the `source/blender/blenlib` module. Not all warnings are
addressed in this commit.

No functional changes.
2020-08-07 11:23:02 +02:00
2d1cce8331 Cleanup: make format after SortedIncludes change 2020-03-19 09:33:58 +01:00
fb330dd110 Cleanup: remove unused BLI_strncat_utf8
Behaves differently to strncat,
BLI_strncpy_utf8_rlen can be used for a similar purpose.
2020-03-04 15:24:06 +11:00
95d0e04ed1 BLI: fix utf8 character counting when there is an incomplete utf8 char
D6923 by Kim Geonwoo
2020-02-28 14:26:07 +01:00
6f3e498e7d Cleanup: use of 'unsigned'
- Replace 'unsigned' used on it's own with 'uint'.
- Replace 'unsigned const char' with 'const uchar'.
2020-02-08 01:02:18 +11:00
177dfc6384 Fix T71273: Bad encoding of utf-8 for Text objects
`BLI_strncpy_wchar_from_utf8` internally assumes `wchar_t` is 32 bits
which is not the case on windows.

The solution is to replace `wchar_t` with `char32_t`.

Thanks to @robbott for compatibility on macOS.

Differential Revision: https://developer.blender.org/D6198
2019-11-22 12:27:34 -03:00
0b2d1badec Cleanup: use post increment/decrement
When the result isn't used, prefer post increment/decrement
(already used nearly everywhere in Blender).
2019-09-08 00:23:25 +10:00
dcad1eb03c Cleanup: move utf8 offset conversion into BLI_string_utf8
There isn't anything specific to text data with these functions.
2019-08-06 21:59:13 +10:00
e12c08e8d1 ClangFormat: apply to source, most of intern
Apply clang format as proposed in T53211.

For details on usage and instructions for migrating branches
without conflicts, see:

https://wiki.blender.org/wiki/Tools/ClangFormat
2019-04-17 06:21:24 +02:00
9ba948a485 Cleanup: style, use braces for blenlib 2019-03-27 13:17:30 +11:00
e7fd6c8f30 Cleanup: comment blocks 2019-03-19 15:17:46 +11:00
de13d0a80c doxygen: add newline after \file
While \file doesn't need an argument, it can't have another doxy
command after it.
2019-02-18 08:22:12 +11:00
48a36ff9e9 Cleanup: manually apply changes missed last commit
Automatic edits failed for indented comment blocks,
removed indentation & adjusted.
2019-02-06 15:52:04 +11:00
eef4077f18 Cleanup: remove redundant doxygen \file argument
Move \ingroup onto same line to be more compact and
make it clear the file is in the group.
2019-02-06 15:45:22 +11:00
744f633986 Cleanup: trailing commas
Needed for clan-format not to wrap onto one line.
2019-02-03 14:59:11 +11:00
65ec7ec524 Cleanup: remove redundant, invalid info from headers
BF-admins agree to remove header information that isn't useful,
to reduce noise.

- BEGIN/END license blocks

  Developers should add non license comments as separate comment blocks.
  No need for separator text.

- Contributors

  This is often invalid, outdated or misleading
  especially when splitting files.

  It's more useful to git-blame to find out who has developed the code.

See P901 for script to perform these edits.
2019-02-02 01:36:28 +11:00
4226ee0b71 Cleanup: comment line length (blenlib)
Prevents clang-format wrapping text before comments.
2019-01-15 23:30:31 +11:00
e757c4a3be Cleanup: use colon separator after parameter
Helps separate variable names from descriptive text.
Was already used in some parts of the code,
double space and dashes were used elsewhere.
2018-12-12 12:50:58 +11:00
8ac69ff9dc Cleanup: use uint type in BLI 2017-10-28 17:48:45 +11:00
72c9141a7a Cleanup: doxygen comments
Also remove duplicate & mismatching comments from grease-pencil header.
Keep comments close to implementation to avoid getting out of sync.
2017-06-19 10:04:30 +10:00
81e584ed17 CMake: Use GCC7's -Wimplicit-fallthrough=5
Use to avoid accidental missing break statements,
use ATTR_FALLTHROUGH to suppress.
2017-05-20 14:01:03 +10:00
82187a58f5 Fix own mistake in rB051526da6279, confusing off_t with ptrdiff_t. 2017-01-20 21:57:48 +01:00
051526da62 Cleanup/fix some BLI_string_utf8 not using size_t/off_t as expected. 2017-01-20 16:51:05 +01:00
ff0221f5d8 Fix implicit size_t to int conversion.
Seems like it was erroring on some buildbots...
2017-01-03 15:30:59 +01:00
adadfaad88 Fix (unreported) fully broken 'sanitize utf-8' helper.
That code was a joke, letting some invalid utf8 bytes pass, returning
wrong offset for some invalid sequences, not to mention length and
pointer easily going out of sync, NULL final byte being 'forgotten' by
memcpy, etc. etc.

The miracle here is that we could survive using this for so long!
Probably because we do not use utf-8 sanitizing enough in Blender,
actually... :/
2017-01-01 02:15:42 +01:00
e170d6be7f Cleanup: all params of BLI_str partition funcs can be const... 2015-06-27 11:00:47 +02:00
4d043c99dc Extend BLI_str_partition_ex: add possibility to define a right limit to the string.
Now you can define `end` pointer as right limit of the string (allows to easily search
in substring, especially useful when searching from right).
2015-06-27 10:24:54 +02:00
5d30c23c35 doxygen: corrections/updates
Also add depsgraph & physics
2015-05-20 14:12:22 +10:00
12f60e7825 Fix T43834: Text object looses one char after another by entering/leaving edit mode.
Own mistake in refactoring of `BLI_strncpy_wchar_as_utf8()`, if given size was exactly
the one needed, we'd lost last char (off-by-one error).

Many thanks to plasmasolutions (Thomas Beck) who found the issue and did
all the investigation work here!
2015-02-27 21:31:54 +01:00
88facb8876 Fix potential buffer overflow in BLI_strncpy_wchar_as_utf8(). 2015-02-26 11:27:58 +01:00
aab4f2b762 cleanup: redundant casts & const cast correctness 2015-01-01 23:42:28 +11:00
00197a668b BLI_string_utf8: add BLI_strncpy_utf8_rlen 2014-12-28 16:00:08 +11:00
e3c8cf0a9e Add (r)partition funcs to BLI_string, to get left-most/right-most first occurence of delimiters.
Inspired by Python (r)partition str functions. Also added some Gtest cases for those new funcs.

Reviewed by Campbell Barton, many thanks!
2014-07-04 14:14:06 +02:00
483d8da9bc Code cleanup: use 'const' for arrays (blenlib) 2014-04-27 00:25:15 +10:00
f0ca40f9c1 Code cleanup: function calls 2014-03-25 10:10:00 +11:00
2097e621ed Code cleanup: use r_ prefix for return args 2014-03-16 03:26:23 +11:00
62aa004c25 Style Cleanup: whitespace 2014-01-12 22:05:24 +11:00