Cleanup: simplify keyword recognition in text editor #108775
No reviewers
Labels
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset Browser
Interest
Asset Browser Project Overview
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
3 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: blender/blender#108775
Loading…
Reference in New Issue
No description provided.
Delete Branch "guishe:simplify-keywords"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Arrays are used to eliminate the use of many nested if else's that
compare a string value against to many other strings.
These arrays are sorted to be able to perform binary searches on these
string literals arrays.
@ideasman42 following with your commit to move text editor files to C++, I simplified the way keywords are recognized, this way we get rid of the error made by many chained if else
What do you think about making it a constexpr class/methods?
Edit: Although ... is it really necessary to do this not with arrays and simple loops?
KEYWORD_INFO is a macro to expand the string and its size in a keyword_info type structure, the lists of keywords will never be modified when executing the program, so I see that it is possible to calculate it at compile time the array of the keywords and their sizes
Although the constexpr specifier can be deduced by the compiler in this cases.
In the end, the compiler would optimize and it would be the same to have it as an ifelse or a for loop in an array, but the array style is more maintainable over time.
Edit: And a 15.4 KB less space
While this change is an improvement, I think it would be better to further refactor the checks so the lookup is generalized.
KeywordInfo
arrays (for example).This has the advantage that the method of looking up text can be changed (to a hash table for e.g.) else without large changes each time.
I think it's ready for you to take another look at it @ideasman42
It possiblt to replace
KeywordInfo
byStringRef
from bli?Yeah, much better
static Array<StringRef> texts = {"abc", "ddd", "fff"};
?It will be fine too
But if we are going to use other structures for look up it does not make much difference, but if it is by preference then I can make the change
Regarding the underlying data structure
static Array<StringRef>
is fine too, the main point is that we don't have to refactor the large lists every time a lookup method changes.Other requests:
text_format_py_keyword_literals
text_format_py_builtinfunc_literals
text_format_py_specialvar_literals
... same for other languages.
I use the same
static Array<StringRef>
array to perform thelookups
instead of creating a newVector
, I sort them too (on start up)Also fixed a bug (happens in main branch), before if an
elseif
was written for example, it stopped comparing withelse
because it only compares thesize
of the literals, and sinceelseif
has extra characters then with the evaluation oftext_check_identifier(text[i])
it was discardedI don't know why some are formated like this
And other like this
Should I disable formatting and keep it in one column?
Please keep the current functions in-place, the large else-if blocks containing
STR_LITERAL_STARTSWITH(..)
can be replaced with a generalized lookup but there is a check at the endtext_check_identifier
, which I think should stay in the control of the caller and not generalized (languages might have different definitions of what an identifier is).Prefer to force single column formatting, so changes to keywords produce usable diff's.
@ -10,6 +10,8 @@
#include "MEM_guardedalloc.h"
#include "BKE_text.h"
This isn't needed.
@ -243,0 +251,4 @@
auto comp_func = [literal_startwith](const StringRef &string_literal, const char *text) {
int result = literal_startwith(string_literal, text);
return result < 0 || (result == 0 && text_check_identifier(text[string_literal.size()]));
The definition of an identifier should not be part of a general literal lookup, this function, I don't have such a strong opinion on how this is solved, only that spesific details shouldn't be in generic function shared by different languages.
A possible solution would be to add a callback to
find_string_literal
that takes theStringRef &
and can check if it should be used or not.I did it this way in case we have something like
[..."else","elseif",...]
if we write in the text editor
elseif
, comparing only withliteral_startwith(string_literal, text)
we can getelse
as result.Edited: whit
upper_bound
solve the issue, and avoid the use oftext_check_identifier
.Just improved the method, it was easier than I expected
@ -117,3 +122,2 @@
#define STR_LITERAL_STARTSWITH(str, str_literal, len_var) \
(strncmp(str, str_literal, len_var = (sizeof(str_literal) - 1)) == 0)
/*
Use
/** .. */
doxygen style comment, no need for blank line at the end.@ -243,0 +244,4 @@
const StringRef *find_string_literal(const Array<StringRef> &string_literals, const char *text)
{
auto string_literal = std::upper_bound(
string_literals.begin(), string_literals.end(), StringRef(text));
Should use StringRef(text, 0) to avoid calculating the string size with StringRef(text)?
size is not needed when comparing StringRef
I don't have a strong opinion on this although it seems like a mis-use of the API. I'd like to see the issue with blender asserting on startup solved first as it might change how the arrays are declared, although this is more likely to be related to
Array
use thanStringRef
.Attached updated version of this patch with some opinionated changes.
NOTE: This patch currently causes Blender not to start when launched with
blender -d
seems allocating before guarded-alloc is properly initialized is causing issues. This needs to be investigated / resolved.blender -d
asserting on startup needs to be resolved.Moved arrays from text_format_osl.cc to simple arrays fixes 2 memory blocks, i will try with Vector
Tested with vector and same issue
Almost all files moved to simple arrays and almost all errors fixed
@guishe
Also:
StringRef
already have==
operator.Use
Span
to avoid passing containers for read (const Array<> &src
->const Span<> src
).StringRef
is a ref type, can passed by value. For be optionaly, meybestd::optional<StringRef>
?I converted it to a simple array, tomorrow I will be able to continue trying other ways
Now it not crash on start whit
blender -d
, and now I realize that before with Array blender last longer to start , like 3-5 seconds moreWould try this tomorrow
Now these removed functions are templates in the .hh file
With std::vector the error is not reproduced
The problem is with blender::vector and blender::array
in this way the Array must already be sorted, or:
If runtime sort is so important, then:
works, i use this way to be able to sort only on startup
And use
Span<StringRef>
as argument fortext_format_string_literal_find
In this way we can have in code groups of strings like in the imagen, and in startup we order them to make faster searches
Seems like the search speed would be better if you check the words by their popularity of usage in the language? There is a complicated heuristic here, is it necessary to pay attention to this now?
Also, are you in the blender chat?
a binary search is done(possible because it is ordered), I think it's faster that way
Yes
@blender-bot build
Please rebase on
main
(there seems to be some conflicts), the buildbot fails to build it.Requesting changes, also, did you comapre the speed of syntax highlighting before after this change?
@ -243,0 +244,4 @@
const int text_format_string_literal_find(const Span<const char *> string_literals,
const char *text)
{
auto predicate_func = [](const char *text, const char *string_literal) {
Prefer the shorter
cmp_fn
.@ -243,0 +262,4 @@
return -1;
}
const bool text_format_string_literals_check_sorted_array(
As this function is only for debug builds it can be enclosed in
#ifndef NDEBUG
(same for declaration).@ -7,6 +7,11 @@
*/
#pragma once
#include "BLI_array.hh"
array & string_ref headers aren't needed.
@ -33,0 +91,4 @@
/** Python special name.*/
static const char *text_format_py_literals_specialvar_data[]{
/* Prevent wrapping onto multiple lines, avoid re-wrapping when values change. */
Best remove the
Prevent wrapping onto multiple lines
- as allclang-format off
are meant to be noted why, use a short.No need for a detailed description.
I tried and i force to run multiple times in a frame run but not makes enough difference, maybe arrays are not too large enough for a big improved in time, or the compiler is too good enough to make a simple search in the chained if/else.
But anyway, using an sorted array doesn't make performance worse.
@blender-bot build
If the build-bot doesn't show any errors, I'll commit with minor changes.
string_literal
&string_literal
was used to refer to bothconst char *
andconst char *const *
types which is fairly confusing (obscured byauto
, rename one tostring_literal_p
).strlen()
call.Committed
3f26bdf840
.Pull request closed