Docs: improve online manual lookup time time #104581

Closed
Erik Abrahamsson wants to merge 3 commits from erik85/blender:online-help-improve-time into main

When changing the target branch, be careful to rebase the branch in your fork to match. See documentation.

Matching the RNA id's to the search pattern is slow because
of the function fnmatchcase. This patch first checks the string
prefix without any special characters used by fnmatch,
if the startswith check fails, there is no need to check fnmatchcase.
Before the optimization, an online manual lookup took about 400ms
which is quite noticeable, with this patch applied it's under 10ms.

Matching the RNA id's to the search pattern is slow because of the function `fnmatchcase`. This patch first checks the string prefix without any special characters used by fnmatch, if the `startswith` check fails, there is no need to check `fnmatchcase`. Before the optimization, an online manual lookup took about 400ms which is quite noticeable, with this patch applied it's under 10ms.
Erik Abrahamsson requested review from Campbell Barton 2023-02-10 18:28:06 +01:00
Aaron Carlisle requested review from Aaron Carlisle 2023-02-10 19:08:11 +01:00

This change only supports checks for the * character, fnmatch also supports characters within brackets & ?, so I don't think these kinds of optimizations are practical.

However 400ms for a lookup is on the slow side and it would be good to improve performance. (On my system it's around 0.32 seconds).

This is the test script I used.

import time, bpy
t = time.time()
x = bpy.types.WM_OT_doc_view_manual._lookup_rna_url("test.that.never.succeeds")
print(time.time() - t, x)

Some possibilities:

  • Expose C's fnmatch (see source/blender/blenlib/BLI_fnmatch.h). While this seems like it could be involved it may be the most straightforward solution. (see attached fast_fnmatch.diff). In my tests this is significantly faster (0.002 seconds). The main down-side with this is it increases the number of API calls we need to maintain and is likely to be different to Python's fnmatch is subtle ways.

  • Use Python's multi-processing to perform multiple matches at once (see attached mp_fnmatch.diff). While almost twice as fast, ~0.17 seconds which is still relatively slow. I'd also want to be sure this works well on MS-Windows (which handles multiprocessing differently). There is also some room for problems if for some reason Python can't find it's own binary to launch, it may be slow on cold-starts too.

  • The underlying storage could be changed to a format that loads faster (in this particular case, using REGEX literals avoids having to translate the strings). I'm not so keen on this as it means add-ons that generate this mapping will need to use REGEX or we would need a way to select between one or the other.

  • It's also possible to cache converted data without exposing that to the API, again though, it's a reasonable amount of effort where simpler solutions exist.

This change only supports checks for the `*` character, `fnmatch` also supports characters within brackets & `?`, so I don't think these kinds of optimizations are practical. However 400ms for a lookup is on the slow side and it would be good to improve performance. (On my system it's around 0.32 seconds). This is the test script I used. ``` import time, bpy t = time.time() x = bpy.types.WM_OT_doc_view_manual._lookup_rna_url("test.that.never.succeeds") print(time.time() - t, x) ``` Some possibilities: - Expose C's fnmatch (see `source/blender/blenlib/BLI_fnmatch.h`). While this seems like it could be involved it may be the most straightforward solution. (see attached `fast_fnmatch.diff`). In my tests this is significantly faster (0.002 seconds). The main down-side with this is it increases the number of API calls we need to maintain and is likely to be different to Python's fnmatch is subtle ways. - Use Python's multi-processing to perform multiple matches at once (see attached `mp_fnmatch.diff`). While almost twice as fast, ~0.17 seconds which is still relatively slow. I'd also want to be sure this works well on MS-Windows (which handles multiprocessing differently). There is also some room for problems if for some reason Python can't find it's own binary to launch, it may be slow on cold-starts too. - The underlying storage could be changed to a format that loads faster (in this particular case, using REGEX literals avoids having to translate the strings). I'm not so keen on this as it means add-ons that generate this mapping will need to use REGEX or we would need a way to select between one or the other. - It's also possible to cache converted data without exposing that to the API, again though, it's a reasonable amount of effort where simpler solutions exist.
Campbell Barton requested changes 2023-02-11 06:14:26 +01:00
Campbell Barton left a comment
Owner

As noted in reply, this will break support for some fnmatch features.

As noted in reply, this will break support for some `fnmatch` features.

Turns out I was too quick to dismiss the approach from this patch, while the check for * isn't sufficient, fnmatch is quite simple so a str.startswith check is valid as long as the check involves the prefix before any (*, ?, [) characters. See attached startswith_fnmatch.diff, this runs in ~0.004 seconds.


Although a faster fnmatch might still be worth having in other contexts.

Turns out I was too quick to dismiss the approach from this patch, while the check for `*` isn't sufficient, `fnmatch` is quite simple so a `str.startswith` check is valid as long as the check involves the prefix before any (`*`, `?`, `[`) characters. See attached `startswith_fnmatch.diff`, this runs in ~0.004 seconds. --- Although a faster `fnmatch` might still be worth having in other contexts.
Erik Abrahamsson force-pushed online-help-improve-time from 17dacf1315 to 0300c7b6d0 2023-02-11 12:11:39 +01:00 Compare
Author
Member

That's nice. I modified your patch a tiny bit to take care of None returned from match, and typos.

That's nice. I modified your patch a tiny bit to take care of None returned from match, and typos.
Erik Abrahamsson requested review from Campbell Barton 2023-02-11 12:18:10 +01:00
Campbell Barton force-pushed online-help-improve-time from 0300c7b6d0 to 10a7bf6c01 2023-02-12 02:16:33 +01:00 Compare
Campbell Barton requested changes 2023-02-12 02:25:27 +01:00
@ -1252,0 +1255,4 @@
# If any of these characters are used we must let `fnmatch` run its own matching logic.
# However, in most cases a literal prefix is used making it considerably faster to do a
# simple `startswith` check before performing a full match, see #104581.
re_match_non_special = re.compile("^[^?*\\[]+").match

* should be escaped, suggest: r"^[^?\*\[]+" - with a comment for clarity.

# Match any character not in `*?[`.
re_match_non_special = re.compile(r"^[^?\*\[]+").match
`*` should be escaped, suggest: `r"^[^?\*\[]+"` - with a comment for clarity. ``` # Match any character not in `*?[`. re_match_non_special = re.compile(r"^[^?\*\[]+").match ```
erik85 marked this conversation as resolved
@ -1252,1 +1260,4 @@
for pattern, url_suffix in url_mapping:
# Simple optimization, makes a big difference (over 50x speedup).
non_special = re_match_non_special(pattern)
if (non_special is not None and non_special.end() > 0 and

Avoid calling non_special.end() twice, either assign a variable or remove, as (not rna_id.startswith(pattern[:non_special.end()]))) on it's own works as expected without the need to check end() > 0.

Avoid calling `non_special.end()` twice, either assign a variable or remove, as `(not rna_id.startswith(pattern[:non_special.end()])))` on it's own works as expected without the need to check `end() > 0`.
erik85 marked this conversation as resolved
Erik Abrahamsson requested review from Campbell Barton 2023-02-12 23:02:50 +01:00
Brecht Van Lommel added this to the User Interface project 2023-02-13 09:22:30 +01:00
Campbell Barton changed title from Online help: Improve docs search time to Docs: improve online manual lookup time time 2023-02-13 10:48:47 +01:00
Campbell Barton force-pushed online-help-improve-time from 9d653ccabc to fb2e6d48e7 2023-02-13 10:51:29 +01:00 Compare
Erik Abrahamsson changed title from Docs: improve online manual lookup time time to Docs: improve online manual lookup time 2023-02-13 10:57:06 +01:00
Erik Abrahamsson changed title from Docs: improve online manual lookup time to Docs: improve online manual lookup time time 2023-02-13 10:57:33 +01:00

Thanks, committed 526f2273c6 with some expansion on code-comments.

Thanks, committed 526f2273c6fc813ff4d1bb6066f445dfb9c88e05 with some expansion on code-comments.
Campbell Barton closed this pull request 2023-02-14 00:53:41 +01:00

Pull request closed

Sign in to join this conversation.
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset Browser
Interest
Asset Browser Project Overview
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: blender/blender#104581
No description provided.