Fix #116280: bpy.ops.wm.url_open() cannot open file:/// #116295

Merged
Dalai Felinto merged 2 commits from dfelinto/blender:fix-url-open into blender-v4.0-release 2023-12-18 16:44:46 +01:00

This issue was introduced in a15c637e63798f61aebb777aca54acaed3bf562c.
Dalai Felinto added 1 commit 2023-12-18 12:40:27 +01:00
Iliya Katushenock added this to the Python API project 2023-12-18 12:44:51 +01:00
Dalai Felinto requested review from Sergey Sharybin 2023-12-18 12:46:43 +01:00
Dalai Felinto requested review from Ray molenkamp 2023-12-18 12:46:43 +01:00
Author
Owner

I tagged the two original reviewers of the PR as reviewers here, but the change is rather simple.

Falk David reviewed 2023-12-18 12:51:29 +01:00
@@ -1031,3 +1031,3 @@
 # Make sure we have a scheme otherwise we can't parse the url.
-if not url.startswith(("http://", "https://")):
+if "://" not in url:
Member

I think it's cleaner to check for the expected protocols.
E.g.:

expected_protocols = ("http://", "https://", "file:///")
if not url.startswith(expected_protocols):
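
For illustration, the allowlist variant could be fleshed out roughly as follows. This is only a sketch: the helper name `ensure_protocol` and the `https://` fallback are assumptions, not code from this PR.

```python
# Hypothetical helper illustrating the allowlist idea; not the PR's code.
EXPECTED_PROTOCOLS = ("http://", "https://", "file:///")


def ensure_protocol(url, default="https://"):
    """Prepend `default` unless `url` starts with an expected protocol."""
    # str.startswith() accepts a tuple and matches if any prefix matches.
    if not url.startswith(EXPECTED_PROTOCOLS):
        url = default + url
    return url
```

With this approach, a URL whose scheme is missing from the tuple (ftp://, for example) still gets the default prefix, so the tuple has to list every scheme that should pass through unchanged.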
Author
Owner

@filedescriptor but where do we stop? How about gitea://, ftp://, ...?

Member

@dfelinto Well, you can add all the protocols that make sense, right? There aren't that many.

Sybren A. Stüvel reviewed 2023-12-18 13:09:54 +01:00
Sybren A. Stüvel left a comment
Member

I think there's a bunch of issues with the surrounding code.

# Make sure we have a scheme otherwise we can't parse the url.

This is misleading, as this works fine:

>>> import urllib.parse

>>> urllib.parse.urlparse('blender')
ParseResult(scheme='', netloc='', path='blender', params='', query='', fragment='')

>>> urllib.parse.urlparse('https://blender')
ParseResult(scheme='https', netloc='blender', path='', params='', query='', fragment='')

You could argue that without a scheme the string should go into the netloc field, but that's beside the point. Without a scheme, the URL can be parsed just fine.

The second issue is the assumption that, if the URL doesn't start with http:// or https://, adding that string will produce a valid URL. This, again, is not true in general. Checking for :// helps, in that it's a more general check and will likely work better, but I don't think it's the right approach either, as :// can appear in any other part of the URL as well.

Why not use urlparse() to do the parsing for us?

parsed_url = urllib.parse.urlparse(url)
if not parsed_url.scheme:
    url = f'https://{url}'
parsed_url = urllib.parse.urlparse(url)
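
To make the difference between the two checks concrete, here is a small sketch. The function name `add_default_scheme` and the sample URLs are made up for illustration; this is not the code that was merged.

```python
import urllib.parse


def add_default_scheme(url, default="https"):
    """Return `url` unchanged if urlparse() finds a scheme, else prefix one."""
    if not urllib.parse.urlparse(url).scheme:
        url = f"{default}://{url}"
    return url


# "://" appears only inside the query string here, so a substring check
# would wrongly conclude that the URL already has a scheme.
tricky = "blender.org/open?next=https://example.com"
substring_check_says_has_scheme = "://" in tricky
parsed_scheme = urllib.parse.urlparse(tricky).scheme

print(add_default_scheme(tricky))
print(add_default_scheme("file:///tmp/manual.html"))
```

The file:/// URL passes through untouched because urlparse() reports `file` as its scheme, which is the case the original bug report was about.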
Member

this feels like a job a regex would excel at?

First-time contributor
parsed_url = urllib.parse.urlparse(url)
if not parsed_url.scheme:
    url = f'https://{url}'
parsed_url = urllib.parse.urlparse(url)

Wouldn't you want to condition that prefix on the parsed URL having an actual netloc as well? Otherwise your 'blender' example is going to turn the path of blender into http://blender where the path becomes the domain.

(I have no idea what this code is driven by, or how invalid the passed args could get, just noticed the potential.)

Dalai Felinto added 1 commit 2023-12-18 15:24:00 +01:00
Author
Owner

Incorporated @dr.sybren suggestion with some changes.


> Wouldn't you want to condition that prefix on the parsed URL having an actual netloc as well? Otherwise your 'blender' example is going to turn the path of blender into http://blender where the path becomes the domain.

This confusion is what I was fearing when I wrote "You could argue that without scheme the string should go into the netloc field". For me (and AFAIK all web browsers), blender.org is a valid URL and should be interpreted as netloc. That's not how Python's urlparse function works, though.

> (I have no idea what this code is driven by, or how invalid the passed args could get, just noticed the potential.)

I suspect that there's only one use case for this particular piece of code, and that's turning a scheme-less URL into one with an explicit scheme. And so the logic of "if there is no scheme, chuck https:// in front of it" seems pretty stable. I doubt there will be any scheme-relative (//blender.org/path) URLs to handle.
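
For reference, the two cases mentioned here (a bare host and a scheme-relative URL) can be checked directly with urlparse(). The sample strings are only illustrative, and the last line shows what naive prefixing would do rather than anything in the PR.

```python
import urllib.parse

# Bare host: urlparse() puts everything in `path` and leaves `netloc` empty,
# so a "has a netloc" condition would never trigger the prefix for it.
bare = urllib.parse.urlparse("blender.org")

# Scheme-relative URL: `netloc` is filled in, but `scheme` is still empty.
relative = urllib.parse.urlparse("//blender.org/path")

# Naively prefixing the scheme-relative form doubles the slashes.
naive = "https://" + "//blender.org/path"

print(bare)
print(relative)
print(naive)
```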

> Incorporated @dr.sybren suggestion with some changes.

👍 LGTM!

Author
Owner

I'm taking Sybren's review as final. Thanks everyone for pitching in

Dalai Felinto merged commit 63e9cead5f into blender-v4.0-release 2023-12-18 16:44:46 +01:00
Dalai Felinto deleted branch fix-url-open 2023-12-18 16:44:50 +01:00
Reference: blender/blender#116295