[🚀 Feature]: [py] Validate URL's before navigation

Open cgoldberg opened this issue 2 months ago • 6 comments

Description

When you call driver.get() or driver.browsing_context.navigate(), it attempts to navigate to the URL, even if the URL is malformed.

Browsers don't handle this very well. For example...

if you do:

driver.get("example.com")

or

driver.get("http//example.com")

Chrome will just not navigate and not return any error (Firefox returns an error).

Proposed change:

If we validate the URL before attempting navigation, we can raise a useful exception: raise InvalidArgumentException("Invalid URL").

Here is some example code for validation:

from urllib.parse import urlparse

def is_valid_url(url):
    try:
        result = urlparse(url)
        return bool(result.scheme)
    except AttributeError:
        return False

This validates it can be parsed as a URL and contains a scheme.
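As a rough illustration of what this check does with the inputs mentioned above, the sketch below runs the same function over the two malformed strings and one well-formed URL. It uses only urllib.parse from the standard library; the list of candidates is just the examples from this issue plus https://example.com for comparison.

```python
from urllib.parse import urlparse


def is_valid_url(url):
    """Same check as above: the URL parses and has a scheme."""
    try:
        result = urlparse(url)
        return bool(result.scheme)
    except AttributeError:
        return False


# "example.com" and "http//example.com" contain no "scheme:" prefix,
# so urlparse() puts the whole string in .path and leaves .scheme empty.
for candidate in ("example.com", "http//example.com", "https://example.com"):
    print(f"{candidate!r}: {is_valid_url(candidate)}")
```

Only the last candidate passes, since it is the only one where urlparse() finds a scheme.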

Have you considered any alternatives or workarounds?

No response

Does this apply to specific language bindings?

Python

What part(s) of Selenium does this relate to?

No response

cgoldberg avatar Nov 20 '25 14:11 cgoldberg

@cgoldberg, thank you for creating this issue. We will troubleshoot it as soon as we can.

Selenium Triage Team: remember to follow the Triage Guide

selenium-ci avatar Nov 20 '25 14:11 selenium-ci

This is what the spec says:

If URL is not an absolute URL or is not an absolute URL with fragment or not a local scheme, return error with error code invalid argument.

Technically, users can create valid local schemes of their own that aren't in VALID_URL_SCHEMES, so I don't think we should prevent those. Verify that fragments (https://www.example.com/documentation.html#installation) and queries (https://www.example.com/documentation.html?foo=bar) pass the parse (I suspect they do).
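A quick way to check the fragment and query cases is to look at what urlparse() returns for the two URLs above. This is only a sketch of the parse step, not of any Selenium behavior; the variable names are illustrative.

```python
from urllib.parse import urlparse

# The two URLs from the comment above.
with_fragment = urlparse("https://www.example.com/documentation.html#installation")
with_query = urlparse("https://www.example.com/documentation.html?foo=bar")

# Both keep their scheme and netloc; the fragment and query are split
# off into their own fields instead of breaking the parse.
for parts in (with_fragment, with_query):
    print(parts.scheme, parts.netloc, parts.path, parts.query, parts.fragment)
```

Both URLs come back with scheme "https" and netloc "www.example.com", so a scheme-based check would let them through.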

titusfortner avatar Nov 20 '25 16:11 titusfortner

users can create valid local schemes of their own that aren't in VALID_URL_SCHEMES

So I guess we can't validate the URL scheme.. maybe it should just try to parse the URL and verify it has a scheme and netloc and let everything else through.

The validation would just be:

def is_valid_url(url):
    try:
        result = urlparse(url)
        return all([result.scheme, result.netloc])
    except AttributeError:
        return False

cgoldberg avatar Nov 20 '25 17:11 cgoldberg

about:blank is a valid url within Chrome but fails the check above.
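To make the failure concrete, here is a small sketch that applies the scheme-plus-netloc check to about:blank and two other URLs that have a scheme but no network location. The function name has_scheme_and_netloc and the data: and file: examples are illustrative additions, not something proposed in this thread.

```python
from urllib.parse import urlparse


def has_scheme_and_netloc(url):
    """The stricter check from the previous comment."""
    result = urlparse(url)
    return all([result.scheme, result.netloc])


# Each of these has a scheme, but urlparse() reports an empty netloc.
local_urls = [
    "about:blank",
    "data:text/html,<h1>hi</h1>",
    "file:///tmp/page.html",
]

for url in local_urls:
    parts = urlparse(url)
    print(f"{url!r}: scheme={parts.scheme!r} netloc={parts.netloc!r} "
          f"passes={has_scheme_and_netloc(url)}")
```

All three are rejected by the stricter check even though each one has a scheme.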

emanlove avatar Nov 21 '25 01:11 emanlove

@emanlove thanks.. you're right.

We could use:

def is_valid_url(url):
    try:
        result = urlparse(url)
        return bool(result.scheme)
    except AttributeError:
        return False

... that's not a lot of validation, but it would save users from being confused when driver.get("example.com") doesn't navigate or raise an exception in Chrome/Edge.
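For illustration only, a guard built on that check might look like the sketch below. The helper name check_url_before_navigation is hypothetical, and the exception class is a local stand-in for selenium.common.exceptions.InvalidArgumentException so the snippet runs without Selenium installed. It also catches ValueError, which urlparse() raises for inputs such as an unclosed IPv6 bracket; that is an assumption about what a guard would want, not part of the proposal above.

```python
from urllib.parse import urlparse


class InvalidArgumentException(Exception):
    """Local stand-in for selenium.common.exceptions.InvalidArgumentException."""


def check_url_before_navigation(url):
    """Hypothetical guard: raise if the URL does not parse or has no scheme."""
    try:
        has_scheme = bool(urlparse(url).scheme)
    except (AttributeError, ValueError):
        # AttributeError: non-string input such as an int.
        # ValueError: e.g. "http://[::1" (unbalanced IPv6 bracket).
        has_scheme = False
    if not has_scheme:
        raise InvalidArgumentException(f"Invalid URL: {url!r}")
    return url


for candidate in ("about:blank", "https://example.com", "example.com"):
    try:
        check_url_before_navigation(candidate)
        print(f"{candidate!r}: ok")
    except InvalidArgumentException as exc:
        print(f"{candidate!r}: {exc}")
```

With this shape, about:blank and custom local schemes still pass, while a bare "example.com" fails fast instead of silently not navigating.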

I'm not sure this is even worth doing though.

cgoldberg avatar Nov 21 '25 02:11 cgoldberg

I think it would be better to leave that decision to the end user implementing Selenium. But this could make a good blog post or knowledge article for the community.

rpallavisharma avatar Nov 24 '25 05:11 rpallavisharma

Now that we're moving to BiDi by default in Selenium 5, we could possibly check for error codes in driver.get()

shbenzer avatar Dec 11 '25 19:12 shbenzer