The parser validates invalid URLs

Open jubnl opened this issue 1 year ago • 0 comments

Description

The parser should be stricter: some URLs that the library reports as valid should not be. See the code and output below.

Steps to reproduce

from pprint import pprint

import giturlparse

if __name__ == "__main__":

    urls = [
        "https://github(com../testing2/jubnl/test",
        "https://github$com/testing2/ jubnl/test ",
        "https://git/test",
        "https://git...com/jubnl",
    ]

    for url in urls:
        parsed = giturlparse.parse(url)
        print(f"Initial url: '{url}'")
        print(f"Is url valid: {parsed.valid}")
        if parsed.valid:
            print("Parsed urls:")
            pprint(parsed.urls)

Output:

C:\Users\user\PycharmProjects\multiproc\.venv\Scripts\python.exe C:\Users\user\PycharmProjects\multiproc\main.py
Initial url: 'https://github(com../testing2/jubnl/test'
Is url valid: True
Parsed urls:
{'git': 'git://github(com../testing2/jubnl/test.git',
 'https': 'https://github(com../testing2/jubnl/test.git',
 'ssh': 'git@github(com..:testing2/jubnl/test.git'}
Initial url: 'https://github$com/testing2/ jubnl/test '
Is url valid: True
Parsed urls:
{'git': 'git://github$com/testing2/ jubnl/test .git',
 'https': 'https://github$com/testing2/ jubnl/test .git',
 'ssh': 'git@github$com:testing2/ jubnl/test .git'}
Initial url: 'https://git/test'
Is url valid: True
Parsed urls:
Traceback (most recent call last):
  File "C:\Users\user\PycharmProjects\multiproc\main.py", line 107, in <module>
    pprint(parsed.urls)
           ^^^^^^^^^^^
  File "C:\Users\user\PycharmProjects\multiproc\.venv\Lib\site-packages\giturlparse\result.py", line 102, in urls
    return {protocol: self.format(protocol) for protocol in self._platform_obj.PROTOCOLS}
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\PycharmProjects\multiproc\.venv\Lib\site-packages\giturlparse\result.py", line 102, in <dictcomp>
    return {protocol: self.format(protocol) for protocol in self._platform_obj.PROTOCOLS}
                      ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\PycharmProjects\multiproc\.venv\Lib\site-packages\giturlparse\result.py", line 73, in format
    return self._platform_obj.FORMATS[protocol] % items
           ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 'http'

Process finished with exit code 1

Versions

Python 3.11.4
giturlparse 0.12.0
Windows 11

Expected behaviour

The parser should reject URLs like these: parsed.valid should be False.
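
To illustrate the kind of strictness meant here, below is a minimal sketch of a pre-check built only on the standard library's urllib.parse. It is not part of giturlparse and not a proposal for how the library should implement this: the helper name looks_like_valid_http_git_url is hypothetical, and the rules (no whitespace, http/https only, at least two well-formed host labels, a non-empty path) are assumptions that are stricter than some setups need. For example, it would reject single-label hosts such as an internal "git" server.

```python
import re
from urllib.parse import urlsplit

# One DNS label: 1-63 letters, digits or hyphens, with no leading or trailing hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def looks_like_valid_http_git_url(url: str) -> bool:
    """Hypothetical strict pre-check; NOT part of giturlparse."""
    # Reject any whitespace, including leading or trailing spaces.
    if any(ch.isspace() for ch in url):
        return False
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    # Assumption: only http(s) URLs are checked by this sketch.
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname
    if not host:
        return False
    labels = host.split(".")
    # Assumption: require at least two labels (e.g. "github.com");
    # empty labels (from "..") and characters like "(" or "$" fail the regex.
    if len(labels) < 2 or not all(_LABEL.match(label) for label in labels):
        return False
    # Require at least one non-empty path segment.
    return any(segment for segment in parts.path.split("/"))
```

With this sketch, all four URLs from the reproduction script are rejected, while a URL such as "https://github.com/jubnl/test" passes.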

Actual behaviour

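The parser reports the first three URLs as valid. For 'https://git/test', accessing parsed.urls then raises KeyError: 'http', so the script exits before reaching the fourth URL.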

jubnl, Jan 19 '24 13:01