url-normalize issues

Fix relative path with a file in between

The correct answer for `http://www.foo.com/foo/bar.html/../bar.html` is `http://www.foo.com/bar.html`, but it returns `http://www.foo.com/foo/bar.html` instead, regarding `bar.html` as a folder.

huyiwen

Use idna package for normalize host

2

Add idna package to dependencies. Remove python 2.7 from supported python versions.

moeb1376

Twitter hashtag search breaks on normalization

Here's a sample Twitter search with a hashtag: `https://twitter.com/search?q=%23cncmachining&src=typed_query` When I run it through `url_normalization`, the encoded hash character (`%23`) is decoded into a hash (`#`), but it should stay...

samuelclay

Some tests related with flake8 are falling

Hi, I am showing some tests are falling with python-url-normalize-1.4.3. Investigating working in progress. Some hint will be appreciated. ```python :: Checking for conflicts... :: Checking for inner conflicts... [Repo...

carlosal1015

Wrong explicitly specified port normalization

Hi, nice normalizer, but in main example we see: ``` from url_normalize import url_normalize print(url_normalize('www.foo.com:80/foo')) https://www.foo.com/foo ``` But, wait a minute... HTTP port is 80 HTTPS port is 443 Normalize...

KirillArtemenko

Incorrect handling of reserved characters

2

The behaviour of `url_normalize` is wrong when dealing with URLs that have encoded characters like question marks `?` or hash symbols `#` in the path. `%23` and `%3F` should not...

deains

Stripping/removing URL parameters

1

It's stripping url parameters `url = 'https://www.example.com/xx/path/slug-whatever?atag=1234de&utm_medium=affiliates&utm_source=whatever_5443de' print(url_normalize(url,sort_query_params=True))` https://www.example.com/xx/path/slug-whatever?atag=1234de&utm_medium=affiliates `print(url_normalize(url,sort_query_params=False))` https://www.example.com/xx/path/slug-whatever?atag=1234de&utm_medium=affiliates

alexeiramone

Add type hints

1

This adds type hints to the entire tree. I'm not familiar with poetry, so not quite sure I got the dependencies right: with 3.6+, typing is in stdlib so no...

scop

Normalize absolute urls with default domain name.

1

It should be possible to specify a `default_scheme` similar to the already possible `default_scheme`. E.g. ```python >>> url_normalize('/foo.png', default_scheme='https', default_domain='example.com') 'https://example.com/foo.png' ``` Currently that is not possible: ```python >>> url_normalize('/foo.png',...

luckydonald

http:excample.com becomes https:///example.com (tripple slash) after normalization

As mentioned in the title: for some reason URLs starting with http: without slashes (which is a valid format if I am informed correctly) get normalized with a tripple slash...

Passimist

url-normalize
url-normalize copied to clipboard

Metadata

Fix relative path with a file in between

Use idna package for normalize host

Twitter hashtag search breaks on normalization

Some tests related with flake8 are falling

Wrong explicitly specified port normalization

Incorrect handling of reserved characters

Stripping/removing URL parameters

Add type hints

Normalize absolute urls with default domain name.

http:excample.com becomes https:///example.com (tripple slash) after normalization

← Metadata

Owner

Metadata

url-normalize url-normalize copied to clipboard

Metadata

← Metadata

Owner

Metadata

url-normalize
url-normalize copied to clipboard