Valid URL does not pass trafaret.URL validation
Hi @Deepwalker. I get the following validation error for a working URL: 'https://www.dior.com/fr_fr/maquillage/adoptez-le-look-du-defile-croisiere\xa02020'
Is this the expected behavior? What do you think?
(3_7_2) MacBook-Pro-2:test fedir$ python -c "import trafaret as t; t.URL.check('https://www.dior.com/fr_fr/maquillage/adoptez-le-look-du-defile-croisiere\xa02020')"
Traceback (most recent call last):
File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 166, in transform
return self.trafaret(value, context=context)
File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 156, in __call__
return self.check(val, context=context)
File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 118, in check
return self.transform(value, context=context)
File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 286, in transform
raise DataError(dict(enumerate(errors)), trafaret=self)
trafaret.dataerror.DataError: {0: DataError(does not match pattern ^(?:http|ftp)s?://(?:\S+(?::\S*)?@)?(?:(?:[A-Z0-9](?:[A-Z0-9-_]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|localhost|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?::\d+)?(?:/?|[/?]\S+)$), 1: DataError(does not match pattern ^(?:http|ftp)s?://(?:\S+(?::\S*)?@)?(?:(?:[A-Z0-9](?:[A-Z0-9-_]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|localhost|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?::\d+)?(?:/?|[/?]\S+)$)}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 118, in check
return self.transform(value, context=context)
File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 168, in transform
raise DataError(self.message, value=value)
trafaret.dataerror.DataError: value is not URL
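A likely cause, judging only from the pattern printed in the error above, is the `\S+` in the path part. For str patterns in Python 3, `\s` covers Unicode whitespace, and `\xa0` (no-break space) counts as whitespace, so `\S+` cannot match across it. The sketch below uses only the standard library and a simplified stand-in for the tail of that pattern (`PATH_RE` is a hypothetical name, not trafaret's actual code):

```python
import re

# Simplified stand-in for the path part of the pattern shown in the error
# (an illustration only, not trafaret's full regex).
PATH_RE = re.compile(r"^(?:/?|[/?]\S+)$")

path_plain = "/fr_fr/maquillage/adoptez-le-look-du-defile-croisiere2020"
path_nbsp = "/fr_fr/maquillage/adoptez-le-look-du-defile-croisiere\xa02020"

# U+00A0 (no-break space) is Unicode whitespace, so \S does not match it.
nbsp_is_space = "\xa0".isspace()
plain_matches = PATH_RE.match(path_plain) is not None
nbsp_matches = PATH_RE.match(path_nbsp) is not None

print(nbsp_is_space, plain_matches, nbsp_matches)  # True True False
```

This only shows why the path part would fail to match. It does not say whether an unencoded `\xa0` in a URL is valid in the first place.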
Hi! Can you check whether this link works with this PR? https://github.com/Deepwalker/trafaret/pull/36
Actually, I'm not sure what the right behavior is. It may be that the link is actually incorrect. I will need to reread the RFC.
Proof of URL correctness is hard.
I suspect that a regex-based solution is bound to produce false positives by design :(
Even the much more complicated yarl is not free from such issues.
Well, yarl.URL() works pretty well, but yarl.URL.build() cannot parse valid args now :(
@asvetlov
> but yarl.URL.build() cannot parse valid args now

What do you mean? It seems everything works:
fedor@ubuntu:~$ python -c "import yarl; print(yarl.URL.build(host='example.com', scheme='https', path='/path\xa0abc'))"
https://example.com/path%C2%A0abc
fedor@ubuntu:~$ python -V
Python 3.7.3
fedor@ubuntu:~$ pip list | grep yarl
yarl 1.3.0
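The yarl output above percent-encodes the no-break space as `%C2%A0`. As a rough sketch of doing something similar before validation with only the standard library (the helper name `encode_path` is hypothetical, and this is not how yarl or trafaret work internally):

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_path(url):
    """Percent-encode the path component of a URL (hypothetical helper)."""
    parts = urlsplit(url)
    # quote() encodes non-ASCII characters as UTF-8 percent escapes.
    # "/" and "%" are left alone so existing separators and escapes survive.
    safe_path = quote(parts.path, safe="/%")
    return urlunsplit(parts._replace(path=safe_path))


encoded = encode_path("https://example.com/path\xa0abc")
print(encoded)  # https://example.com/path%C2%A0abc
```

Keeping `%` in `safe` avoids double-encoding already escaped input, at the cost of leaving a literal `%` untouched. Whether trafaret.URL accepts the encoded result is not checked here.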