URL validator does not accept fragments

Open cgrigis opened this issue 4 years ago • 5 comments

A URL with a fragment fails to validate:

>>> from strictyaml import Map, Url, load
>>> schema = Map({"url": Url()})
>>>
>>> load("url: https://example.com/bla", schema)
YAML({'url': 'https://example.com/bla'})
>>>
>>> load("url: https://example.com/bla#header", schema)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/cgrigis/C4DT/Projects/strictyaml/strictyaml/parser.py", line 318, in load
    return generic_load(yaml_string, schema=schema, label=label)
  File "/home/cgrigis/C4DT/Projects/strictyaml/strictyaml/parser.py", line 296, in generic_load
    return schema(YAMLChunk(document, label=label))
  File "/home/cgrigis/C4DT/Projects/strictyaml/strictyaml/validators.py", line 17, in __call__
    self.validate(chunk)
  File "/home/cgrigis/C4DT/Projects/strictyaml/strictyaml/compound.py", line 165, in validate
    value.process(self._validator_dict[yaml_key.scalar](value))
  File "/home/cgrigis/C4DT/Projects/strictyaml/strictyaml/scalar.py", line 27, in __call__
    return YAML(chunk, validator=self)
  File "/home/cgrigis/C4DT/Projects/strictyaml/strictyaml/representation.py", line 63, in __init__
    self._value = validator.validate(value)
  File "/home/cgrigis/C4DT/Projects/strictyaml/strictyaml/scalar.py", line 30, in validate
    return self.validate_scalar(chunk)
  File "/home/cgrigis/C4DT/Projects/strictyaml/strictyaml/scalar.py", line 128, in validate_scalar
    self._matching_message, "found non-matching string"
  File "/home/cgrigis/C4DT/Projects/strictyaml/strictyaml/yamllocation.py", line 47, in expecting_but_found
    self,
strictyaml.exceptions.YAMLValidationError: when expecting a url
found non-matching string
  in "<unicode string>", line 1, column 1:
    url: https://example.com/bla#header
     ^ (line: 1)

Probably related to 5beb1eaeb9fa2c23adc9a548a164e4d539f2acfc... it seems the new regex from https://urlregex.com/ does not cover fragments?
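A minimal sketch of why a regex of this kind could reject fragments. The pattern below is an assumption: it is a Python URL regex of the style commonly circulated online, not confirmed to be the exact one introduced by that commit, and `matches` is a hypothetical helper. Its character ranges do not include `#`, so a full match fails as soon as a fragment appears:

```python
import re

# Assumed pattern, for illustration only; not verified against the strictyaml source.
ASSUMED_URL_PATTERN = (
    r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"
)


def matches(url):
    """Return True if the whole string matches the assumed pattern."""
    return re.fullmatch(ASSUMED_URL_PATTERN, url) is not None


print(matches("https://example.com/bla"))         # no fragment
print(matches("https://example.com/bla#header"))  # '#' is outside every character class
```

In this pattern the range `$-_` starts just after `#` in ASCII, which would explain the behaviour if the validator requires the whole string to match.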

cgrigis, Mar 02 '21 16:03

@crdoconnor Any chance to have a quick look at this? Thank you!

cgrigis, May 31 '21 14:05

Yes, I'll take a look tonight.

crdoconnor, May 31 '21 14:05

Thanks! :)

cgrigis, May 31 '21 14:05

Hello @crdoconnor , any news on this?

I am not sure that https://urlregex.com/ is a reliable source for URL regexes. The one for Python handles neither fragments (as per this issue) nor tildes (as in https://example.com/~johndoe/index.html). Compared to the ones given for e.g. Perl or PHP, it is clearly not equivalent and is more restrictive.

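Perhaps using https://pypi.org/project/validators/ (which also relies on a regex) or the urllib.parse module (with its limitations), as suggested here: https://stackoverflow.com/questions/827557/how-do-you-validate-a-url-with-a-regular-expression-in-python ?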
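For reference, a rough sketch of what the urllib.parse route might look like. `looks_like_url` and its `schemes` parameter are hypothetical names, not part of strictyaml, and the check is deliberately loose (it only requires an allowed scheme, a host, and no whitespace), so it illustrates the idea rather than a complete validator:

```python
from urllib.parse import urlsplit


def looks_like_url(value, schemes=("http", "https")):
    """Loose check: allowed scheme, non-empty host, no whitespace."""
    # urlsplit is permissive about spaces, so reject them up front.
    if any(ch.isspace() for ch in value):
        return False
    try:
        parts = urlsplit(value)
    except ValueError:
        # Raised for some malformed inputs, e.g. an unbalanced IPv6 bracket.
        return False
    return parts.scheme in schemes and bool(parts.hostname)


for candidate in (
    "https://example.com/bla",
    "https://example.com/bla#header",
    "https://example.com/~johndoe/index.html",
    "not a url",
):
    print(candidate, looks_like_url(candidate))
```

Because urlsplit treats the fragment and path as separate components, both the `#header` and `~johndoe` cases pass here; the trade-off is that many strings a stricter validator would reject are also accepted.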

cgrigis, Feb 23 '22 15:02

Hi,

Sorry I haven't been very active on this; I've had a lot on recently.

That sounds like a good idea though.

crdoconnor, Feb 24 '22 11:02