linkify
looseUrl option identifies text with multiple periods as a url
Issue:
Currently, when using linkify with the looseUrl option set to true, text matching the following patterns is identified as a URL:
pattern1 -> 'awdaw....aw'
pattern2 -> 'awdaw...wad...wadw'
and so on...
Expected behaviour:
Technically, these shouldn't be identified as URLs: they contain multiple consecutive periods, which makes them invalid URL patterns.
I traced this issue to looseUrlRegex. It arises from including '.' in the character class shown below, which allows consecutive periods to match. Removing '.' from this class resolves the issue.
[-a-zA-Z0-9@:%._\+~#=]{2,256}
Complete looseUrlRegex
r'''^(.*?)((https?:\/\/)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,4}\b([-a-zA-Z0-9@:%_\+.~#?&//="'`]*))'''
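
To make the behaviour easy to reproduce outside the package, here is a minimal sketch that runs the pattern above through Python's `re` module, next to a variant with '.' removed from the first character class. It is only an illustration: the names `LOOSE_URL`, `LOOSE_URL_FIXED` and `matched_url` are made up for this example, the port to Python is an assumption (the syntax used here happens to be accepted unchanged), and any flags the package passes to its own RegExp are not reproduced.

```python
import re

# Pattern copied verbatim from the issue (no flags applied; that is an assumption).
LOOSE_URL = re.compile(
    r'''^(.*?)((https?:\/\/)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,4}\b([-a-zA-Z0-9@:%_\+.~#?&//="'`]*))'''
)

# Proposed change: the same pattern with '.' dropped from the first character class.
LOOSE_URL_FIXED = re.compile(
    r'''^(.*?)((https?:\/\/)?(www\.)?[-a-zA-Z0-9@:%_\+~#=]{2,256}\.[a-z]{2,4}\b([-a-zA-Z0-9@:%_\+.~#?&//="'`]*))'''
)


def matched_url(pattern, text):
    """Return the text captured as the URL (group 2), or None if nothing matches."""
    m = pattern.match(text)
    return m.group(2) if m else None


# Hypothetical sample inputs: the two patterns from the issue plus one ordinary domain.
for sample in ["awdaw....aw", "awdaw...wad...wadw", "example.com"]:
    print(sample, "|", matched_url(LOOSE_URL, sample), "|", matched_url(LOOSE_URL_FIXED, sample))
```

In this sketch the original pattern captures both reported strings in full, while the modified one rejects them and still accepts `example.com`. One thing it also surfaces: with '.' removed, a dotted host such as `sub.example.com` still matches, but only `example.com` ends up in the URL group, so the change may be worth checking against subdomains.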