looseUrl option identifies text with multiple periods as a url

Open rutvik110 opened this issue 1 year ago • 1 comments

Issue:

Currently, when the looseUrl option is true, linkify identifies the following text patterns as URLs:

pattern1 -> 'awdaw....aw'
pattern2 -> 'awdaw...wad...wadw'
and so on...

Expected behaviour:

These patterns shouldn't be identified as URLs: they contain consecutive periods, which makes them invalid URL patterns.
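The behaviour can be reproduced outside the library. Below is a minimal sketch in Python rather than Dart, assuming the looseUrlRegex quoted in this thread is applied as written with default regex flags (the library may compile it with different options). The helper name loose_url_match is hypothetical and only used for illustration.

```python
import re

# Python port of the looseUrlRegex quoted in this thread.
# Assumption: default flags; the library may compile its pattern differently.
LOOSE_URL = re.compile(
    r'''^(.*?)((https?:\/\/)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,4}\b([-a-zA-Z0-9@:%_\+.~#?&//="'`]*))'''
)


def loose_url_match(text):
    """Return the substring the pattern treats as a URL (group 2), or None."""
    match = LOOSE_URL.match(text)
    return match.group(2) if match else None


# Both samples from the report are captured in full as "URLs".
for sample in ["awdaw....aw", "awdaw...wad...wadw"]:
    print(repr(sample), "->", repr(loose_url_match(sample)))
```

In this port, both samples come back unchanged as the matched URL portion, which is consistent with the report.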

rutvik110 avatar Jan 14 '24 11:01 rutvik110

I tracked this issue to looseUrlRegex. It arises from including '.' in the character class shown below, which lets consecutive periods match. Removing '.' from this section resolves the issue.

[-a-zA-Z0-9@:%._\+~#=]{2,256}

Complete looseUrlRegex:

r'''^(.*?)((https?:\/\/)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,4}\b([-a-zA-Z0-9@:%_\+.~#?&//="'`]*))'''

rutvik110 avatar Jan 14 '24 11:01 rutvik110
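The proposed change can be compared against the current pattern with a small sketch. This is a Python port rather than the library's Dart code, it assumes default regex flags, and the names build_loose_url_regex and url_part are hypothetical helpers introduced only for this comparison.

```python
import re


def build_loose_url_regex(host_chars):
    """Build the quoted pattern with a swappable first character class."""
    return re.compile(
        r'''^(.*?)((https?:\/\/)?(www\.)?['''
        + host_chars
        + r''']{2,256}\.[a-z]{2,4}\b([-a-zA-Z0-9@:%_\+.~#?&//="'`]*))'''
    )


# Character class as quoted in the thread, and the same class with '.' removed.
current = build_loose_url_regex(r"-a-zA-Z0-9@:%._\+~#=")
proposed = build_loose_url_regex(r"-a-zA-Z0-9@:%_\+~#=")


def url_part(regex, text):
    """Return the substring treated as a URL (group 2), or None."""
    match = regex.match(text)
    return match.group(2) if match else None


samples = ["awdaw....aw", "awdaw...wad...wadw", "example.com", "sub.example.com"]
for sample in samples:
    print(repr(sample), "current:", url_part(current, sample),
          "proposed:", url_part(proposed, sample))
```

In this port, the narrowed class no longer matches either sample from the report and still matches example.com. It also appears to split dotted hosts: for sub.example.com only example.com is captured, so the change may be worth checking against subdomain inputs before it is adopted.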