Exclude non-url characters in text auto links

Open mattfbacon opened this issue 6 months ago • 3 comments

I sometimes see that in cases like "(visit my website at https://example.com)" the closing parenthesis gets included in the link. Markdown autolinks don't have this issue, and I think their algorithm is popular enough that most people find it intuitive.

This is the code in question:

https://github.com/iv-org/invidious/blob/81ca8314396524e9a51901a70dfb86b99d6c7cf6/src/invidious/comments/content.cr#L12-L27

mattfbacon avatar May 13 '25 20:05 mattfbacon

Yeah, it is not really a duplicate. We don't need full Markdown; the suggestion here is just to include fewer characters in links in some cases.

mattfbacon avatar May 14 '25 06:05 mattfbacon

For reference, here's what the CommonMark spec says:

https://spec.commonmark.org/0.31.2/#autolink

A URI autolink consists of <, followed by an absolute URI followed by >. It is parsed as a link to the URI, with the URI as the link’s label.

An absolute URI, for these purposes, consists of a scheme followed by a colon (:) followed by zero or more characters other than ASCII control characters, space, <, and >. If the URI includes these characters, they must be percent-encoded (e.g. %20 for a space).

For purposes of this spec, a scheme is any sequence of 2–32 characters beginning with an ASCII letter and followed by any combination of ASCII letters, digits, or the symbols plus (“+”), period (“.”), or hyphen (“-”).

Implementing this should only require making the regex less permissive.

I don't think we really need to follow the Markdown specification; we just need to ensure that we're properly excluding non-URL characters.
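
To make the quoted definition concrete, here is a minimal sketch of a regex that follows the CommonMark "absolute URI" wording above. It is written in Python purely for illustration (Invidious itself is written in Crystal), and the names `ABSOLUTE_URI` and `is_absolute_uri` are hypothetical, not taken from the Invidious code.

```python
import re

# Scheme: 2-32 characters, starting with an ASCII letter, followed by
# ASCII letters, digits, "+", ".", or "-".
# Body: zero or more characters other than ASCII control characters,
# space, "<", and ">".
ABSOLUTE_URI = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]{1,31}:[^\x00-\x20\x7f<>]*")


def is_absolute_uri(text: str) -> bool:
    """Return True if the whole string matches the quoted definition."""
    return ABSOLUTE_URI.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["https://example.com", "https://example.com/a b", "https://example.com)"]:
        print(sample, is_absolute_uri(sample))
```

Note that a closing parenthesis is an allowed character under this definition, so `https://example.com)` still matches. The angle brackets are what delimit the link in that syntax, which is why this rule alone would not address the case from the original report.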

syeopite avatar May 15 '25 16:05 syeopite

I was not clear: I am not referring to links in angle brackets but to automatically detected links, such as described here: https://github.com/mattcone/markdown-guide/blob/master/_extended-syntax/automatic-url-linking.md. I would expect the rules to be a bit different. For example, I wrote a period after the link that I just pasted here and it was smartly not included, but that does not mean that period characters are not valid in the URL, as shown by the period in .md.
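
The behavior described above could be sketched roughly as follows. This is a simplified illustration loosely modeled on the trailing-punctuation and parenthesis rules of GitHub Flavored Markdown's extended autolinks, not the Invidious implementation. It is in Python only for illustration, and `CANDIDATE`, `trim_trailing`, and `find_links` are hypothetical names. It assumes only `http`/`https` links and ignores other rules such as entity handling.

```python
import re
from typing import List

# Candidate match: an http(s) scheme followed by a run of characters
# that are not whitespace or angle brackets (deliberately permissive).
CANDIDATE = re.compile(r"https?://[^\s<>]+")

# Punctuation that is dropped when it ends the link but kept in the interior.
TRAILING_PUNCTUATION = "?!.,:*_~"


def trim_trailing(url: str) -> str:
    """Strip trailing punctuation and unmatched closing parentheses."""
    while url:
        last = url[-1]
        if last in TRAILING_PUNCTUATION:
            url = url[:-1]
        elif last == ")" and url.count(")") > url.count("("):
            # More closing than opening parentheses: the last one is
            # assumed to belong to the surrounding text, not the URL.
            url = url[:-1]
        else:
            break
    return url


def find_links(text: str) -> List[str]:
    """Return the trimmed URLs detected in a piece of plain text."""
    return [trim_trailing(m.group(0)) for m in CANDIDATE.finditer(text)]


if __name__ == "__main__":
    print(find_links("(visit my website at https://example.com)"))
```

With this approach the period in `.md` survives because only the end of the candidate is trimmed, and a URL with balanced parentheses, such as a Wikipedia title ending in `(programming_language)`, keeps its closing parenthesis.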

mattfbacon avatar May 15 '25 21:05 mattfbacon