
Non-link footnotes in markdown are reported as broken links.

Open jan-ferdinand opened this issue 10 months ago • 9 comments

Consider the following file /tmp/file.md:

Some[^1] text.

[^1]: short

Running lychee . produces the following error:

[./file.md]:
✗ [ERR] file:///tmp/short | Failed: Cannot find file

To the best of my knowledge, short is never a link but always a footnote. It would be nice for these presumed false positives to not occur.

jan-ferdinand avatar Apr 22 '24 18:04 jan-ferdinand

Oh wow, thanks for reporting. Definitely a bug. If you have the time, you could add a (failing) unit test here: https://github.com/lycheeverse/lychee/blob/master/lychee-lib/src/extract/markdown.rs You can use your example Markdown document for the test. The bug is somewhere in https://github.com/lycheeverse/lychee/blob/f2b1c29bd4e850b57dfa514d89cf4c3df9510dea/lychee-lib/src/extract/markdown.rs#L11

mre avatar Apr 23 '24 09:04 mre

I've added a test in #1410. Unfortunately, I can't figure out where extraction of markdown links is happening.

jan-ferdinand avatar Apr 23 '24 12:04 jan-ferdinand

Thanks. :)

mre avatar Apr 26 '24 10:04 mre

You're more than welcome. :blush: I'd have liked to give resolution a shot, but couldn't identify where to start. It's probably somewhere in the parser? If so, that's probably more than I can chew right now.

jan-ferdinand avatar Apr 26 '24 14:04 jan-ferdinand

Feel free to dive in. But yeah, it's in the markdown parser in pulldown_cmark, which is the crate we're using for it.

mre avatar Apr 26 '24 18:04 mre

I had a closer look at this. Actually, this is not a bug in lychee. We use pulldown_cmark to parse Markdown, so we treat all Markdown as CommonMark. CommonMark does not specify footnotes, so there really are no footnotes. Instead, there are shortcut reference links, and the "footnote" in your example is treated as a shortcut reference link.

So the example you provide:

Some[^1] text.

[^1]: short

is understood as a shortcut reference link and is therefore converted into the following HTML:

<p>Some<a href="short">^1</a> text.</p>

When the definition does not contain a valid (relative) link, it is not a shortcut reference link, and the text is simply understood as normal paragraphs:

Some[^1] text.

[^1]: multiple words

now becomes:

<p>Some[^1] text.</p>
<p>[^1]: multiple words</p>
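The difference between the two examples comes down to whether the second line parses as a link reference definition. Here is a minimal Python sketch of that rule, assuming a heavily simplified version of it: a single line, a bare destination without whitespace, and no support for titles or <...> destinations. The names LINK_REF_DEF and parse_link_reference_definition are made up for illustration and are not part of pulldown_cmark or lychee.

```python
import re

# Simplified pattern for a single-line link reference definition:
# up to three spaces of indentation, a bracketed label, a colon, and a
# bare destination containing no whitespace. Titles and <...> destinations
# are deliberately not handled in this sketch.
LINK_REF_DEF = re.compile(
    r"^ {0,3}\[(?P<label>[^\[\]]+)\]:[ \t]*(?P<dest>\S+)[ \t]*$"
)


def parse_link_reference_definition(line):
    """Return (label, destination) if the line looks like a definition, else None."""
    match = LINK_REF_DEF.match(line)
    if match is None:
        return None
    return match.group("label"), match.group("dest")


print(parse_link_reference_definition("[^1]: short"))           # ('^1', 'short')
print(parse_link_reference_definition("[^1]: multiple words"))  # None
```

With "short", the line is a definition whose destination is the relative path short, which is why the checker ends up looking for file:///tmp/short. With "multiple words", the second word is not a valid title, so the line stays plain text.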

So footnotes are part of neither CommonMark nor GitHub Flavored Markdown (the only Markdown specifications I know of), but some people might still be using them because many unspecified flavours support them (the beauty of the Markdown flavour swamp).

So one thing we could do is treat the destination of these shortcut reference links not as a URL but as plain text (extract_plaintext), and extract URLs from that text. This would reduce the false positive rate when people check Markdown that is not CommonMark compliant, which is probably the vast majority. @mre what do you think?
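To make the idea concrete, here is a rough Python sketch, not lychee's code and not necessarily how the fix ended up being implemented. It assumes footnote definitions sit on a single line starting with [^label]: and that "plain text extraction" simply means scanning the rest of that line for http(s) URLs. FOOTNOTE_DEF, URL and urls_in_footnotes are hypothetical names.

```python
import re

# Hypothetical patterns: a footnote-style definition line and a bare http(s) URL.
FOOTNOTE_DEF = re.compile(r"^ {0,3}\[\^(?P<label>[^\]\s]+)\]:[ \t]*(?P<body>.*)$")
URL = re.compile(r"https?://[^\s<>)\]]+")


def urls_in_footnotes(markdown):
    """Collect URLs found inside footnote bodies; bare words are ignored."""
    found = []
    for line in markdown.splitlines():
        match = FOOTNOTE_DEF.match(line)
        if match:
            # Treat the footnote body as plain text rather than as a link target.
            found.extend(URL.findall(match.group("body")))
    return found


print(urls_in_footnotes("Some[^1] text.\n\n[^1]: short"))
# []
print(urls_in_footnotes("Some[^1] text.\n\n[^1]: see https://example.com/docs"))
# ['https://example.com/docs']
```

Under these assumptions, the footnote from the original report yields no links to check, while a footnote that actually contains a URL still gets that URL checked.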

thomas-zahner avatar Jun 14 '24 10:06 thomas-zahner

Interesting! Thanks for digging and explaining what's going on.

footnotes are [not part of] GitHub Flavored Markdown […]

I disagree: :relieved:

You can add footnotes to your content by using this bracket syntax:

Here is a simple footnote[^1].

[^1]: My reference.

Edit: Quote linked documentation directly.

jan-ferdinand avatar Jun 14 '24 10:06 jan-ferdinand

No problem :+1:

I disagree 😌

Wait... I stated that because I could not find anything related to footnotes in their official spec, but the documentation you linked goes on to explain how to use footnotes. So I guess they are not even adhering to their own spec, or are they referencing some other flavour there? :exploding_head:

thomas-zahner avatar Jun 14 '24 10:06 thomas-zahner

The markdown swamp gets swampier the further you go. :cloud_with_rain: The strongest indication of them going against their own spec that I can see is: this[^1] works.

jan-ferdinand avatar Jun 14 '24 11:06 jan-ferdinand

The linked PR has a proposed solution now. @jan-ferdinand fyi.

mre avatar Aug 06 '24 15:08 mre

This is now fixed in master and will be released with the next version. Thanks @jan-ferdinand for the test and for opening the issue.

mre avatar Aug 11 '24 10:08 mre