lychee
Non-link footnotes in markdown are reported as broken links.
Consider the following file, /tmp/file.md:
Some[^1] text.
[^1]: short
Running lychee . produces the following error:
[./file.md]:
✗ [ERR] file:///tmp/short | Failed: Cannot find file
To the best of my knowledge, short is never a link but always a footnote. It would be nice if these presumed false positives did not occur.
Oh wow, thanks for reporting. Definitely a bug. If you have the time, you could add a (failing) unit test here: https://github.com/lycheeverse/lychee/blob/master/lychee-lib/src/extract/markdown.rs You can use your example Markdown document for the test. The bug is somewhere in https://github.com/lycheeverse/lychee/blob/f2b1c29bd4e850b57dfa514d89cf4c3df9510dea/lychee-lib/src/extract/markdown.rs#L11
I've added a test in #1410. Unfortunately, I can't figure out where extraction of markdown links is happening.
Thanks. :)
You're more than welcome. :blush: I'd have liked to take a shot at fixing it, but couldn't identify where to start. It's probably somewhere in the parser? If so, that's probably more than I can chew on right now.
Feel free to dive in. But yeah, it's in the markdown parser in pulldown_cmark, which is the crate we're using for it.
I had a closer look at this. Actually this is not a bug in lychee.
We use pulldown_cmark to parse Markdown, so we are treating all Markdown as CommonMark. CommonMark does not specify footnotes, so there really are no footnotes. Instead, there are shortcut reference links, and the "footnote" in your example is treated as a shortcut reference link.
So the example you provide:
Some[^1] text.
[^1]: short
is understood as a shortcut reference link and therefore converted into the following HTML:
<p>Some<a href="short">^1</a> text.</p>
When the definition does not contain a valid (relative) link, it is not a shortcut reference link, and the text is simply rendered as normal paragraphs:
Some[^1] text.
[^1]: multiple words
<p>Some[^1] text.</p>
<p>[^1]: multiple words</p>
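The difference between the two cases can be approximated in a few lines. The following is a rough sketch in plain Rust, not pulldown_cmark's actual logic: the hypothetical helper `looks_like_reference_definition` only checks for the `[label]: destination` shape with a single-token destination. It ignores titles, angle-bracket destinations, indentation limits, and the other rules of the CommonMark spec.

```rust
/// Returns the (label, destination) pair if `line` has the simplified shape
/// `[label]: destination`, where the destination is a single token.
///
/// Hypothetical helper for illustration only. Real CommonMark parsing also
/// handles titles, `<...>` destinations, escapes, and multi-line definitions.
fn looks_like_reference_definition(line: &str) -> Option<(&str, &str)> {
    // The line must start with an opening bracket (after leading whitespace).
    let rest = line.trim_start().strip_prefix('[')?;
    // Split at the first `]:`, which ends the label.
    let (label, after) = rest.split_once("]:")?;
    if label.trim().is_empty() || label.contains('[') || label.contains(']') {
        return None;
    }
    let mut tokens = after.split_whitespace();
    let destination = tokens.next()?;
    // A second token would have to be a quoted title. Titles are not modelled
    // here, so anything after the destination disqualifies the line.
    if tokens.next().is_some() {
        return None;
    }
    Some((label, destination))
}
```

Under these assumptions, `[^1]: short` is accepted with label `^1` and destination `short`, while `[^1]: multiple words` is rejected and stays ordinary text.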
So footnotes are part of neither CommonMark nor GitHub Flavored Markdown (the only Markdown specifications I know of), but some people might still be using them because many unspecified flavours support them (the beauty of the Markdown flavour swamp).
So one thing we could do is to treat the link of these shortcut-type links not as a URL but as plain text (extract_plaintext), from which we then extract the URLs. This would reduce the false positive rate when people are checking Markdown that is not CommonMark compliant, which is probably the vast majority. @mre what do you think?
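To make the idea concrete, here is a minimal sketch in plain Rust. It is not lychee's code: `links_to_check` is a hypothetical function, the `^` prefix check is an assumed way of recognising footnote-style labels, and the `http://`/`https://` filter is only a stand-in for whatever extract_plaintext really does.

```rust
/// Decides which strings from a reference definition should be checked as links.
///
/// Hypothetical sketch: `label` is the text between the brackets and
/// `destination` is the text after the colon.
fn links_to_check(label: &str, destination: &str) -> Vec<String> {
    if label.starts_with('^') {
        // Footnote-style label: treat the destination as plain text and keep
        // only the words that look like absolute URLs.
        destination
            .split_whitespace()
            .filter(|word| word.starts_with("http://") || word.starts_with("https://"))
            .map(String::from)
            .collect()
    } else {
        // Ordinary reference definition: the destination itself is the link.
        vec![destination.to_string()]
    }
}
```

With this sketch, `links_to_check("^1", "short")` yields no links, so the footnote from the original report would no longer be flagged, while a URL written inside a footnote would still be checked.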
Interesting! Thanks for digging and explaining what's going on.
footnotes are [not part of] GitHub Flavored Markdown […]
I disagree: :relieved:
You can add footnotes to your content by using this bracket syntax:
Here is a simple footnote[^1].
[^1]: My reference.
Edit: Quote linked documentation directly.
No problem :+1:
I disagree 😌
Wait... I stated that because I could not find anything related to footnotes in their official spec, but the documentation you linked goes on to explain how to use footnotes. So I guess they are not even adhering to their own spec, or are referencing some other flavour there? :exploding_head:
The markdown swamp gets swampier the further you go. :cloud_with_rain: The strongest indication of them going against their own spec that I can see is: this[^1] works.
The linked PR has a proposed solution now. @jan-ferdinand fyi.
This is now fixed in master and will be released with the next version.
Thanks @jan-ferdinand for the test and for opening the issue.