commonmark-java

Footnotes extension

Open robinst opened this issue 7 months ago • 1 comments

This adds a new extension commonmark-ext-footnotes (class org.commonmark.ext.footnotes.FootnotesExtension) to implement footnotes syntax as in GitHub Flavored Markdown (see docs). Fixes #273.

An example:

Some text with a footnote[^1].

[^1]: The text of the footnote.

The [^1] is parsed as a FootnoteReference, with 1 being the label. The line with [^1]: ... is a FootnoteDefinition, with the contents as child nodes (can be a paragraph like in the example, or other blocks like lists).

Apart from the parsing, the extension also comes with rendering of footnotes for HTML and Markdown.

Extension mechanisms

In order to implement this as a separate extension, the following APIs were added to commonmark core:

  • DefinitionMap: New class for storing and looking up definitions by a label, with label normalization as for link reference definitions
  • BlockParser: New method getDefinitions that can be implemented to return definitions that can later be accessed during inline parsing (the built-in ParagraphParser also uses that mechanism now; previously it was a special case in the parser)
  • LinkProcessor: New interface that can be implemented to customize link/image processing. This is used to turn [^1] into FootnoteReference nodes.
  • NodeRenderer: New methods beforeRoot and afterRoot that are called before/after rendering a document; used to render footnotes at the end of the document
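
To illustrate the label lookup idea behind DefinitionMap, here is a minimal, self-contained sketch. It is not the library class: `LabelMapSketch`, `normalize`, `putIfAbsent` and `get` are hypothetical names chosen for this example. It assumes normalization means trimming, collapsing internal whitespace to a single space, and case-insensitive matching (approximated here with `toLowerCase`, which is not full Unicode case folding), and that the first definition for a label wins, as the CommonMark spec states for link reference definitions.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for the idea behind DefinitionMap; NOT the commonmark-java class. */
final class LabelMapSketch<D> {
    // Keys are normalized labels; insertion order is kept for predictable iteration.
    private final Map<String, D> definitions = new LinkedHashMap<>();

    /** Trim, collapse whitespace runs to one space, lower-case (an approximation of case folding). */
    static String normalize(String label) {
        return label.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    /** Stores the definition unless the label is already taken; returns the existing one, or null. */
    D putIfAbsent(String label, D definition) {
        return definitions.putIfAbsent(normalize(label), definition);
    }

    /** Looks up a definition by label, using the same normalization as when storing. */
    D get(String label) {
        return definitions.get(normalize(label));
    }

    public static void main(String[] args) {
        LabelMapSketch<String> footnotes = new LabelMapSketch<>();
        footnotes.putIfAbsent("Note  One", "first definition");
        footnotes.putIfAbsent("note one", "ignored duplicate");
        System.out.println(footnotes.get(" NOTE ONE ")); // prints "first definition"
    }
}
```

Normalizing on both store and lookup is what lets block parsing and inline parsing agree on a label without sharing any other state.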

Alternatives considered

PostProcessor

Could footnote reference parsing have been implemented as a PostProcessor step after inline parsing? No, because a footnote reference like [^*foo*] would have been turned into emphasis by inline parsing, whereas footnote parsing needs the raw *foo* as a label.

InlineContentParser

I considered using the recently-added inline parsing customization API, using [ as the trigger character. That would work for simple cases, but not for others. E.g. in this:

[^foo](/url)

[^foo]: note

That is not a footnote followed by (/url), but instead it's an inline link. In other words, if parsing as a link is possible, that is preferred.

That means our custom inline parser for [ would have to be able to parse the full link syntax in order to give preference to links, which is quite tricky. In addition to that, it would have to trigger on !, for a footnote like ![^foo], which normally would be parsed as an image node.

So that's what LinkProcessor solves: It keeps the tricky link parsing in the inline parser, but allows extensions to decide to treat certain things not as links, but different types of nodes, or maybe even parse things that come after a link (e.g. image attributes could be implemented on top of this).
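
The precedence rule described above can be sketched as a small decision function. This is a simplified illustration only, not the LinkProcessor interface: `BracketDecisionSketch`, `Kind` and `classify` are hypothetical names, the inline parser is assumed to have already determined whether a valid `(/url)` part follows the closing bracket, reference-style links are ignored, and it is an assumption of this sketch that a `[^label]` without a matching definition stays plain text.

```java
import java.util.Set;

/** Hypothetical illustration of "links win over footnotes"; NOT part of commonmark-java. */
final class BracketDecisionSketch {
    enum Kind { LINK, FOOTNOTE_REFERENCE, PLAIN_TEXT }

    /**
     * @param bracketText          raw, unparsed text between [ and ], e.g. "^foo" or "^*foo*"
     * @param hasInlineDestination whether a valid "(/url)" directly follows the closing bracket
     * @param footnoteLabels       labels of footnote definitions collected during block parsing
     */
    static Kind classify(String bracketText, boolean hasInlineDestination, Set<String> footnoteLabels) {
        // If parsing as a link is possible, that is preferred.
        if (hasInlineDestination) {
            return Kind.LINK;
        }
        // Otherwise a leading ^ plus a known label makes it a footnote reference.
        if (bracketText.startsWith("^") && footnoteLabels.contains(bracketText.substring(1))) {
            return Kind.FOOTNOTE_REFERENCE;
        }
        // Assumption of this sketch: anything else is left as ordinary text.
        return Kind.PLAIN_TEXT;
    }

    public static void main(String[] args) {
        Set<String> labels = Set.of("foo");
        System.out.println(classify("^foo", true, labels));  // LINK
        System.out.println(classify("^foo", false, labels)); // FOOTNOTE_REFERENCE
        System.out.println(classify("^bar", false, labels)); // PLAIN_TEXT
    }
}
```

Because the function receives the raw bracket text, a label such as `*foo*` is matched as written rather than after emphasis parsing.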

robinst, Jul 07 '24 13:07