InlineParserExtensions are not invoked inside StrongEmphasis and other delimiter-based types
Describe the bug
We would like to implement a few specific colors in our own Markdown-based language, for example: [green] foo [/green]
We would also like this to work in combination with everything else, such as StrongEmphasis and links, i.e.:
[Some partly [green] green [/green] text here](https://www.example.com)
** [green] foo [/green] **
But either InlineParserExtensions and delimiters do not compose both ways (which is what I gather from reading InlineParserImpl), or I am on the wrong track. I have included the parser code below. I can parse [green] **foo** [/green] just fine, but not the examples listed above.
Please provide as much information about where the bug is located or what you were using:
- [x] Parser
- [ ] HtmlRenderer
- [ ] Formatter
- [ ] FlexmarkHtmlParser
- [ ] DocxRenderer
- [ ] PdfConverterExtension
- [x] extension(s)
Sample code
It is in Kotlin, but I think you should manage to get the idea.
override fun create(inlineParser: InlineParser): InlineParserExtension {
    return object : InlineParserExtension {
        override fun finalizeDocument(inlineParser: InlineParser) {
        }

        override fun finalizeBlock(inlineParser: InlineParser) {
        }

        val pattern = Pattern.compile("\\[green](.*)\\[/green]")

        override fun parse(inlineParser: InlineParser): Boolean {
            val match = inlineParser.match(pattern)
            if (match != null) {
                val startIndex = "[green]".length
                val endIndex = match.length - "[/green]".length
                val nodeChars = match.subSequence(startIndex, endIndex)
                val node = GreenNode(nodeChars)
                inlineParser.appendNode(node)
                inlineParser.parse(nodeChars, node)
                return true
            } else {
                return false
            }
        }
    }
}
The issue is that your chosen syntax matches link reference elements. Link-related elements have a much more complex parsing mechanism because of all the possible variations of element syntax and the restrictions on nesting some elements.
In your case you have a link with the link text containing an embedded link ref. The standard parser generates the following AST:
Document[0, 98]
  Paragraph[0, 98]
    Link[0, 71] textOpen:[0, 1, "["] text:[1, 45, "Some partly [green] green [/green] text here"] textClose:[45, 46, "]"] linkOpen:[46, 47, "("] url:[47, 70, "https://www.example.com"] pageRef:[47, 70, "https://www.example.com"] linkClose:[70, 71, ")"]
      Text[1, 45] chars:[1, 45, "Some … here"]
    SoftLineBreak[71, 72]
    Text[72, 75] chars:[72, 75, "** "]
    LinkRef[75, 82] referenceOpen:[75, 76, "["] reference:[76, 81, "green"] referenceClose:[81, 82, "]"]
      Text[76, 81] chars:[76, 81, "green"]
    Text[82, 87] chars:[82, 87, " foo "]
    LinkRef[87, 95] referenceOpen:[87, 88, "["] reference:[88, 94, "/green"] referenceClose:[94, 95, "]"]
      Text[88, 94] chars:[88, 94, "/green"]
    Text[95, 98] chars:[95, 98, " **"]
The easiest way to implement what you want is to use a post processor and replace the link ref nodes with custom nodes for your syntax elements.
There is a sample (TokenReplacingPostProcessorSample.java) that does this for link and image elements and replaces them with the text of the link/alt-text. In your case you will want to insert your custom emphasis element with the text between [color] and [/color] as a child text node.
The post processor approach has limitations because the markdown is already parsed by the time it runs. For example, in [link ref [green] green [/green]] the outer link ref will not be parsed, since link refs within link refs are not allowed.
If you can live with these limitations then it is the easiest approach.
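To make the pairing step concrete, here is a minimal sketch in plain Java. It does not use flexmark's classes: the `Node` type, the `wrapGreen` and `render` helpers, and the `<span class="green">` output are all hypothetical stand-ins chosen for illustration. A real post processor would do the equivalent on flexmark's own node types, along the lines of the sample mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these are NOT flexmark classes.
final class ColorTagPostProcessSketch {
    /** Hypothetical inline node: kind is "Text", "LinkRef" or "Green". */
    static final class Node {
        final String kind;
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** Replaces LinkRef("green") ... LinkRef("/green") runs with one Green node. */
    static List<Node> wrapGreen(List<Node> siblings) {
        List<Node> out = new ArrayList<>();
        Node green = null;   // Green node currently being filled, or null
        Node opener = null;  // the LinkRef that opened it
        for (Node n : siblings) {
            boolean isRef = n.kind.equals("LinkRef");
            if (green == null && isRef && n.text.equals("green")) {
                green = new Node("Green", "");
                opener = n;
            } else if (green != null && isRef && n.text.equals("/green")) {
                out.add(green);
                green = null;
            } else if (green != null) {
                green.children.add(n);
            } else {
                out.add(n);
            }
        }
        // Unclosed [green]: put the original nodes back instead of dropping them.
        if (green != null) {
            out.add(opener);
            out.addAll(green.children);
        }
        return out;
    }

    /** Renders the toy nodes; unmatched link refs are written back as [text]. */
    static String render(List<Node> nodes) {
        StringBuilder sb = new StringBuilder();
        for (Node n : nodes) {
            if (n.kind.equals("Green")) {
                sb.append("<span class=\"green\">").append(render(n.children)).append("</span>");
            } else if (n.kind.equals("LinkRef")) {
                sb.append('[').append(n.text).append(']');
            } else {
                sb.append(n.text);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Node> siblings = new ArrayList<>();
        siblings.add(new Node("LinkRef", "green"));
        siblings.add(new Node("Text", " foo "));
        siblings.add(new Node("LinkRef", "/green"));
        System.out.println(render(wrapGreen(siblings))); // <span class="green"> foo </span>
    }
}
```

Note that this only helps where the parser actually produced LinkRef nodes. In the AST above, the link text is a single Text node, so there is nothing to pair inside the link itself.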
However, unless your syntax is externally dictated, it is better to choose a syntax for custom elements that does not conflict with basic markdown elements. For example if you used <green>...</green> instead of the square brackets then it would just be a custom inline HTML tag and easier to handle without conflicts.
Thanks for the response. We have a hacky solution not too different from your suggestions, but we were looking for a cleaner one.
If you have time, could you elaborate on why it is no longer possible to recursively parse inline content? (Or was it moved?) It seems quite useful regardless of the problems we had here.
The issue with your ** [green] foo [/green] ** example is that each ** delimiter run has a space on its inner side, so CommonMark treats it as plain text and not as a delimiter.
In Markdown and CommonMark, an opening delimiter must not be followed by whitespace and a closing delimiter must not be preceded by whitespace.
[Some partly [green] green [/green] text here](https://www.example.com)
**[green] foo [/green]**
will generate the AST you expect:
Document[0, 96]
  Paragraph[0, 96]
    Link[0, 71] textOpen:[0, 1, "["] text:[1, 45, "Some partly [green] green [/green] text here"] textClose:[45, 46, "]"] linkOpen:[46, 47, "("] url:[47, 70, "https://www.example.com"] pageRef:[47, 70, "https://www.example.com"] linkClose:[70, 71, ")"]
      Text[1, 45] chars:[1, 45, "Some … here"]
    SoftLineBreak[71, 72]
    StrongEmphasis[72, 96] textOpen:[72, 74, "**"] text:[74, 94, "[green] foo [/green]"] textClose:[94, 96, "**"]
      LinkRef[74, 81] referenceOpen:[74, 75, "["] reference:[75, 80, "green"] referenceClose:[80, 81, "]"]
        Text[75, 80] chars:[75, 80, "green"]
      Text[81, 86] chars:[81, 86, " foo "]
      LinkRef[86, 94] referenceOpen:[86, 87, "["] reference:[87, 93, "/green"] referenceClose:[93, 94, "]"]
        Text[87, 93] chars:[87, 93, "/green"]
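The whitespace rule that decides between the two ASTs can be sketched in a few lines of Java. This is a simplification for illustration, not flexmark's or CommonMark's actual algorithm: it checks only the whitespace condition and ignores the additional punctuation rules in the spec. The class and method names are hypothetical.

```java
// Simplified check: whitespace condition only, punctuation rules omitted.
final class DelimiterFlankingSketch {
    /** A delimiter run ending at runEnd (exclusive) can open only if a non-whitespace char follows it. */
    static boolean canOpen(String s, int runEnd) {
        return runEnd < s.length() && !Character.isWhitespace(s.charAt(runEnd));
    }

    /** A delimiter run starting at runStart can close only if a non-whitespace char precedes it. */
    static boolean canClose(String s, int runStart) {
        return runStart > 0 && !Character.isWhitespace(s.charAt(runStart - 1));
    }

    public static void main(String[] args) {
        String spaced = "** [green] foo [/green] **";
        String tight = "**[green] foo [/green]**";

        // Leading "**" occupies [0, 2); trailing "**" starts at length - 2.
        System.out.println(canOpen(spaced, 2));                    // false
        System.out.println(canClose(spaced, spaced.length() - 2)); // false
        System.out.println(canOpen(tight, 2));                     // true
        System.out.println(canClose(tight, tight.length() - 2));   // true
    }
}
```

With the spaces, neither run qualifies as an opener or closer, which is why the first AST shows "** " and " **" as Text nodes. Without them, both qualify and the parser produces StrongEmphasis.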