Custom style for inline code blocks
I am trying to use the Custom Styles feature with pandoc 2.9.2 to produce a .docx document that matches a pre-existing set of styles. I supplied a --reference-doc containing those styles, but some of them (notably inline code blocks and inline code blocks in headers) are not matching correctly.
I tried the following in my Markdown:
### Test header with some [`code`]{custom-style="CustomHeadingCode"}
[Lorem ipsum dolor sit amet]{custom-style="CustomStyle0"},
consectetur adipiscing elit. Suspendisse non nisi ex. Ut non
sollicitudin est. Hello [`world`]{custom-style="CustomCode"}.
When outputting to Word, I can see CustomStyle0 being applied correctly, but both CustomHeadingCode and CustomCode seem to have been ignored, as the resulting style is the default source code one. I also attempted using --no-highlight in case syntax highlighting was conflicting with this feature.
How can I get Pandoc to apply my custom styles to inline code blocks and header elements?
I am also experiencing this same issue, but when converting from HTML to DocX - using the following HTML as an example:
<pre>
<span custom-style="Red Text">Red Text</span>
</pre>
I have created my "Red Text" character style within my reference document, and this works fine for text outside of code blocks. When I check the DocX, however, only the "Source Code" style is applied to my code block and the "Red Text" style inside the block is ignored. This issue was raised 5 years ago and I am now trying this on Pandoc 3.7. If I go into Word and manually apply the "Red Text" style to the text, then it applies. Does anyone know how I can get this to work with Pandoc?
@vittorioromeo -- there are a few things to unpack here. In the Markdown snippet
### Test header with some [`code`]{custom-style="CustomHeadingCode"}
you are creating a Span whose only content is a single Code element. The Code element doesn't have a style applied, but the Span does. You can verify this by viewing the corresponding AST via pandoc test.md -t native:
[ Header
    3
    ( "test-header-with-some-code" , [] , [] )
    [ Str "Test"
    , Space
    , Str "header"
    , Space
    , Str "with"
    , Space
    , Str "some"
    , Space
    , Span
        ( "" , [] , [ ( "custom-style" , "CustomHeadingCode" ) ] )
        [ Code ( "" , [] , [] ) "code" ]
    ]
]
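The same check can be scripted against pandoc's JSON output (`pandoc test.md -t json`). The following is a minimal sketch in Python, using only the standard library, that lists which element types carry a `custom-style` attribute. The names `ATTR_INDEX`, `iter_elements`, `custom_styles`, and `report` are made up for this illustration, and the table of attribute positions covers only a handful of element types.

```python
import json

# Position of the attribute triple [id, classes, key-value pairs] inside "c"
# for a few element types. Assumption: this table is deliberately incomplete.
ATTR_INDEX = {"Span": 0, "Code": 0, "Div": 0, "CodeBlock": 0, "Header": 1}


def iter_elements(node):
    """Yield every tagged element ({"t": ...}) of a pandoc JSON AST in document order."""
    if isinstance(node, dict):
        if "t" in node:
            yield node
        for value in node.values():
            yield from iter_elements(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_elements(item)


def custom_styles(doc):
    """Return (element type, style name) pairs for each custom-style attribute found."""
    found = []
    for element in iter_elements(doc):
        index = ATTR_INDEX.get(element["t"])
        if index is None:
            continue
        _ident, _classes, keyvals = element["c"][index]
        for key, value in keyvals:
            if key == "custom-style":
                found.append((element["t"], value))
    return found


def report(json_text):
    """Parse the text produced by `pandoc -t json` and list the style carriers."""
    return custom_styles(json.loads(json_text))
```

For the header above, this should report the style on the `Span` and nothing on the `Code` element, which matches the native output.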
If you want to apply an attribute to the Code element directly (rather than wrap the Code in a Span), you would have to write
### Test header with some `code`{custom-style="CustomHeadingCode"}
(See inline_code_attributes in the manual.) Then we get
[ Header
    3
    ( "test-header-with-some-code" , [] , [] )
    [ Str "Test"
    , Space
    , Str "header"
    , Space
    , Str "with"
    , Space
    , Str "some"
    , Space
    , Code
        ( "" , [] , [ ( "custom-style" , "CustomHeadingCode" ) ] )
        "code"
    ]
]
This, however, still won't apply your custom style in DOCX output because the DOCX writer only considers custom styles on Spans, Divs, and Tables (character styles for Spans, paragraph styles for Divs, and table styles for Tables). After all, the only property that identifies a Code element in DOCX output as Code is the application of the VerbatimChar character style -- if that style isn't used anymore, what makes it a Code element? Therefore, to apply your custom style, just use a Span:
### Test header with some [code]{custom-style="CustomHeadingCode"}
I suppose the behaviour of the DOCX writer could be changed so that defining a custom style on an inline or block element would take precedence over the "natural" style that would be used otherwise. The advantage of that behaviour compared to the current one is that the semantics of the element would be preserved when converting to formats other than DOCX (i.e. for DOCX a Code element would use a custom style, but when converting to e.g. HTML the element would still be a Code rather than a Span with attributes).
The disadvantage of this behaviour would be that round-trip conversions through DOCX would no longer work, as what was previously a Code element would turn into a Span when pandoc reads the created DOCX.
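Until the writer behaves that way, a similar effect can probably be had with a JSON filter that keeps `code`{custom-style="..."} in the source and rewrites it to a Span only when the target format is docx. The following is a minimal sketch in Python, not part of pandoc itself. The names `code_to_span` and `run_filter` are made up for this illustration, and the sketch relies on pandoc passing the output format as the first argument to JSON filters.

```python
import json


def code_to_span(node):
    """Recursively replace Code inlines that carry a custom-style attribute
    with a Span holding the same attributes and a single Str child."""
    if isinstance(node, list):
        return [code_to_span(item) for item in node]
    if isinstance(node, dict):
        if node.get("t") == "Code":
            attr, text = node["c"]
            if any(key == "custom-style" for key, _value in attr[2]):
                # The code text becomes one Str, so inner spaces stay in that Str.
                return {"t": "Span", "c": [attr, [{"t": "Str", "c": text}]]}
            return node
        return {key: code_to_span(value) for key, value in node.items()}
    return node


def run_filter(in_stream, out_stream, target_format):
    """Read a JSON AST, transform it for docx only, and write the JSON back."""
    doc = json.load(in_stream)
    if target_format == "docx":
        doc = code_to_span(doc)
    json.dump(doc, out_stream)

# A script used with `pandoc --filter` would end with something like:
#     run_filter(sys.stdin, sys.stdout, sys.argv[1] if len(sys.argv) > 1 else "")
```

With other output formats the Code element passes through untouched, so HTML output would still get a code element. The round-trip caveat above still applies to the DOCX that is produced.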
@jgm, any thoughts on this? If we stick with the current behaviour this issue can be closed.
How would that work if using native HTML CSS attributes? Take the following for example:
<pre><span custom-style="Red Text">Red Text</span></pre>
<span custom-style="Red Text">Red Text</span>
If I look at the native output then we get this:
[ CodeBlock ( "" , [] , [] ) "Red Text"
, Para
    [ Span
        ( "" , [] , [ ( "custom-style" , "Red Text" ) ] )
        [ Str "Red" , Space , Str "Text" ]
    ]
]
The CodeBlock doesn't have any reference to the custom style?
If I look at the XML within document.xml then I can see it has used:
<w:p><w:pPr><w:pStyle w:val="SourceCode" /></w:pPr><w:r><w:rPr><w:rStyle w:val="VerbatimChar" /></w:rPr><w:t xml:space="preserve">Red Text</w:t></w:r></w:p>
If I change VerbatimChar to my custom style of RedText and zip it back up again, then it has the desired result, but how do I get this behaviour using Pandoc?
I have noticed that if I specify syntax highlighting as a pre class (i.e. "bash") then Pandoc automatically updates those inner styles to style the contents of a code block.
@cmason3 -- In pandoc's AST, CodeBlocks only contain attributes and the actual code text (see here). The CodeBlock AST element does not permit arbitrary inline content, only plain text, which is why your <span> is lost during conversion. Naturally, information that is not present in pandoc's AST will not be available to the DOCX writer. Therefore, what you are trying to do (preserving syntax highlighting colours from HTML when converting to DOCX) cannot be done.
However, pandoc can perform its own syntax highlighting. If you want your DOCX code blocks to include syntax highlighting when converting from HTML to DOCX, make sure that the <pre> element in your HTML has a class with the name of the language of the code block (e.g. <pre class="bash">). Then your DOCX should include syntax highlighting.
For questions about pandoc, the discussion forum is the best place.
OK, understood. Is it possible to write my own custom syntax highlighting parser as opposed to using one of the built-in ones? I am currently contemplating hacking the resulting XML, which I don't want to do.
All I want to do is be able to style code within my blocks as either red text, blue text, green text or highlight the text with a yellow background.
@cmason3 -- you could try using a filter. Please post any further questions in the discussion forum.
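One way such a filter might look, as a rough sketch rather than a tested recipe: since a CodeBlock only holds plain text, the styling has to be encoded in that text. The sketch below assumes a made-up marker syntax `{{Style Name|text}}` inside code blocks that carry a hypothetical class `styled`. It turns each such block into a Div with a paragraph custom style, containing Spans with character custom styles. The default style names "Source Code" and "Verbatim Char" are assumptions about the reference document, and all function names are invented for this illustration.

```python
import re

# Hypothetical marker syntax: {{Style Name|text}}
MARKER = re.compile(r"\{\{([^|{}]+)\|([^{}]*)\}\}")


def span(style, text):
    """Build a Span with a custom-style attribute around a single Str."""
    return {"t": "Span",
            "c": [["", [], [["custom-style", style]]], [{"t": "Str", "c": text}]]}


def style_line(line, default_style):
    """Split one line of code into Spans: marked runs get their own style,
    everything else gets the default character style."""
    inlines, pos = [], 0
    for match in MARKER.finditer(line):
        if match.start() > pos:
            inlines.append(span(default_style, line[pos:match.start()]))
        inlines.append(span(match.group(1), match.group(2)))
        pos = match.end()
    if pos < len(line):
        inlines.append(span(default_style, line[pos:]))
    return inlines


def codeblock_to_div(block, para_style="Source Code", char_style="Verbatim Char"):
    """Convert a CodeBlock into a styled Div holding one Para with line breaks."""
    (ident, _classes, _keyvals), text = block["c"]
    inlines = []
    for number, line in enumerate(text.split("\n")):
        if number:
            inlines.append({"t": "LineBreak"})
        inlines.extend(style_line(line, char_style))
    attr = [ident, [], [["custom-style", para_style]]]
    return {"t": "Div", "c": [attr, [{"t": "Para", "c": inlines}]]}


def transform(doc, marker_class="styled"):
    """Rewrite top-level CodeBlocks that have the marker class; nested blocks
    (inside lists, Divs, and so on) are not visited in this sketch."""
    result = dict(doc)
    result["blocks"] = [
        codeblock_to_div(block)
        if block.get("t") == "CodeBlock" and marker_class in block["c"][0][1]
        else block
        for block in doc["blocks"]
    ]
    return result
```

With this, HTML input such as `<pre class="styled">echo {{Red Text|hello}}</pre>` would be expected to reach the DOCX writer as Spans it can style. It is worth checking in the resulting document that leading whitespace and blank lines come through as intended.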