markdown to docx

Open say1or opened this issue 6 years ago • 5 comments

I need to define two custom Markdown syntaxes: one parsed as underline and one as a background highlight. I want them to work when converting to DOCX documents as well. What should I do?

say1or avatar Jul 02 '19 11:07 say1or

@rymm1, the docx renderer uses an extension mechanism similar to that of the HTML renderer and the Markdown formatter.

You need to create a custom node renderer for your custom elements by implementing CustomNodeDocxRenderer.java.

You need to add the DocxRendererExtension interface to your extension implementation. In your implementation of the interface's extend(Builder builder), add your custom docx node renderer by calling builder.nodeFormatterFactory(NodeDocxRendererFactory), passing your custom renderer factory.

The easiest way is to look in CoreNodeDocxRenderer.java for elements whose rendering is similar to that of your custom elements, then tweak how they are rendered.

For an example of underline, you can use CoreNodeDocxRenderer.java: Lines 689-692

For setting the background, take a look at CoreNodeDocxRenderer.java: Lines 694-697 and follow it into the base class RunFormatProviderBase.java to see how these attributes are implemented for various elements.

vsch avatar Jul 02 '19 20:07 vsch

Thank you for your reply. I've looked at the code you recommended. How do I create a delimited node that works like StrongEmphasis? My custom Markdown syntax is very similar to bold. [image]

say1or avatar Jul 03 '19 06:07 say1or

@rymm1, your custom docx node renderer should have a render handler for your node class instead of the StrongEmphasis node.

You can use CoreNodeDocxRenderer as a reference for your custom node renderer. You only need to implement the NodeDocxRenderer interface, not PhasedNodeDocxRenderer. The latter has callbacks for various rendering phases, which allow you to examine the document before rendering and to pre- and post-render some content of the main document.

For the node formatting handlers, provide handlers for your custom nodes and for any other nodes whose rendering you want to override.

Other than the docx-specific interfaces, the custom node renderer is the same as a custom HTML renderer or Formatter renderer, which you can see in most custom extensions.

The difficulty comes from having to understand the DOCX format and the docx4j library, but the core docx renderer has plenty of examples to help you get started.

vsch avatar Jul 04 '19 19:07 vsch

By reading the source code, I have solved this problem. Now there's a new problem: why does the Markdown source require two line breaks ('\n') to start a new line in the docx output?

This is the case with one line break: [image] [image]

This is the case with two line breaks: [image] [image]

say1or avatar Jul 08 '19 01:07 say1or

@rymm1, Markdown by default treats a blank line as a paragraph separator, which means that contiguous lines of text are treated as a single paragraph and the line breaks between them in the source are ignored.
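
To make the rule concrete, here is a deliberately simplified model of it in plain Java. It is not flexmark-java's parser: the class and method names are made up for illustration, and it ignores hard breaks, lists, code blocks and every other block type. It only shows that a single `\n` stays inside one paragraph while a blank line starts a new one.

```java
import java.util.ArrayList;
import java.util.List;

public class SoftBreakModel {
    // Hypothetical helper, not a flexmark-java API.
    // Splits the source on blank lines, then joins the lines inside each block
    // with a single space, which is how a soft break typically ends up displayed.
    static List<String> paragraphs(String markdown) {
        List<String> result = new ArrayList<>();
        for (String block : markdown.split("\\n\\s*\\n")) {
            String joined = block.trim().replaceAll("\\s*\\n\\s*", " ");
            if (!joined.isEmpty()) {
                result.add(joined);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // One line break: still a single paragraph.
        System.out.println(paragraphs("line one\nline two"));   // [line one line two]
        // Blank line (two line breaks): two paragraphs.
        System.out.println(paragraphs("line one\n\nline two")); // [line one, line two]
    }
}
```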

The HTML renderer has an option to force hard breaks instead of soft breaks. The DOCX renderer always treats them as soft.

You also have the option of adding two or more spaces at the end of a text line to force a hard break. This is part of the CommonMark standard and, I think, of the original Markdown as well.
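
If editing the source by hand is not practical, the trailing-space rule could be applied by preprocessing the Markdown text before it is parsed. The sketch below is one possible way to do that with plain Java string handling. The class and method names are hypothetical, nothing here is part of flexmark-java, and it makes no attempt to skip fenced code blocks, tables, headings or list markers, so treat it as a starting point only.

```java
public class HardBreaks {
    // Hypothetical helper, not a flexmark-java API.
    // Appends two trailing spaces to every non-blank line that is directly
    // followed by another non-blank line, so that a CommonMark parser reads
    // that line ending as a hard break. Lines before a blank line, and the
    // last line, are left untouched.
    static String forceHardBreaks(String markdown) {
        String[] lines = markdown.split("\n", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            boolean hasNext = i + 1 < lines.length;
            boolean isText = !line.trim().isEmpty();
            boolean nextIsText = hasNext && !lines[i + 1].trim().isEmpty();
            out.append(line);
            if (isText && nextIsText && !line.endsWith("  ")) {
                out.append("  "); // two spaces mark a hard break
            }
            if (hasNext) {
                out.append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String source = "first line\nsecond line\n\nnew paragraph";
        // Show spaces as '·' so the added markers are visible.
        System.out.println(forceHardBreaks(source).replace(' ', '·'));
    }
}
```

The result of `forceHardBreaks` would then be passed to the parser in place of the original source.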

vsch avatar Jul 09 '19 20:07 vsch