
Typographic Extension silently dropping characters

Open garretwilson opened this issue 1 year ago • 3 comments

Using Flexmark 0.64.0 with Java 17, I expect the Typographic Extension to turn ' into ’, but instead it simply drops the character. The same happens with other sequences, e.g. ---.

I'm setting up my parser and HTML renderer like this:

MutableDataHolder parserOptions = new MutableDataSet()
    //emoji; see https://www.webfx.com/tools/emoji-cheat-sheet/
    .set(EmojiExtension.USE_IMAGE_TYPE, EmojiImageType.UNICODE_ONLY)
    //GFM tables
    .set(TablesExtension.COLUMN_SPANS, false)
    .set(TablesExtension.APPEND_MISSING_COLUMNS, true)
    .set(TablesExtension.DISCARD_EXTRA_COLUMNS, true)
    .set(TablesExtension.HEADER_SEPARATOR_COLUMN_MATCH, true)
    //extensions
    .set(Parser.EXTENSIONS, List.of(DefinitionExtension.create(), EmojiExtension.create(), SuperscriptExtension.create(), TablesExtension.create(),
        TypographicExtension.create(), YamlFrontMatterExtension.create()));
parser = Parser.builder(parserOptions).build();
htmlRenderer = HtmlRenderer.builder().build();

Note that I only call TypographicExtension.create(), with no further options. Perhaps additional configuration is needed, but I wouldn't expect the extension to drop characters by default.

I use the parser like this:

com.vladsch.flexmark.util.ast.Document markdownDocument = parser.parse("it's working");
System.out.println(htmlRenderer.render(markdownDocument));

I expect:

<p>it&rsquo;s working</p>

Instead I get:

<p>its working</p>

Why is the extension dropping the characters? Or is the problem in the renderer, which may need some configuration to emit the character references? Wherever the problem lies, I would expect some notification rather than having the characters removed. Silently deleting content is never welcome.

garretwilson, Jan 05 '23 15:01