QuartoNotebookRunner.jl

Problem with multibyte characters

Open MHellmund opened this issue 9 months ago • 6 comments

I am testing a Quarto project of type 'book' with the new Julia engine. Three times in about 50 pages of PDF output, a German umlaut (ä, ü) was replaced by '��' (two copies of the Unicode replacement character U+FFFD).

How to reproduce: Run quarto render test3.qmd --to html with the attached file.

  • It doesn't happen with Jupyter/IJulia.
  • It doesn't happen with QuartoNotebookRunner.run!(server, "test3.qmd"; output = "test3.ipynb")
  • When running quarto with keep-md: true, the '��' are already present in the test3.html.md file that is fed to pandoc.
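
To narrow down where the characters first go wrong, an intermediate file such as test3.html.md can be scanned for U+FFFD. The following is a minimal sketch in Python; the helper name find_replacement_chars is hypothetical and not part of quarto or QuartoNotebookRunner.jl.

```python
def find_replacement_chars(text):
    """Return 1-based (line, column) pairs for every U+FFFD in text.

    Hypothetical helper for inspecting intermediate files; not part of
    quarto or QuartoNotebookRunner.jl.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "\ufffd":
                hits.append((line_no, col))
    return hits


# Small in-memory sample; for a real file, read it first, e.g.
# pathlib.Path("test3.html.md").read_text(encoding="utf-8")
sample = "Ein Satz ohne Fehler\nM\ufffd\ufffdrchen"
for line_no, col in find_replacement_chars(sample):
    print(f"line {line_no}, column {col}")
```

Running the same check on each intermediate output should show at which stage the replacement characters first appear.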

My guess is that somewhere a string or buffer is cut in the middle of a UTF-8 multibyte character, and both parts are then decoded as UTF-8 strings on their own. Each part then contains an invalid byte sequence, which gets replaced by U+FFFD.
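
The suspected mechanism can be reproduced in isolation. The sketch below uses Python purely for illustration and is not the code path used by quarto or QuartoNotebookRunner.jl; the chunk boundary and variable names are assumptions chosen to show the effect.

```python
import codecs

text = "Märchen"
data = text.encode("utf-8")  # 'ä' becomes the two bytes 0xC3 0xA4

# Hypothetical chunk boundary that falls between the two bytes of 'ä'.
split_at = data.index(b"\xc3") + 1
first, second = data[:split_at], data[split_at:]

# Decoding each chunk on its own: the orphaned lead byte and the orphaned
# continuation byte are each replaced by U+FFFD.
broken = (first.decode("utf-8", errors="replace")
          + second.decode("utf-8", errors="replace"))

# An incremental decoder buffers the incomplete sequence across chunks,
# so the character survives the same split.
decoder = codecs.getincrementaldecoder("utf-8")()
streamed = decoder.decode(first) + decoder.decode(second, final=True)

print(repr(broken))
print(repr(streamed))
```

A single two-byte umlaut split this way yields exactly two replacement characters, which matches the '��' pairs seen in the output. Joining the raw bytes before decoding, or using a streaming decoder as above, avoids the problem.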

test3.qmd.txt test3.html.md.txt

MHellmund, May 14 '24 10:05