QuartoNotebookRunner.jl
Problem with multibyte characters
I am testing a Quarto project of type 'book' with the new julia engine. Three times in about 50 pages of PDF output, a German umlaut (ä, ü) was replaced by '��' (two copies of the Unicode replacement character U+FFFD).
How to reproduce:
Run `quarto render test3.qmd --to html` with the attached file.
- It doesn't happen with jupyter/IJulia.
- It doesn't happen with `QuartoNotebookRunner.run!(server, "test3.qmd"; output = "test3.ipynb")`.
- When running quarto with `keep-md: true`, the '��' are already in the `test3.html.md` file which is fed to pandoc.
My guess is that somewhere a string or buffer is cut in the middle of a UTF-8 multibyte character and both parts are then decoded separately as UTF-8 strings, so the invalid bytes in each part get replaced by U+FFFD.
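
To illustrate the guess, here is a minimal sketch in plain Julia. It is not code from QuartoNotebookRunner or quarto, and I don't know where the split actually happens. The cut position and the helper `lossy` are hypothetical; `lossy` only mimics a decoder that substitutes U+FFFD for invalid bytes.

```julia
# Sketch only: split the UTF-8 bytes of "Käse" between the two code units
# of 'ä' and decode each chunk separately.
s = "Käse"
bytes = Vector{UInt8}(codeunits(s))   # 'ä' is encoded as 0xc3 0xa4

cut = 2                               # hypothetical chunk boundary inside 'ä'
first_chunk  = bytes[1:cut]           # 'K' plus the lead byte 0xc3
second_chunk = bytes[cut+1:end]       # continuation byte 0xa4 plus "se"

# Neither chunk is valid UTF-8 on its own, but the rejoined bytes are.
@assert !isvalid(String, first_chunk)
@assert !isvalid(String, second_chunk)
@assert isvalid(String, vcat(first_chunk, second_chunk))

# Hypothetical lossy decoder: replace every invalid character with U+FFFD.
# copy() is needed because String(::Vector{UInt8}) takes ownership of the vector.
lossy(chunk) = join(isvalid(c) ? c : '\ufffd' for c in String(copy(chunk)))

# Each chunk contributes one replacement character, two in total.
println(lossy(first_chunk) * lossy(second_chunk))
```

Decoding the rejoined bytes instead of each chunk on its own gives back the original string, which would fit the observation that only a few umlauts are affected: only those that happen to straddle a boundary.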