Stirling-PDF Encoding bug during markdown to pdf conversion

Hello, since version 21.0, I have been running into encoding bugs when converting md to pdf.

Here's an example: This is a test with accents: é è à ç ù

Mar 17 '24 11:03 NicolasFR

Hello,

I've looked into the matter a bit, and I think it's weasyprint that doesn't "detect" encoding. When the -e utf-8 argument is passed to weasyprint, accented letters (é ù for example) are correctly displayed. But I don't understand why it worked before.
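For reference, here is a minimal sketch of the invocation I mean. The helper name `build_weasyprint_command` and the file names are made up for illustration, and this is not how Stirling-PDF actually builds the command; only the `-e` (`--encoding`) option itself comes from the weasyprint CLI.

```python
import shlex


def build_weasyprint_command(html_path, pdf_path, encoding="utf-8"):
    """Return an argv list for the weasyprint CLI with an explicit input encoding.

    Hypothetical helper: the name and defaults are assumptions for this example.
    """
    # -e / --encoding tells weasyprint how to decode the input file
    # instead of leaving it to guess.
    return ["weasyprint", "-e", encoding, html_path, pdf_path]


cmd = build_weasyprint_command("input.html", "output.pdf")
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
# To actually run it (requires weasyprint to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with file names that contain spaces.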

Apr 01 '24 14:04 NicolasFR

So from 20 to 21 we changed to a different OS in Docker, which changed the app version that was installed.

Apr 01 '24 14:04 Frooodle

So it could be that the latest weasyprint has some bug.

Apr 01 '24 14:04 Frooodle

Yes, I noticed the change from Debian to Alpine. I tried to use the same version of weasyprint (60.2) used in 0.21, but it didn't solve the problem. Also, I was unable to read the content of the temporary files generated by stirling-pdf for weasyprint. Do you have any idea how to approach this issue?

Apr 01 '24 16:04 NicolasFR

Might be worth posting on https://github.com/Kozea/WeasyPrint/

Apr 01 '24 17:04 Frooodle

I have created this issue https://github.com/Kozea/WeasyPrint/issues/2113

Apr 04 '24 14:04 NicolasFR

Hi there,

I noticed that the HTML file provided as input to weasyprint does not contain any meta tags (a charset declaration, for example) and that the default encoding is latin9.
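To illustrate what I think is happening, here is a small sketch that only uses Python's built-in codecs (it does not call weasyprint). It assumes the HTML file is written as UTF-8 and then decoded as latin9 (ISO-8859-15); the variable names are just for the example.

```python
# Text from the original report.
text = "é è à ç ù"

# Assumption: the intermediate HTML file is written as UTF-8.
utf8_bytes = text.encode("utf-8")

# Decoding those bytes as latin9 turns each accented letter into two characters.
misread = utf8_bytes.decode("iso-8859-15")

# Decoding with the right encoding gives back the original text.
correct = utf8_bytes.decode("utf-8")

print(repr(misread))
print(repr(correct))
```

Each accented letter takes two bytes in UTF-8, and latin9 maps every byte to one character, so "é" comes out as "Ã©". That matches the kind of garbled accents seen in the generated PDF.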

Would it be possible to force the use of utf-8 by passing the -e utf-8 argument to weasyprint? I can make a pull request.

May 29 '24 19:05 NicolasFR

Happy for you to set up additional arrangements etc. for weasy

May 29 '24 19:05 Frooodle

Happy for you to set up additional arrangements etc. for weasy as params etc.

Go ahead and raise it, seems a good change to use utf8.

May 29 '24 19:05 Frooodle