Chinese characters incorrectly encoded when copied from an exported PDF that uses the 'Adobe Source Han' serif/sans fonts (思源宋体/思源黑体)
- [x] Searched existing issues to avoid creating duplicates.
- [ ] Confirmed that it can be reproduced in built-in themes without customized CSS.
- [x] Searched http://support.typora.io/
Describe the bug Some Chinese characters, e.g. '一' and '二', are incorrectly encoded when copied from an exported PDF that uses the 'Adobe Source Han' serif/sans fonts.
To Reproduce Steps to reproduce the behavior:
- Modify a third-party Typora theme so that the font is 'Adobe Source Han Serif' (思源宋体) or 'Adobe Source Han Sans' (思源黑体).
- Type in '一二三四五六七八九十'.
- Export the document as a PDF file.
- Copy the text from the PDF file. The pasted characters become '⼀⼆三四五六七⼋九⼗'. Note that the characters '⼀⼆⼋⼗' differ from the original input even though they look similar.
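The difference is easier to see by printing code points than by comparing glyphs. Below is a minimal Python sketch. It assumes the pasted characters come from the Unicode Kangxi Radicals block, which is what the pasted string in this report appears to contain. The `describe` helper is a hypothetical name used only for illustration.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# Escapes are used so the two strings cannot be confused visually.
typed = "\u4e00\u4e8c\u516b\u5341"   # the ideographs as typed in the editor
copied = "\u2f00\u2f06\u2f0b\u2f17"  # assumed: what gets pasted from the PDF

for (t_cp, t_name), (c_cp, c_name) in zip(describe(typed), describe(copied)):
    print(f"{t_cp} {t_name}  ->  {c_cp} {c_name}")
```

Pasting your own clipboard content into `copied` shows whether you are hitting the same substitution.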
Expected behavior Characters copied from the pdf should be exactly consistent with the original input.
Screenshots / Screencasts Omitted.
Sample Markdown File Omitted.
Desktop (please complete the following information):
- OS: Windows 11 64-bit
- Typora version: 1.8.10 (release channel)
Additional context
The same issue is also discussed in the following threads, although there it relates to the iText library or the html2pdf program: https://www.v2ex.com/t/965813 (in Chinese) https://juejin.cn/post/6844903729439703053 (in Chinese)
This bug may be caused by one of Typora's upstream libraries. Until it is officially fixed, there are several quick workarounds for anyone who is bothered by it.
- How to fix the PDF. It can be processed with the 'PitStop Pro' plugin for Adobe Acrobat. Check 'Global Changes'->'Remap Font'.
- How to fix the copied characters. Personally, I prefer the 'Unicode Normalizer' extension for VS Code. https://marketplace.visualstudio.com/items?itemName=espresso3389.unicode-normalizer
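For anyone who would rather script the second workaround, here is a rough Python sketch. It assumes the wrong characters are Kangxi radicals (U+2F00 to U+2FD5), whose compatibility decompositions map back to the ordinary ideographs under NFKC normalization. The function name `fix_kangxi_radicals` is made up for this example and is not part of any tool mentioned above.

```python
import unicodedata

# Range of the Unicode Kangxi Radicals block (assigned code points).
KANGXI_START, KANGXI_END = 0x2F00, 0x2FD5


def fix_kangxi_radicals(text: str) -> str:
    """Map Kangxi radicals to unified ideographs, leaving other text alone."""
    return "".join(
        unicodedata.normalize("NFKC", ch)
        if KANGXI_START <= ord(ch) <= KANGXI_END
        else ch
        for ch in text
    )


# Assumed form of the text pasted from the PDF (escapes avoid lookalike glyphs).
copied = "\u2f00\u2f06三四五六七\u2f0b九\u2f17"
print(fix_kangxi_radicals(copied))
```

Normalizing only the radical block is a deliberate choice here: running NFKC over the whole string would also rewrite unrelated compatibility characters such as full-width letters and ligatures.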
Typora uses Chromium's built-in PDF generation, so if you export the Markdown file to HTML, open it in Chrome, and print it to PDF, does the same bug occur?
Yes, you are correct. The bug also reproduces in the Edge browser. By contrast, Firefox's built-in 'Save as PDF' function and the Windows virtual PDF printer ('Microsoft Print to PDF') both work without any issue.
After checking the Chromium issue tracker, I found that this is related to bug 41469265, which is marked as fixed. I will report it again to the Chromium maintainers in a few days.
It can be traced to the Skia library, a very low-level part of Chromium's infrastructure. It probably won't be fixed any time soon. Sad.