Missing text from PDF >> SVG

Open povpie opened this issue 1 year ago • 6 comments

Version: 3.1.2
Configuration: dvisvgm --pdf -Oall -fwoff2
Problem: some text is missing from the SVG.

File: test1.pdf

(additional files)
test2.pdf test3.pdf

Oct 25 '23 22:10 povpie

I can't reproduce the problem. What version of dvisvgm and mutool do you use? Please post the output of dvisvgm -V1.

Oct 26 '23 08:10 mgieseki

dvisvgm 3.1.2 (x86_64-pc-win64)

brotli: 1.1.0
clipper: 6.2.1
freetype: 2.13.2
Ghostscript: 9.25
MiKTeX: 22.12
mutool: 1.21.0
potrace: 1.16
xxhash: 0.8.2
zlib: 1.3

Just realized that many are outdated. I'll try to update them.

I updated mutool successfully, but Ghostscript and MiKTeX still show the old versions when I run dvisvgm -V1, even though I already changed the PATH environment variable to point to the correct folders. Am I missing something? Thanks, Martin.

Oct 26 '23 18:10 povpie

Ok, thanks for the additional info. I was able to reproduce the issue now. Unfortunately, it's related to the limited functionality available via mutool. The PDF file contains four different font resources that all have the same internal name PCPYGD+-:

Fonts (4):
        1       (7 0 R):        Type0 'PCPYGD+-' Identity-H (11 0 R)
        1       (7 0 R):        Type0 'PCPYGD+-' Identity-H (12 0 R)
        1       (7 0 R):        Type1 'PCPYGD+-' WinAnsiEncoding (8 0 R)
        1       (7 0 R):        Type1 'PCPYGD+-' WinAnsiEncoding (13 0 R)

Therefore, it's not possible to identify the different fonts by their name, which is essential for dvisvgm to work properly. Maybe you can tweak the font embedding options of the application used to create the PDF files in order to get more distinct names when subsetting fonts.
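To illustrate the ambiguity, here is a minimal Python sketch that scans a font listing like the one above and reports every font name shared by more than one font object. The listing appears to be in the format printed by `mutool info`; the regular expression, the constant `FONT_LISTING`, and the function name `ambiguous_font_names` are assumptions made for this example and are not part of dvisvgm or mutool.

```python
import re
from collections import defaultdict

# Font listing copied from the comment above.
FONT_LISTING = """\
Fonts (4):
        1       (7 0 R):        Type0 'PCPYGD+-' Identity-H (11 0 R)
        1       (7 0 R):        Type0 'PCPYGD+-' Identity-H (12 0 R)
        1       (7 0 R):        Type1 'PCPYGD+-' WinAnsiEncoding (8 0 R)
        1       (7 0 R):        Type1 'PCPYGD+-' WinAnsiEncoding (13 0 R)
"""

# Assumed line layout: page number, page object reference, font type,
# quoted font name, encoding, font object reference.
FONT_LINE = re.compile(
    r"^\s*(\d+)\s+\((\d+ \d+ R)\):\s+(\S+)\s+'([^']*)'\s+(\S+)\s+\((\d+ \d+ R)\)\s*$"
)


def ambiguous_font_names(listing):
    """Return {font name: [object refs]} for names used by several font objects."""
    refs_by_name = defaultdict(set)
    for line in listing.splitlines():
        match = FONT_LINE.match(line)
        if match:
            refs_by_name[match.group(4)].add(match.group(6))
    return {
        # Sort by object number so the result is stable and easy to read.
        name: sorted(refs, key=lambda ref: int(ref.split()[0]))
        for name, refs in refs_by_name.items()
        if len(refs) > 1
    }


if __name__ == "__main__":
    for name, refs in ambiguous_font_names(FONT_LISTING).items():
        print(f"{name!r} is shared by {len(refs)} font objects: {', '.join(refs)}")
```

A non-empty result for a given PDF suggests that its fonts cannot be told apart by name alone.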

Oct 26 '23 21:10 mgieseki

I'll try to change it manually. Is it that mutool doesn't identify the ID following the font name? I found this link (looks like the same issue): https://github.com/pymupdf/pymupdf/issues/2110#issuecomment-1343318360

Here are the fonts identified on test1 with an online font downloader: Screenshot (2)

Oct 26 '23 22:10 povpie

Is it that mutool doesn't identify the ID following the font name?

In the PDF file, there are no numbers appended to the font names. They are probably added by your font downloader. As shown above, all four font objects got the name PCPYGD+-. Internally they can be distinguished by their object IDs but mutool doesn't provide a way to make them accessible to the user in the backend. Fonts are referenced there only by their names which might be ambiguous, like in your case.
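The difference between the two kinds of reference can be sketched in a few lines of Python. This is only a generic illustration of name-keyed versus object-keyed lookup, not dvisvgm's or mutool's actual data structures; `FONTS`, `index_by_name`, and `index_by_object` are hypothetical names, and the pairs are taken from the font listing earlier in this thread.

```python
# (font name, font object reference) pairs from the listing earlier in this thread.
FONTS = [
    ("PCPYGD+-", "11 0 R"),
    ("PCPYGD+-", "12 0 R"),
    ("PCPYGD+-", "8 0 R"),
    ("PCPYGD+-", "13 0 R"),
]


def index_by_name(fonts):
    """Name-keyed lookup: later entries silently overwrite earlier ones."""
    return {name: ref for name, ref in fonts}


def index_by_object(fonts):
    """Object-keyed lookup: every font object keeps its own entry."""
    return {ref: name for name, ref in fonts}


if __name__ == "__main__":
    print(len(index_by_name(FONTS)))    # 1: four fonts collapse into one entry
    print(len(index_by_object(FONTS)))  # 4: each font object stays distinct
```

With only names available, three of the four fonts become unreachable, which is consistent with text going missing in the SVG.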

Oct 27 '23 08:10 mgieseki

Ah, I see. If I find a solution, I'll post it here. Unfortunately, Adobe Express has no option to export the design with the fonts embedded in a different way.

Oct 27 '23 10:10 povpie