Scrambled text when PDFs are viewed in Acrobat

Open NateANeurogazer opened this issue 3 years ago • 1 comments

What were you trying to do?

I recently made a PDF for a Korean client using pdf-lib. Along the way I ran into an issue where Adobe Acrobat Pro DC scrambled the Korean text in the document.

When I open the file in Mac Preview, everything is fine, but when I use Adobe Acrobat Pro DC to open the file, the text itself is different:

How did you attempt to do it?

I followed guides online, most helpfully this one: https://community.adobe.com/t5/acrobat-discussions/scrambled-text-when-viewing-pdf-documents-in-acrobat-standard-xi/m-p/10651537/page/2#M152983

I double-checked that the font I wanted was installed correctly, both on my system and in Adobe Acrobat Pro DC. I restarted my computer after installing the fonts and recreated the PDF file, but the error persisted. To double-check, I also switched to a different font for the Korean text and repeated the installation, restart, and recreation steps, with the same result.

What actually happened?

Adobe loads a sample sentence from my PDF as: [image] (copying and pasting from Acrobat on that page returns yet another, different string:  환성 기 을      성이     .)

In Mac Preview, or any other PDF renderer, the sentence correctly reads as: 순환성 기질을 가질 가능성이 높습니다.

As a reference, I made an English version of the file, swapping in a set of custom English fonts that Adobe also didn't have, and it handled them just fine.

What did you expect to happen?

I expected Adobe to be able to read my font choice, or at least not to scramble the text into something useless.

How can we reproduce the issue?

Unfortunately, the file I made is for a client in the medical sector, so I can't share my whole code or the whole file.

Here's my font embed code and some example code. (GitHub won't let me attach the specific font, since it's an OTF file.) This should work with any PDF you want to use to test it; just update the fetch.

import { PDFDocument } from 'pdf-lib';
import fontkit from '@pdf-lib/fontkit';
import download from 'downloadjs';

async function modifyPdf(lang) {
  // Fetch custom font
  const gothicNeoBytes = await fetch('./vendors/document-fonts/AppleSDGothicNeoUL00-13.otf').then(res => res.arrayBuffer());

  // Fetch an existing PDF document
  // (examplePdfUrl is a placeholder; set it to the URL of any PDF)
  const existingPdfBytes = await fetch(examplePdfUrl).then(res => res.arrayBuffer());

  // Load a PDFDocument from the existing PDF bytes
  const pdfDoc = await PDFDocument.load(existingPdfBytes);

  // Register the fontkit instance
  pdfDoc.registerFontkit(fontkit);

  // Embed our custom font in the document
  const gothicNeoFont = await pdfDoc.embedFont(gothicNeoBytes);

  // Get the pages of the document
  const pagesArr = pdfDoc.getPages();
  const examplePage = pagesArr[0];

  // Page dimensions, used to position the text
  const { width, height } = examplePage.getSize();

  examplePage.drawText('순환성 기질을 가질 가능성이 높습니다.', {
    x: width * 0.09,
    y: height * 0.49,
    size: 11.5,
    lineHeight: 14,
    maxWidth: 320,
    font: gothicNeoFont,
  });

  const pdfBytes = await pdfDoc.save();
  download(pdfBytes, 'example.pdf', 'application/pdf');
}

Version

"@pdf-lib/fontkit": "^1.1.1",
"pdf-lib": "^1.17.1"

What environment are you running pdf-lib in?

Node

Checklist

  • [X] My report includes a Short, Self Contained, Correct (Compilable) Example.
  • [X] I have attached all PDFs, images, and other files needed to run my SSCCE.

Additional Notes

No response

NateANeurogazer • Feb 22 '22 20:02

After some more tinkering with the code, I figured out that

const gothicNeoFont = await pdfDoc.embedFont(gothicNeoBytes, { subset: true });

fixed my problem with the text being scrambled.

For some reason, it temporarily caused a range error when trying to embed and write with English fonts. I loaded the project today to get the error message so I could add it to this comment, but the error had resolved itself overnight.
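
For anyone landing here with the same symptom, here is a minimal sketch of the workaround as a reusable helper. It assumes the scrambling is resolved by font subsetting, as it was in this case. `embedSubsetFont` is a hypothetical name, not part of pdf-lib; `pdfDoc` is assumed to be a pdf-lib `PDFDocument` and `fontkit` the `@pdf-lib/fontkit` module, both passed in by the caller.

```javascript
// Hypothetical helper (not a pdf-lib API): embeds a custom font as a subset.
// `pdfDoc`    - assumed to be a pdf-lib PDFDocument
// `fontkit`   - assumed to be the @pdf-lib/fontkit module
// `fontBytes` - the raw font file (ArrayBuffer or Uint8Array)
async function embedSubsetFont(pdfDoc, fontkit, fontBytes) {
  // fontkit has to be registered before a custom font can be embedded.
  pdfDoc.registerFontkit(fontkit);

  // `subset: true` asks pdf-lib to embed only the glyphs the document uses,
  // which is the option that stopped Acrobat from scrambling the text here.
  return pdfDoc.embedFont(fontBytes, { subset: true });
}

// Usage, inside an async function like modifyPdf above:
//   const gothicNeoFont = await embedSubsetFont(pdfDoc, fontkit, gothicNeoBytes);
//   examplePage.drawText('순환성 기질을 가질 가능성이 높습니다.', { font: gothicNeoFont });
```

Passing the document and fontkit in as arguments keeps the helper independent of how the two packages are imported, so it should behave the same in Node and in the browser.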

NateANeurogazer • Feb 23 '22 14:02