pdf-lib
Convert to PDF/A
Describe your idea
An option to convert a PDF to PDF/A
How could this be implemented?
A function
What problem are you trying to solve?
Converting a PDF to PDF/A
Why does this matter to you?
.
Would others find this helpful?
Yes
Are you interested in implementing your proposal?
Yes
Why are you submitting a proposal?
Because you don't have it yet
Additional Notes
No response
PDF/A is already possible with pdf-lib. See #230 for reference. There is no built-in function, but it can easily be achieved following the comments.
The solution presented there is very hard to follow. Could there be documentation that gets straight to the point of how to do it?
?
To meet the PDF/A standard:
- Set Document ID
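const crypto = require('crypto'); // Node.js built-in module
const { PDFHexString } = require('pdf-lib');
const documentId = crypto.randomBytes(16).toString('hex');
const id = PDFHexString.of(documentId);
pdfDoc.context.trailerInfo.ID = pdfDoc.context.obj([id, id]);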
- Add embedded font (check here)
- Set Print Profile (check PR #1512)
- Set Trim Box to page
const { width, height } = page.getSize();
page.setTrimBox(0, 0, width, height);
- Use colors that match the print profile. If the print profile is RGB, use only RGB colors throughout the document.
- Add correct metadata (see example below)
- The metadata must contain the same values as the document information
// set document information
const createDate = new Date();
pdfDoc.setTitle(title);
pdfDoc.setAuthor(author);
pdfDoc.setProducer(producer);
pdfDoc.setCreator(creator);
pdfDoc.setCreationDate(createDate);
pdfDoc.setModificationDate(createDate);
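_addMetadata(pdfDoc, createDate, documentId, title, author, producer, creator);

// build the XMP metadata packet and attach it to the document catalog
function _addMetadata(pdfDoc, date, documentId, title, author, producer, creator) {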
const metadataXML = `
<?xpacket begin="" id="${documentId}"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.2-c001 63.139439, 2010/09/27-13:37:26 ">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:format>application/pdf</dc:format>
<dc:creator>
<rdf:Seq>
<rdf:li>${author}</rdf:li>
</rdf:Seq>
</dc:creator>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">${title}</rdf:li>
</rdf:Alt>
</dc:title>
</rdf:Description>
<rdf:Description rdf:about="" xmlns:xmp="http://ns.adobe.com/xap/1.0/">
<xmp:CreatorTool>${creator}</xmp:CreatorTool>
<xmp:CreateDate>${_formatDate(date)}</xmp:CreateDate>
<xmp:ModifyDate>${_formatDate(date)}</xmp:ModifyDate>
<xmp:MetadataDate>${_formatDate(date)}</xmp:MetadataDate>
</rdf:Description>
<rdf:Description rdf:about="" xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
<pdf:Producer>${producer}</pdf:Producer>
</rdf:Description>
<rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
<pdfaid:part>1</pdfaid:part>
<pdfaid:conformance>B</pdfaid:conformance>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
`.trim();
const metadataStream = pdfDoc.context.stream(metadataXML, {
Type: 'Metadata',
Subtype: 'XML',
Length: metadataXML.length,
});
const metadataStreamRef = pdfDoc.context.register(metadataStream);
pdfDoc.catalog.set(PDFName.of('Metadata'), metadataStreamRef);
}
// remove milliseconds from the date
function _formatDate(date) {
return date.toISOString().split('.')[0] + 'Z';
}
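One caveat with the template above: title, author, producer and creator are interpolated into the XML as-is, so a value containing & or < would produce a malformed XMP packet. Below is a minimal sketch of an escaping helper. escapeXml is a hypothetical name, not part of pdf-lib, and the sketch assumes the values are plain strings.

```javascript
// Hypothetical helper (not part of pdf-lib): escape the five XML special
// characters so a value can be interpolated safely into the XMP template.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Example: escape a value before building the metadata XML.
const safeTitle = escapeXml('Q&A <draft>');
console.log(safeTitle); // Q&amp;A &lt;draft&gt;
```

In the template, ${title} would then become ${escapeXml(title)}, and likewise for the other fields. The values passed to setTitle, setAuthor and so on stay unescaped.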
Don't forget to change the PDF/A version number in the metadata XML.
This is an example for PDF/A-1B:
<pdfaid:part>1</pdfaid:part>
<pdfaid:conformance>B</pdfaid:conformance>
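If you target more than one PDF/A level, these two elements can be generated instead of edited by hand. The sketch below uses a hypothetical helper, pdfaidDescription, which is not part of pdf-lib, and it assumes only parts 1 to 3 are of interest. Note that this only declares a level in the metadata; the rest of the document still has to satisfy that level's requirements.

```javascript
// Hypothetical helper: build the pdfaid identification block for a given
// PDF/A part and conformance level, rejecting combinations that do not exist.
function pdfaidDescription(part, conformance) {
  const level = String(conformance).toUpperCase();
  // PDF/A-1 defines levels A and B; PDF/A-2 and PDF/A-3 also define U.
  const allowed = { 1: ['A', 'B'], 2: ['A', 'B', 'U'], 3: ['A', 'B', 'U'] };
  if (!allowed[part] || !allowed[part].includes(level)) {
    throw new Error(`Unsupported PDF/A level: ${part}${level}`);
  }
  return [
    '<rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">',
    `<pdfaid:part>${part}</pdfaid:part>`,
    `<pdfaid:conformance>${level}</pdfaid:conformance>`,
    '</rdf:Description>',
  ].join('\n');
}

// Example: the block for PDF/A-2B.
console.log(pdfaidDescription(2, 'b'));
```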
PDF/A-1B doesn't allow compressed object streams, so disable useObjectStreams when you save the document:
// save() returns a Promise that resolves to the PDF bytes
const pdfBytes = await pdfDoc.save({
  useObjectStreams: false,
});
If you would like to read the metadata from an existing document:
const metadata = pdfDoc.catalog.lookup(PDFName.of('Metadata'));
const textDecoder = new TextDecoder();
const text = textDecoder.decode(metadata.contents);
console.log(text);
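Once you have the decoded text, you can check which PDF/A level the document declares. This is a rough sketch with a hypothetical helper, readPdfaLevel. It assumes the pdfaid values are written as elements, as in the template above; XMP can also store them as attributes, which this regex-based approach would miss. It reports only what the file claims and does not validate conformance.

```javascript
// Hypothetical helper: pull the declared PDF/A part and conformance out of
// decoded XMP text. Returns null when either element is missing.
function readPdfaLevel(xmpText) {
  const part = /<pdfaid:part>\s*(\d+)\s*<\/pdfaid:part>/.exec(xmpText);
  const conformance =
    /<pdfaid:conformance>\s*([A-Za-z])\s*<\/pdfaid:conformance>/.exec(xmpText);
  if (!part || !conformance) return null;
  return { part: Number(part[1]), conformance: conformance[1].toUpperCase() };
}

// Example with a fragment shaped like the metadata above.
const sample = `
<rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
<pdfaid:part>1</pdfaid:part>
<pdfaid:conformance>B</pdfaid:conformance>
</rdf:Description>`;
console.log(readPdfaLevel(sample)); // { part: 1, conformance: 'B' }
```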