pdf-lib: Convert to PDF/A

Describe your idea

An option to convert a PDF to PDF/A

How could this be implemented?

A function

What problem are you trying to solve?

Converting a PDF to PDF/A

Why does this matter to you?

.

Would others find this helpful?

Yes

Are you interested in implementing your proposal?

Yes

Why are you submitting a proposal?

Because you don't have it yet

Additional Notes

No response

Mar 03 '22 18:03 releandro15

PDF/A is already possible with pdf-lib. See #230 for reference. There is no built-in function, but it can easily be achieved following the comments.

Mar 07 '22 16:03 Simolation

The solution presented there is very difficult to follow. Would it be possible to provide documentation that explains, step by step, how to do it?

Mar 08 '22 18:03 releandro15

?

Mar 28 '22 13:03 releandro15

For the PDF/A standard:

Set the document ID

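// Node.js: randomBytes comes from the built-in crypto module
const crypto = require('crypto');
const { PDFHexString } = require('pdf-lib');

const documentId = crypto.randomBytes(16).toString('hex')
const id = PDFHexString.of(documentId)
pdfDoc.context.trailerInfo.ID = pdfDoc.context.obj([id, id]);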
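
crypto.randomBytes is only available in Node.js. As a minimal sketch for environments that expose the Web Crypto API (browsers and recent Node.js versions), the same 16-byte hex ID could be generated as shown here. The helper name randomHexId is hypothetical and not part of pdf-lib.

```javascript
// Hypothetical helper (not part of pdf-lib): returns `byteLength` random bytes as lowercase hex.
// Uses the global Web Crypto API when present and falls back to Node's webcrypto export.
const webCrypto = globalThis.crypto ?? require('node:crypto').webcrypto;

function randomHexId(byteLength = 16) {
  const bytes = new Uint8Array(byteLength);
  webCrypto.getRandomValues(bytes);
  // Each byte becomes exactly two hex characters
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
}

const documentId = randomHexId(); // 32 hex characters for the default 16 bytes
```

The resulting string can be passed to PDFHexString.of in the same way as the Node.js version above.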

Add an embedded font (check here)
Set the print profile (check this PR #1512)
Set the trim box on each page: page.setTrimBox(0, 0, width, height)
Use colors that match the print profile. If the print profile is RGB, use only RGB colors throughout the document.
Add correct metadata (see example below)
The metadata must contain the same values as the document information

// set document information
const createDate = new Date();
pdfDoc.setTitle(title);
pdfDoc.setAuthor(author);
pdfDoc.setProducer(producer);
pdfDoc.setCreator(creator);
pdfDoc.setCreationDate(createDate);
pdfDoc.setModificationDate(createDate);
_addMetadata(pdfDoc, createDate, documentId, title, author, producer, creator);

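// Builds an XMP packet that mirrors the document information and attaches it
// to the catalog. Requires PDFName from 'pdf-lib'.
function _addMetadata(pdfDoc, date, documentId, title, author, producer, creator) {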
    const metadataXML = `
    <?xpacket begin="" id="${documentId}"?>
      <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.2-c001 63.139439, 2010/09/27-13:37:26        ">
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

          <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
            <dc:format>application/pdf</dc:format>
            <dc:creator>
              <rdf:Seq>
                <rdf:li>${author}</rdf:li>
              </rdf:Seq>
            </dc:creator>
            <dc:title>
               <rdf:Alt>
                  <rdf:li xml:lang="x-default">${title}</rdf:li>
               </rdf:Alt>
            </dc:title>
          </rdf:Description>

          <rdf:Description rdf:about="" xmlns:xmp="http://ns.adobe.com/xap/1.0/">
            <xmp:CreatorTool>${creator}</xmp:CreatorTool>
            <xmp:CreateDate>${_formatDate(date)}</xmp:CreateDate>
            <xmp:ModifyDate>${_formatDate(date)}</xmp:ModifyDate>
            <xmp:MetadataDate>${_formatDate(date)}</xmp:MetadataDate>
          </rdf:Description>

          <rdf:Description rdf:about="" xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
            <pdf:Producer>${producer}</pdf:Producer>
          </rdf:Description>

          <rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
            <pdfaid:part>1</pdfaid:part>
            <pdfaid:conformance>B</pdfaid:conformance>
          </rdf:Description>
        </rdf:RDF>
      </x:xmpmeta>
    <?xpacket end="w"?>
    `.trim();

    const metadataStream = pdfDoc.context.stream(metadataXML, {
      Type: 'Metadata',
      Subtype: 'XML',
      Length: metadataXML.length,
    });
    const metadataStreamRef = pdfDoc.context.register(metadataStream);
    pdfDoc.catalog.set(PDFName.of('Metadata'), metadataStreamRef);
  }
  
  // Format the date as ISO 8601 without milliseconds
  function _formatDate(date) {
    return date.toISOString().split('.')[0] + 'Z';
  }
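
The template above interpolates title, author, producer and creator directly into XML, so a value containing & or < would produce a malformed packet. A minimal sketch of an escaping step that could be applied to each value before interpolation follows; escapeXml and formatXmpDate are hypothetical names, and formatXmpDate only repeats the logic of _formatDate so that the block runs on its own.

```javascript
// Hypothetical helper: escapes the five XML special characters so that
// user-supplied values cannot break the XMP packet.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;') // must run first so later entities are not double-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Same formatting as _formatDate: ISO 8601 in UTC with the milliseconds removed.
function formatXmpDate(date) {
  return date.toISOString().split('.')[0] + 'Z';
}

console.log(escapeXml('Q&A <draft>')); // Q&amp;A &lt;draft&gt;
console.log(formatXmpDate(new Date(Date.UTC(2023, 7, 19, 18, 8, 30, 123)))); // 2023-08-19T18:08:30Z
```

Only the XMP copy needs escaping; the values passed to setTitle, setAuthor and the other setters stay unescaped.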

Don't forget to change the PDF/A version number in the metadata to match the level you are targeting.

This is an example for PDF/A-1B:

 <pdfaid:part>1</pdfaid:part>
 <pdfaid:conformance>B</pdfaid:conformance>
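
Rather than editing the template by hand, the identification block could be generated from two parameters. This is only a sketch: pdfaIdDescription is a hypothetical name, and setting these values merely declares a conformance level; the file must still meet the requirements of that level.

```javascript
// Hypothetical helper: builds the pdfaid description for a given PDF/A part
// (for example 1 or 2) and conformance level (for example 'B').
function pdfaIdDescription(part, conformance) {
  return [
    '<rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">',
    `  <pdfaid:part>${part}</pdfaid:part>`,
    `  <pdfaid:conformance>${conformance}</pdfaid:conformance>`,
    '</rdf:Description>',
  ].join('\n');
}

console.log(pdfaIdDescription(2, 'B'));
```

The returned string can replace the corresponding rdf:Description element in the metadata template.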

PDF/A-1 is based on PDF 1.4, which predates object streams (they were introduced in PDF 1.5). So when you save the document, disable useObjectStreams:

const pdfBytes = await pdfDoc.save({
   useObjectStreams: false,
})

If you would like to read the metadata from an existing document:

const metadata = pdfDoc.catalog.lookup(PDFName.of('Metadata'));
const textDecoder = new TextDecoder();
const text = textDecoder.decode(metadata.contents);
console.log(text);
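
Once the metadata text is decoded, the declared PDF/A level can be read from it. The sketch below assumes the element form used in the example above (some tools write these values as attributes instead, which this does not handle); readPdfaId is a hypothetical name.

```javascript
// Hypothetical helper: extracts the declared PDF/A part and conformance level
// from decoded XMP text. Returns null when either element is missing.
function readPdfaId(xmpText) {
  const part = /<pdfaid:part>\s*(\d+)\s*<\/pdfaid:part>/.exec(xmpText);
  const conformance = /<pdfaid:conformance>\s*([A-Za-z]+)\s*<\/pdfaid:conformance>/.exec(xmpText);
  if (!part || !conformance) return null;
  return { part: Number(part[1]), conformance: conformance[1].toUpperCase() };
}

const sample = '<pdfaid:part>1</pdfaid:part><pdfaid:conformance>B</pdfaid:conformance>';
console.log(readPdfaId(sample)); // { part: 1, conformance: 'B' }
```

A null result means the document does not declare a PDF/A level in this form; it says nothing about whether the file would pass validation.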

Aug 19 '23 18:08 necessarylion
