
findPageForAnnotationRef doesn't return page for some fields of AcroForm present on the page

Open arthabus opened this issue 2 years ago • 3 comments

What were you trying to do?

I'm trying to find the page of every field in an AcroForm.

How did you attempt to do it?

In Node, I use the following code:

let pdfBase64 = //base 64 representation of my pdf is attached in the link

	let pdfDataBuffer =  Buffer.from(pdfBase64, 'base64');
	let pdfDoc = await PDFDocument.load(pdfDataBuffer)
	let form = pdfDoc.getForm();
	let formFields = form.getFields();
	let pages = pdfDoc.getPages()

	let elementsMap = {}
	for (let field of formFields) {
		console.log("===")
		console.log("")
		let page = pdfDoc.findPageForAnnotationRef(field.ref);
		const type = field.constructor.name
		const name = field.getName()
		let pageNum = -1
		if(page){
			pageNum = pages.findIndex(pageThis => pageThis.ref.tag === page.ref.tag)
			console.log("page w/h = ", page.getWidth(), page.getHeight(), ", pageNum = ", pageNum)
		} else {
			console.log("pages.annotations missing field.ref.tag = ", field.ref.tag, ", field = ", field)
		}
		console.log(`convertFormFieldsToElements: fieldName = ${type}: ${name}, pageNum = ${pageNum}, field.ref = ${field.ref}`)
		let widgets = field.acroField.getWidgets()
		for (let i = 0; i < widgets.length; ++i) {
			let widget = widgets[i];
			let rect = widget.Rect().asRectangle()
			let elementsArray = elementsMap[pageNum] || []
			elementsArray.push({type, name, rect})
			elementsMap[pageNum] = elementsArray
			console.log("convertFormFieldsToElements: widget = ", widget, ", rect = ", rect)
		}

	}
	console.log("convertFormFieldsToElements: elementsMap = ", elementsMap)

What actually happened?

pdfDoc.findPageForAnnotationRef(field.ref) doesn't return a page for some fields.

Comparing field.ref.tag of the problematic fields and all the tags of the pages, there is indeed no match, which makes me think something is wrong with the way the tags are assigned.

What did you expect to happen?

pdfDoc.findPageForAnnotationRef(field.ref) returns the corresponding page for all 13 AcroForm fields.

How can we reproduce the issue?

Link to the test pdf
Link to the test pdf in base64

Run the below code in node:

let pdfBase64 = //base 64 representation of my pdf is attached in [the link](https://drive.google.com/file/d/1G6sHuYXb2QU1L-rJHwKtSOboMVWCuK47/view?usp=drive_link)

async function convertFormFieldsToElements(pdfBase64){

	let pdfDataBuffer =  Buffer.from(pdfBase64, 'base64');
	let pdfDoc = await PDFDocument.load(pdfDataBuffer)
	let form = pdfDoc.getForm();
	let formFields = form.getFields();
	let pages = pdfDoc.getPages()

	let elementsMap = {}
	for (let field of formFields) {
		console.log("===")
		console.log("")
		let page = pdfDoc.findPageForAnnotationRef(field.ref);
		const type = field.constructor.name
		const name = field.getName()
		let pageNum = -1
		if(page){
			pageNum = pages.findIndex(pageThis => pageThis.ref.tag === page.ref.tag)
			console.log("page w/h = ", page.getWidth(), page.getHeight(), ", pageNum = ", pageNum)
		} else {
			console.log("pages.annotations missing field.ref.tag = ", field.ref.tag, ", field = ", field)
		}
		console.log(`convertFormFieldsToElements: fieldName = ${type}: ${name}, pageNum = ${pageNum}, field.ref = ${field.ref}`)
		let widgets = field.acroField.getWidgets()
		for (let i = 0; i < widgets.length; ++i) {
			let widget = widgets[i];
			let rect = widget.Rect().asRectangle()
			let elementsArray = elementsMap[pageNum] || []
			elementsArray.push({type, name, rect})
			elementsMap[pageNum] = elementsArray
			console.log("convertFormFieldsToElements: widget = ", widget, ", rect = ", rect)
		}

	}
	console.log("convertFormFieldsToElements: elementsMap = ", elementsMap)
}

Version

1.17.1

What environment are you running pdf-lib in?

Node

Checklist

  • [X] My report includes a Short, Self Contained, Correct (Compilable) Example.
  • [X] I have attached all PDFs, images, and other files needed to run my SSCCE.

Additional Notes

No response

arthabus avatar Jul 19 '23 10:07 arthabus

I have the same problem. In my case, I could find the page with pages.find((x) => x.ref === widget.P()). I ended up doing:

const page = doc.findPageForAnnotationRef(field.ref) ?? pages.find((x) => x.ref === widget.P());

jean343 avatar Aug 18 '24 18:08 jean343

I am having the same problem. widget.P() is also returning undefined:

for (const pdfField of pdfDoc.getForm().getFields()) {
  for (const widget of pdfField.acroField.getWidgets()) {
    const p = widget.P()
    if (p === undefined) {
      // This is always true
    }
  }
}

mymattcarroll avatar Mar 04 '25 05:03 mymattcarroll

I'm seeing the same issue. Getting the page ref with widget.P() worked for one document I had, but for an almost identical document the page ref could not be found. Is there anything one can do to fix this or mitigate the undefined page ref?

larseen avatar Nov 10 '25 16:11 larseen
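
A possible mitigation, offered as a sketch rather than a confirmed fix: the snippets above pass field.ref to findPageForAnnotationRef, but a page's annotation list holds widget refs, and a field whose widgets are separate child objects would not be listed there under its own ref. The sketch below resolves the page per widget instead: it tries widget.P() first, then falls back to looking up the widget's own ref. The function name mapWidgetsToPages is hypothetical. pdfDoc.context.getObjectRef and widget.getRectangle() do not appear in this thread; they are assumed to exist in pdf-lib 1.17.1 and should be checked against the installed version. A widget that is not listed on any page would still end up under -1.

```javascript
// Sketch: map every form widget to a page index, resolving the page per widget
// rather than per field. `pdfDoc` is assumed to be the result of PDFDocument.load().
function mapWidgetsToPages(pdfDoc) {
  const pages = pdfDoc.getPages();
  const elementsMap = {};

  for (const field of pdfDoc.getForm().getFields()) {
    const type = field.constructor.name;
    const name = field.getName();

    for (const widget of field.acroField.getWidgets()) {
      // 1. Try the widget's /P entry, which is optional and may be undefined.
      const pageRef = widget.P();
      let page = pageRef ? pages.find((p) => p.ref === pageRef) : undefined;

      // 2. Fall back to searching the pages' annotations for the widget's own ref.
      //    Assumption: context.getObjectRef returns the ref of an indirect object.
      if (!page) {
        const widgetRef = pdfDoc.context.getObjectRef(widget.dict);
        if (widgetRef) page = pdfDoc.findPageForAnnotationRef(widgetRef);
      }

      // -1 collects widgets whose page could not be resolved either way.
      const pageNum = page ? pages.indexOf(page) : -1;
      if (!elementsMap[pageNum]) elementsMap[pageNum] = [];
      // Assumption: getRectangle() returns { x, y, width, height } for the widget.
      elementsMap[pageNum].push({ type, name, rect: widget.getRectangle() });
    }
  }

  return elementsMap;
}
```

Usage would be along the lines of const elementsMap = mapWidgetsToPages(await PDFDocument.load(pdfDataBuffer)). Checking widget.P() first keeps the common case cheap, and the ref lookup only runs for widgets that lack that entry.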