PyPDF2
getNumPages fails on encrypted PDF
I'm not an expert on the PDF file format, but I think a PDF file contains a "/Page" entry for each page in it, and that this is visible even if the file is protected.
There is also the "/Type /Pages" entry, which gives a "/Count" of the number of pages in the document and is visible in a protected file too.
So why is the getNumPages method so complicated? What am I missing?
You are correct. This is the most logical approach, since PDFs are required to include a root Document Catalog, which in turn is required to point to a Page Tree dictionary that contains the number of pages (current as of PDF 1.7). getNumPages() should now work with encrypted PDFs.
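
For illustration, the idea from the question can be sketched with the standard library alone. This is not how PyPDF2 implements getNumPages(); the function name `count_pages_raw` and the sample bytes are made up for this example. It assumes the page tree objects appear as plain text in the file, which may not hold for PDFs that store objects in compressed object streams.

```python
import re
from typing import Optional

# Hypothetical helper, not part of PyPDF2: scan raw PDF bytes for
# "/Type /Pages" dictionaries and read their "/Count" entries.
_OBJ_RE = re.compile(rb"\d+\s+\d+\s+obj(.*?)endobj", re.DOTALL)
_PAGES_RE = re.compile(rb"/Type\s*/Pages\b")
_COUNT_RE = re.compile(rb"/Count\s+(\d+)")


def count_pages_raw(data: bytes) -> Optional[int]:
    """Return the largest /Count found in any /Type /Pages object, or None."""
    counts = []
    for match in _OBJ_RE.finditer(data):
        body = match.group(1)
        # Only page tree nodes ("/Pages"), not individual "/Page" objects.
        if _PAGES_RE.search(body):
            count = _COUNT_RE.search(body)
            if count:
                counts.append(int(count.group(1)))
    # The root page tree node covers all pages, so it has the largest count.
    return max(counts) if counts else None


# Tiny hand-written fragment that mimics the relevant objects of a PDF.
sample = b"""%PDF-1.4
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages /Kids [3 0 R 4 0 R] /Count 2 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R >>
endobj
4 0 obj
<< /Type /Page /Parent 2 0 R >>
endobj
"""

print(count_pages_raw(sample))  # prints 2
```

On a real file you would pass the result of `open(path, "rb").read()`. Treat the result as a rough check only; a proper parser should follow the trailer to the Document Catalog and then to its /Pages entry.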