
PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?

Open KaifAhmad1 opened this issue 1 year ago • 0 comments

Here is my issue in brief:

import os
from pathlib import Path

from langchain_community.document_loaders import UnstructuredPDFLoader

# Append the local poppler binaries (a Windows path) to PATH.
poppler_path = 'C:\\Users\\Mohd Kaif\\Downloads\\poppler-23.08.0\\Library\\bin'
os.environ["PATH"] += os.pathsep + poppler_path

directory = '/content/drive/MyDrive/History_QA_dataset'

def load_files(directory):
    # List every entry in the dataset directory.
    return list(Path(directory).iterdir())

documents = load_files(directory)
print(len(documents))
documents

# Load a single PDF from the dataset.
loader = UnstructuredPDFLoader("/content/drive/MyDrive/History_QA_dataset/ncert_s_modern_india_bipan_chandra_old_edition-1566975158976.pdf")
pages = loader.load()

Running this raises the following error:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pdf2image/pdf2image.py in pdfinfo_from_path(pdf_path, userpw, ownerpw, poppler_path, rawdates, timeout, first_page, last_page)
    580             env["LD_LIBRARY_PATH"] = poppler_path + ":" + env.get("LD_LIBRARY_PATH", "")
--> 581         proc = Popen(command, env=env, stdout=PIPE, stderr=PIPE)
    582 

14 frames
FileNotFoundError: [Errno 2] No such file or directory: 'pdfinfo'

During handling of the above exception, another exception occurred:

PDFInfoNotInstalledError                  Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pdf2image/pdf2image.py in pdfinfo_from_path(pdf_path, userpw, ownerpw, poppler_path, rawdates, timeout, first_page, last_page)
    605 
    606     except OSError:
--> 607         raise PDFInfoNotInstalledError(
    608             "Unable to get page count. Is poppler installed and in PATH?"
    609         )

PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?

Additional information:
Python version: 3.10.10
Operating system: Windows 11
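
A possible diagnostic, offered as a sketch rather than a confirmed fix: the paths in the traceback (/usr/local/lib/python3.10/dist-packages and /content/drive) suggest the code ran in a Linux runtime such as Google Colab, not on the Windows 11 machine, in which case a Windows poppler directory appended to PATH would not be found there. The snippet below only checks whether the `pdfinfo` executable that pdf2image calls is resolvable from the current process. The helper name `find_pdfinfo` and its `extra_dirs` parameter are hypothetical and not part of pdf2image or LangChain.

```python
import os
import shutil
import sys


def find_pdfinfo(extra_dirs=()):
    """Return the full path of the `pdfinfo` executable, or None if not found.

    `extra_dirs` (hypothetical parameter) lists directories to search
    before the directories already on PATH.
    """
    search_path = os.pathsep.join([*extra_dirs, os.environ.get("PATH", "")])
    # shutil.which applies the platform's lookup rules, so on Windows
    # it also matches pdfinfo.exe via PATHEXT.
    return shutil.which("pdfinfo", path=search_path)


# Report which platform the code is actually running on and whether
# poppler's pdfinfo can be located from here.
print("Platform:", sys.platform)
print("pdfinfo found at:", find_pdfinfo())
```

If the platform prints as `linux` and `pdfinfo` is not found, poppler is presumably missing from that runtime; on Debian or Ubuntu based images the `poppler-utils` package provides `pdfinfo`. If the code really does run on Windows, passing the directory from `poppler_path` as `extra_dirs` shows whether that directory contains the executable.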

KaifAhmad1 • Jan 20 '24 11:01