nextjs-pdf-parser
                                
                        Next.js template for seamless PDF parsing using pdf2json and FilePond. Ideal for developers seeking a ready-to-use solution for PDF content extraction in Next.js projects.
Next.js PDF Parser Template 📄🔍
https://github.com/tuffstuff9/nextjs-pdf-parser/assets/57072903/c9e5e5eb-ceeb-4947-b26c-11f87bb26312
Introduction
I was having some trouble parsing PDFs in Next.js, so I made this template for anyone facing the same issues. I hope it saves you some time and trouble. It's a basic create-next-app project with PDF parsing implemented using the pdf2json library and file uploading handled by FilePond.
Installation & Setup 🚀
- Clone the repository:
  git clone [repository-url]
- Navigate to the project directory:
  cd nextjs-pdf-parser
- Install dependencies:
  npm install # or yarn install
- Windows only: In app\api\upload\route.ts, on line 22, change tempFilePath to a valid path. Make sure it starts from the root drive, for example: C:/coding/nextjs-pdf-parser/public/${fileName}.pdf
- Run the development server:
  npm run dev # or yarn dev
  Visit http://localhost:3000 to view the application.
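The Windows-only step exists because the temporary file path is hard-coded. One possible alternative, sketched below, is to build the path from the operating system's temp directory so that the same code works on every platform. This is not what the template does: `buildTempPdfPath` is a hypothetical helper, and storing the file in the OS temp directory rather than in `public/` is an assumption.

```typescript
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper (not part of the template): returns a path for the
// uploaded PDF inside the OS temp directory, so no drive letter or
// platform-specific separator has to be hard-coded.
function buildTempPdfPath(fileName: string): string {
  return join(tmpdir(), `${fileName}.pdf`);
}

// Example: log the path that would be used for an upload named "report".
console.log(buildTempPdfPath("report"));
```

If the file must live inside the project (as in the Windows example above), the same `join` call could take `process.cwd()` and `"public"` in place of `tmpdir()`.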
Usage 🖱
Navigate to http://localhost:3000 and use the FilePond uploader to select and upload a PDF. Once uploaded, the PDF's content is parsed and printed to the server console (note: it is not printed to the browser console).
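The parse step described above can be sketched as a small Promise wrapper around pdf2json's event-based API. The event names (`pdfParser_dataReady`, `pdfParser_dataError`) and the methods `loadPDF` and `getRawTextContent` come from pdf2json; the `PdfParserLike` interface and the `parsePdfText` function are hypothetical names used only for illustration, and the route handler in this template may be structured differently.

```typescript
// Minimal structural type covering the parts of a pdf2json parser used here.
// `PdfParserLike` is a hypothetical name, not something pdf2json exports.
interface PdfParserLike {
  on(event: string, handler: (payload?: any) => void): unknown;
  loadPDF(filePath: string): unknown;
  getRawTextContent(): string;
}

// Wraps the event-based parse in a Promise that resolves with the raw text
// or rejects with the parser's error.
function parsePdfText(parser: PdfParserLike, filePath: string): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    parser.on("pdfParser_dataError", (err) => reject(err?.parserError ?? err));
    parser.on("pdfParser_dataReady", () => resolve(parser.getRawTextContent()));
    // Listeners are attached before parsing starts.
    parser.loadPDF(filePath);
  });
}
```

In a route handler, this might be called as `console.log(await parsePdfText(pdfParser, tempFilePath))`, with `pdfParser` created via `new (PDFParser as any)(null, 1)`; the output then appears in the server console.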
Technical Details 🛠
- nodeUtil is not defined error: To bypass the nodeUtil is not defined error, the following configuration was added to next.config.js:
const nextConfig = {
  experimental: {
    serverComponentsExternalPackages: ['pdf2json'],
  },
};
module.exports = nextConfig;
See more details here
- Blank output from pdfParser.getRawTextContent(): This issue might be due to incorrect type definitions. There are two potential solutions:
  - Fix TypeScript definitions: Update the type definition for PDFParser.
  - Bypass type checking: Instantiate PDFParser as shown: const pdfParser = new (PDFParser as any)(null, 1);
  For more details, refer to my comment on this GitHub issue.
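A note on the nodeUtil workaround above: it relies on the `experimental.serverComponentsExternalPackages` key. In Next.js 15 that option was renamed to the top-level `serverExternalPackages`, so on a newer Next.js version the equivalent configuration would likely look like the sketch below. Writing it as `next.config.ts` is an assumption about the project setup; the template itself uses `next.config.js`.

```typescript
// next.config.ts (assumed file name; TypeScript config files are supported from Next.js 15)
// Keeps pdf2json out of the server bundle so it is loaded through Node's own require.
const nextConfig = {
  serverExternalPackages: ["pdf2json"],
};

export default nextConfig;
```

On the Next.js version this template was built with, the `experimental` form shown earlier remains the one to use.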
Acknowledgements 🙏
A special thanks to the following libraries and their contributors:
- FilePond: For providing a seamless and user-friendly file uploading experience.
- pdf2json: For its efficient and robust PDF parsing capabilities.
License 📜
MIT License