pdf-keywords-extractor
🤖 PDF Keywords Extractor 🤖
What is it?
An automation that checks whether the given PDFs contain the specified keywords and outputs the result as a CSV file.
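To make the idea concrete, here is a minimal Python sketch of the keyword check and CSV output. It is not the code this robot runs. It assumes the text has already been extracted from each PDF, and the function names (`find_keywords`, `build_csv`), the case-insensitive substring matching, and the yes/no CSV layout are all illustrative assumptions rather than the actual format produced by the assistant.

```python
import csv
import io


def find_keywords(text, keywords):
    """Map each keyword to True if it occurs in the text (case-insensitive substring match)."""
    lowered = text.casefold()
    return {kw: kw.casefold() in lowered for kw in keywords}


def build_csv(texts_by_file, keywords):
    """Build CSV content with one row per file and one column per keyword (assumed layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["file", *keywords])
    for filename, text in texts_by_file.items():
        hits = find_keywords(text, keywords)
        writer.writerow([filename, *("yes" if hits[kw] else "no" for kw in keywords)])
    return buffer.getvalue()


# Hypothetical input: text that was already extracted from each PDF.
sample = {
    "novel.pdf": "The treasure is hidden on the island of Monte Cristo.",
    "report.pdf": "Quarterly revenue grew by four percent.",
}
print(build_csv(sample, ["treasure", "revenue"]))
```

Because the match is a plain substring test, a keyword such as "treasure" would also match "treasures"; whole-word matching would need a different check.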
Show Me!
https://user-images.githubusercontent.com/10613140/161422053-14d1a21a-1018-47d2-aeed-79e702d0eff6.mp4
How to use the PDF Keywords Extractor
Via User-Interface (for technical and non-technical users)
Prerequisites:
- A Robocorp account – necessary to download the assistant, available under the free plan without needing to provide a credit card
- Robocorp Assistant
Once Robocorp Assistant is downloaded and installed, click on Install a community assistant and paste in the URL of this repository: https://github.com/bendersej/pdf-keywords-extractor
Via Command-Line (for technical users)
Prerequisite:
- RCC, the Robocorp command-line tool that provides the rcc command
From the root of this repository, run the following command:
rcc run
Known issues
Extracting the text from large PDF files currently takes a significant amount of time.
For example, it takes roughly 1 minute and 10 seconds to extract keywords from The Count of Monte Cristo.
Contributing
Via Pull Request
Feel free to open a new pull request with your proposed feature.
Via Issue
If you don't have the skills or the time, feel free to open an Issue describing the feature you would like to see implemented.