Auto-GPT-Plugins
Add a PDF plugin to read pdfs from the browser
⚠️ Search for existing issues first ⚠️
- [X] I have searched the existing issues, and there is no existing issue for my problem
Which Operating System are you using?
Windows
GPT-3 or GPT-4?
GPT-4
Steps to reproduce 🕹
Auto-GPT is not able to analyze pages whose URLs end with .pdf, for example:
https://www.nasa.gov/sites/default/files/atoms/files/2015_nasa_technology_roadmaps_ta_3_space_power_energy_storage_final.pdf
Every time, the following log is observed:
DevTools listening on ws://127.0.0.1:58166/devtools/browser/f20cd7e3-fffc-4243-a948-0ca83d0a31b3
[10212:7268:0419/141719.510:ERROR:device_event_log_impl.cc(222)] [14:17:19.510] USB: usb_device_handle_win.cc:1046 Failed to read descriptor from node connection: Un dispositivo collegato al sistema non è in funzione. (0x1F)
[10212:7268:0419/141719.517:ERROR:device_event_log_impl.cc(222)] [14:17:19.517] USB: usb_device_handle_win.cc:1046 Failed to read descriptor from node connection: Un dispositivo collegato al sistema non è in funzione. (0x1F)
SYSTEM: Command browse_website returned: ('Answer gathered from website: Error: No text to summarize \n \n Links: []', <selenium.webdriver.chrome.webdriver.WebDriver (session="8bc1bce877b79a5e19efacaa584b64a2")>)
Current behavior 😯
Auto-GPT is not able to correctly read web pages that contain .pdf files
Expected behavior 🤔
Auto-GPT should be able to correctly read web pages whose URLs end with .pdf
Your prompt 📝
# Paste your prompt here
Your Logs 📒
<insert your logs here>
Save the PDF file and open it with Notepad, and you'll see why Auto-GPT was unable to read it.
PDFs are a major pain and will likely require a PDF extractor library and OCR to get all the important information out. Math equations in particular are difficult.
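To illustrate the point about PDFs needing their own extractor, here is a minimal sketch of how a caller could decide that a URL should go to a PDF reader instead of the HTML scraper. The function names (`url_looks_like_pdf`, `bytes_look_like_pdf`, `choose_reader`) are hypothetical and are not part of Auto-GPT. The check relies only on the `.pdf` suffix and on the `%PDF-` header that PDF files start with.

```python
from urllib.parse import urlparse

# PDF files begin with this header, which is what you see at the top in Notepad.
PDF_MAGIC = b"%PDF-"


def url_looks_like_pdf(url: str) -> bool:
    """Return True if the URL path ends with .pdf (case-insensitive)."""
    return urlparse(url).path.lower().endswith(".pdf")


def bytes_look_like_pdf(data: bytes) -> bool:
    """Return True if the downloaded payload starts with the PDF header."""
    return data.startswith(PDF_MAGIC)


def choose_reader(url: str, first_bytes: bytes = b"") -> str:
    """Pick a (hypothetical) reader name: 'pdf' or 'html'."""
    if bytes_look_like_pdf(first_bytes) or url_looks_like_pdf(url):
        return "pdf"
    return "html"
```

Checking the first bytes as well as the suffix covers download links that serve a PDF without `.pdf` in the path.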
Could it be useful to implement something like this?
import PyPDF2

# Open the PDF in binary mode ('rb') and create a reader for it.
# PdfReader, .pages and .extract_text() replace the deprecated
# PdfFileReader, numPages/getPage and extractText names.
with open('1.pdf', 'rb') as pdf_file:
    reader = PyPDF2.PdfReader(pdf_file)
    # Page indices start at 0, so getPage(x+1) would be out of range;
    # iterate over every page and collect its text instead.
    text = "\n".join(page.extract_text() or "" for page in reader.pages)

# Append the extracted text to a .txt file.
# The r prefix keeps the backslashes in the Windows path literal.
with open(r"C:\Users\SIDDHI\AppData\Local\Programs\Python\Python38\1.txt", "a", encoding="utf-8") as txt_file:
    txt_file.write(text)
I can personally recommend PyMuPDF: https://pymupdf.readthedocs.io/en/latest/
from typing import Generator

import fitz  # PyMuPDF


def read_pdf(path: str) -> Generator[str, None, None]:
    # Yield the text of each page, one page at a time
    flags = fitz.TEXT_PRESERVE_WHITESPACE | fitz.TEXT_INHIBIT_SPACES | fitz.TEXT_DEHYPHENATE | fitz.TEXT_MEDIABOX_CLIP
    with fitz.Document(path) as doc:
        for page in doc:
            yield page.get_textpage(flags=flags).extractText(sort=True)
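Since the request is about PDFs reached by URL, a page-by-page reader like the one above still needs a download step in front of it. The following is only a sketch under assumptions: `read_pdf_from_url` is a hypothetical helper, not an Auto-GPT or plugin API, and the extractor is passed in as a callable so that `read_pdf` or any other library can be plugged in.

```python
import tempfile
import urllib.request
from pathlib import Path
from typing import Callable, Iterable


def read_pdf_from_url(
    url: str,
    extract_pages: Callable[[str], Iterable[str]],
    timeout: float = 30.0,
) -> str:
    """Download a PDF to a temporary file and return its text.

    `extract_pages` is any function that takes a file path and yields
    the text of each page (for example, a read_pdf-style generator).
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()

    # Refuse payloads that do not start with the PDF header.
    if not data.startswith(b"%PDF-"):
        raise ValueError(f"{url} did not return a PDF")

    with tempfile.TemporaryDirectory() as tmp_dir:
        pdf_path = Path(tmp_dir) / "download.pdf"
        pdf_path.write_bytes(data)
        # Join inside the block so a lazy generator is consumed
        # before the temporary file is deleted.
        return "\n".join(extract_pages(str(pdf_path)))
```

With the generator above, a call would look like `read_pdf_from_url(url, read_pdf)`. The returned string could then be handed to whatever summarization step the agent already uses for web pages.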
seems worth a plugin later on
May be worth doing a title edit to: "Add a PDF plugin to read pdfs from the browser"
ChatPDF is a useful plugin that allows agents to ask questions through chat. I personally uploaded my profile to ChatPDF and it responded to questions effectively. My profile even contains diagrams.
Migrate to Auto-GPT-Plugins
https://github.com/Significant-Gravitas/Auto-GPT-Plugins
Is there an instruction manual for this plugin?