Add OCR to datasources that are images containing text

Open • p6l-richard opened this issue 1 year ago • 1 comment

⚠️ Please check that this feature request hasn't been suggested before.

  • [X] I searched previous Ideas in Discussions and didn't find any similar feature requests.
  • [X] I searched previous Issues and didn't find any similar feature requests.

🔖 Feature description

  • If a user uploads a datasource that's actually an image containing text (think: a PDF file that's a scanned document, or an image of a book page)
  • superagent OCRs the file to extract the text
  • so that the agent can read the extracted text of the datasource
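
To make the request concrete, here is a minimal sketch of the flow described above, not superagent's actual implementation. The `Datasource` class, `needs_ocr`, `load_text`, and the `min_chars` threshold are all hypothetical names invented for illustration. The OCR engine is passed in as a plain callable (for example, a wrapper around Tesseract), so the sketch does not assume any particular OCR library.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical list of file suffixes treated as plain images.
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".tiff", ".bmp")


@dataclass
class Datasource:
    """Hypothetical stand-in for an uploaded datasource."""
    name: str
    content: bytes
    # Text produced by the regular (non-OCR) parser; empty for scanned files.
    extracted_text: str = ""


def needs_ocr(ds: Datasource, min_chars: int = 20) -> bool:
    """Guess whether the datasource is an image of text rather than text."""
    if ds.name.lower().endswith(IMAGE_SUFFIXES):
        return True
    # A PDF with (almost) no text layer is probably a scan.
    return len(ds.extracted_text.strip()) < min_chars


def load_text(ds: Datasource, ocr: Callable[[bytes], str]) -> str:
    """Return text for the agent, falling back to OCR when needed."""
    if needs_ocr(ds):
        return ocr(ds.content)
    return ds.extracted_text
```

With this shape, a scanned PDF whose parser output is empty would be routed through `ocr`, while a PDF with a real text layer is returned unchanged. The length threshold is only a heuristic and would likely need tuning.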

📝 Additional Context

Currently, the agent says it doesn't have access to the data source. Here's a screen recording: https://discord.com/channels/1110910277110743103/1167829464189779999/1167829464189779999

Acknowledgements

  • [X] My issue title is concise and descriptive.
  • [X] I have searched the existing issues to make sure this feature has not been requested yet.
  • [X] I have provided enough information for the maintainers to understand and evaluate this request.

p6l-richard avatar Oct 28 '23 14:10 p6l-richard

@p6l-richard What do you think about adding a flag to the datasource in which a user can enable using OCR?

homanp avatar Oct 30 '23 08:10 homanp