add huggingface image-to-text api for image description

Open neuralsignal opened this issue 2 years ago • 2 comments

Background

Adds an image description/summary command that uses the Hugging Face Inference API to call an image-to-text model. It is mainly intended for use with GPT-3.5 rather than GPT-4.

Changes

Added a new Python file, image_text.py, containing the summarize_image function, which calls the Hugging Face API. Updated prompt.py, the env template, config.py, and commands.py to include the summarize_image function and make it functional.
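For readers without access to the diff, here is a minimal sketch of what a summarize_image function along these lines could look like. It is not the code from this PR. The endpoint pattern, the default model name, the response shape, and the `post` parameter (added here only so the function can be exercised without network access) are all assumptions made for illustration.

```python
import json
import urllib.request
from pathlib import Path
from typing import Callable

# Assumed endpoint pattern for the hosted Hugging Face Inference API.
API_URL_TEMPLATE = "https://api-inference.huggingface.co/models/{model}"
# Hypothetical default; any hosted image-to-text model could be substituted.
DEFAULT_MODEL = "Salesforce/blip-image-captioning-base"


def _http_post(url: str, data: bytes, headers: dict) -> bytes:
    """POST raw bytes to url and return the raw response body."""
    request = urllib.request.Request(url, data=data, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def summarize_image(
    image_path: str,
    api_token: str,
    model: str = DEFAULT_MODEL,
    post: Callable[[str, bytes, dict], bytes] = _http_post,
) -> str:
    """Return a text description of the image stored at image_path."""
    image_bytes = Path(image_path).read_bytes()
    headers = {"Authorization": f"Bearer {api_token}"}
    raw = post(API_URL_TEMPLATE.format(model=model), image_bytes, headers)
    payload = json.loads(raw)
    # Assumption: image-to-text models answer with [{"generated_text": "..."}].
    if (
        isinstance(payload, list)
        and payload
        and isinstance(payload[0], dict)
        and "generated_text" in payload[0]
    ):
        return payload[0]["generated_text"]
    # Anything else (for example an error object) is reported back as text.
    return f"Error: unexpected response from image-to-text API: {payload!r}"
```

Returning an error string instead of raising keeps the result usable as plain command output for the agent; whether the actual implementation does this is not stated in the description above.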

Documentation

The changes are documented only in code comments.

Test Plan

I tested the changes by adding various images to the working directory, or by having AutoGPT copy images from the web into the working directory. I then asked the agent to summarize the images and create a story.

PR Quality Checklist

  • [x] My pull request is atomic and focuses on a single change.
  • [x] I have thoroughly tested my changes with multiple different prompts.
  • [x] I have considered potential risks and mitigations for my changes.
  • [x] I have documented my changes clearly and comprehensively.
  • [x] I have not snuck in any "extra" small tweaks or changes.

neuralsignal avatar Apr 15 '23 17:04 neuralsignal

@gucky92 There are conflicts

nponeccop avatar Apr 15 '23 17:04 nponeccop

Thank you @nponeccop! I resolved the conflicts to be in line with the new structure of the project.

neuralsignal avatar Apr 15 '23 18:04 neuralsignal

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

github-actions[bot] avatar Apr 17 '23 22:04 github-actions[bot]

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

github-actions[bot] avatar Apr 18 '23 09:04 github-actions[bot]

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

github-actions[bot] avatar Apr 19 '23 00:04 github-actions[bot]

This is a mass message from the AutoGPT core team. Our apologies for the ongoing delay in processing PRs. This is because we are re-architecting the AutoGPT core!

For more details (and for info on joining our Discord), please refer to: https://github.com/Significant-Gravitas/Auto-GPT/wiki/Architecting

p-i- avatar May 05 '23 00:05 p-i-

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

github-actions[bot] avatar Jul 07 '23 06:07 github-actions[bot]

Deploy Preview for auto-gpt-docs canceled.

Name Link
Latest commit ee122cdad56529bb043c73ebb4ca5ea06d46db09
Latest deploy log https://app.netlify.com/sites/auto-gpt-docs/deploys/64b7d41ebf0ff70008307316

netlify[bot] avatar Jul 07 '23 06:07 netlify[bot]

Codecov Report

Patch coverage: 5.00% and project coverage change: -0.21 :warning:

Comparison is base (a758ace) 51.02% compared to head (720f83a) 50.81%.

:exclamation: Current head 720f83a differs from pull request most recent head ee122cd. Consider uploading reports for the commit ee122cd to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1644      +/-   ##
==========================================
- Coverage   51.02%   50.81%   -0.21%     
==========================================
  Files         118      119       +1     
  Lines        4898     4904       +6     
  Branches      649      646       -3     
==========================================
- Hits         2499     2492       -7     
- Misses       2215     2231      +16     
+ Partials      184      181       -3     
Impacted Files Coverage Δ
autogpt/commands/image_text.py 0.00% <0.00%> (ø)
autogpt/config/config.py 80.72% <100.00%> (+0.11%) :arrow_up:

... and 4 files with indirect coverage changes

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

codecov[bot] avatar Jul 07 '23 06:07 codecov[bot]

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

github-actions[bot] avatar Jul 08 '23 15:07 github-actions[bot]

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

github-actions[bot] avatar Jul 08 '23 22:07 github-actions[bot]

You changed AutoGPT's behaviour. The cassettes have been updated and will be merged to the submodule when this Pull Request gets merged.

Auto-GPT-Bot avatar Jul 08 '23 22:07 Auto-GPT-Bot

You changed AutoGPT's behaviour. The cassettes have been updated and will be merged to the submodule when this Pull Request gets merged.

Auto-GPT-Bot avatar Jul 10 '23 18:07 Auto-GPT-Bot

@gucky92 any chance you could add an integration test for this functionality?

Pwuts avatar Jul 10 '23 21:07 Pwuts

@Pwuts I haven't worked on this repo for a bit, but I will try to get an integration test committed over the weekend.

neuralsignal avatar Jul 14 '23 21:07 neuralsignal

@Pwuts I have added a basic integration test. Currently the test loads images into the workspace; I don't know whether we want to save test images in the repo or handle this some other way. There are also other tests we could add, similar to the generate_image tests for example.

neuralsignal avatar Jul 16 '23 14:07 neuralsignal

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

github-actions[bot] avatar Sep 06 '23 15:09 github-actions[bot]

Closing old PRs

Swiftyos avatar Jun 28 '24 11:06 Swiftyos