Use visual browsing to improve search/browse performance

Open WangXing0801 opened this issue 1 year ago • 4 comments

Duplicates

  • [X] I have searched the existing issues

Summary 💡

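Sorry, my English is not good, so I am writing in my native language.

First, I have been following every step of AutoGPT's development. I often create AIs and ask them to do simple things, for example to find a high-definition map online, or to find a 3D model in a particular format. With this kind of resource-finding task, I have found that the AI struggles to complete the job. My guess is that although the AI can use the browser's search function, it cannot tell whether a site returned by the search actually displays the content it was looking for. Websites come in every shape, and a result link may already be a 404 in one form or another, so detecting this from page features alone is simply not feasible.

So I thought of MiniGPT, the vision model that was open-sourced a while ago. Could AutoGPT be combined with MiniGPT so that search results for this kind of task are checked visually, rounding out what is a very basic skill for an AI?

Also, there are now many free, open-source models. GPT-3.5 or GPT-4 may give more accurate answers, but they are not cheap. Could the step that calls third-party APIs be wrapped so that it is configurable, making a free, locally run intelligent personal assistant possible?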

Examples 🌈

None

Motivation 🔦

Free...

WangXing0801 avatar Apr 25 '23 08:04 WangXing0801

I support this. Maybe we can try it ourselves.

qiangziakbar avatar Apr 25 '23 10:04 qiangziakbar

I was interested, and since I don't read Chinese, here is a ChatGPT translation:

Summary

Sorry, my English is not good, so I left a message in my native language.

First of all, I have been following the development of AutoGPT closely. When using it, I often create AIs and ask them to do some simple tasks. For example, I would ask an AI to search the internet for a high-definition map or to look for a 3D model in a certain format. However, I have found that AIs have difficulty completing tasks like finding resources. I suspect that the reason for this is that although AIs can use the search function in browsers, they cannot distinguish whether the websites that come up after the search display the content that is needed. Websites vary greatly, and the links from the search results are likely to have different forms of 404 errors, so using feature detection is completely impossible.

Therefore, I thought of MiniGPT, an open-source model that was released some time ago. It is a visual neural network. Can AutoGPT combine with MiniGPT to perform visual neural network scanning of search results and improve it as a basic skill for an AI to search for resources?

In addition, there are many open-source models available now for free. Although GPT3.5 or GPT4 may provide more accurate answers, they are not cheap. Can the step of calling third-party APIs be encapsulated and customized so that it can achieve the goal of creating a free, localized, intelligent personal assistant?

ghost avatar Apr 25 '23 22:04 ghost

ChatGPT reduces the gap between people who speak different languages 😆

VinciLee1 avatar Apr 26 '23 06:04 VinciLee1

What an interesting use of the tool, will look into this a bit

ntindle avatar Apr 26 '23 06:04 ntindle

My understanding is that the suggestion is to enhance the browse_website command so that it also processes the pictures on a page, using an AI model like minigpt to get a text description of each one. The current browse_website command only cares about text.
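
A rough sketch of that idea, using only the Python standard library: collect the page text as today, but also collect every `<img>` URL and append a model-generated description for each. The `caption_image` callable is a placeholder for whatever vision model would be plugged in (minigpt or otherwise), and `describe_page` is a hypothetical helper, not the real browse_website implementation.

```python
from html.parser import HTMLParser
from typing import Callable, List
from urllib.parse import urljoin


class _PageParser(HTMLParser):
    """Collects visible text and absolute image URLs from an HTML page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.text_parts: List[str] = []
        self.image_urls: List[str] = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative image paths against the page URL.
                self.image_urls.append(urljoin(self.base_url, src))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def describe_page(
    html: str, base_url: str, caption_image: Callable[[str], str]
) -> str:
    """Return page text followed by one description line per image.

    `caption_image` is an assumed hook: it takes an image URL and returns
    a text description produced by some vision model.
    """
    parser = _PageParser(base_url)
    parser.feed(html)
    parser.close()
    lines = list(parser.text_parts)
    for url in parser.image_urls:
        lines.append(f"[image {url}: {caption_image(url)}]")
    return "\n".join(lines)
```

The combined text could then be summarized the same way the text-only output is today, so the agent would at least see whether the page actually shows the map or model it was looking for.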

kinance avatar May 19 '23 16:05 kinance

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

github-actions[bot] avatar Sep 06 '23 21:09 github-actions[bot]

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

github-actions[bot] avatar Oct 30 '23 01:10 github-actions[bot]

This issue was closed automatically because it has been stale for 10 days with no activity.

github-actions[bot] avatar Nov 09 '23 01:11 github-actions[bot]

This is referenced from #5179 so we won't forget about it :)

Pwuts avatar Nov 15 '23 16:11 Pwuts