Expanding Auto-GPT to accept multi-modality input to reach humanoid robot level

Open bharathraja opened this issue 2 years ago • 4 comments

Duplicates

  • [X] I have searched the existing issues

Summary 💡

Can Auto-GPT be expanded to take multi-modal input such as images, audio, and touch, and to act through a humanoid robot body? Modules for image object recognition, audio-to-text processing, and tokenisation of touch input into text would integrate all the senses into a single text format. This could make truly autonomous humanoid robots possible. It could be tested initially in simulated environments such as OpenAI Gym.

There is literature that has extended GPT's abilities to human action sequences, such as this one: https://actiongpt.github.io/
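
The modularization described above could be sketched roughly as follows. This is only an illustration of the idea, not anything Auto-GPT provides: `Observation`, `describe_touch`, and `build_prompt` are hypothetical names, the touch threshold is made up, and the image and audio descriptions are assumed to come from separate recognition and speech-to-text modules that are not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    """One sensor reading already reduced to text by a per-modality module (hypothetical)."""
    modality: str  # e.g. "image", "audio", "touch"
    text: str      # text description produced by that module


def describe_touch(pressure: float, threshold: float = 0.5) -> Observation:
    """Toy touch 'tokeniser': maps a normalised pressure in [0, 1] to a phrase.

    The threshold is an arbitrary placeholder, not a calibrated value.
    """
    label = "firm contact" if pressure >= threshold else "light or no contact"
    return Observation("touch", f"{label} (pressure={pressure:.2f})")


def build_prompt(observations: List[Observation], goal: str) -> str:
    """Merge all text-encoded senses into a single prompt for a language model."""
    lines = [f"Goal: {goal}", "Current observations:"]
    lines += [f"- [{o.modality}] {o.text}" for o in observations]
    lines.append("Next action:")
    return "\n".join(lines)


if __name__ == "__main__":
    obs = [
        # Assumed outputs of an object-recognition and a speech-to-text module.
        Observation("image", "a red cup on a table"),
        Observation("audio", "someone says 'pick up the cup'"),
        describe_touch(0.8),
    ]
    print(build_prompt(obs, "hand the cup to the user"))
```

The point of the design is that every sense ends up as plain text, so the language model itself needs no changes. The cost is that each per-modality module discards whatever it cannot put into words.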

Examples 🌈

No response

Motivation 🔦

The movie I, Robot

bharathraja avatar Apr 27 '23 09:04 bharathraja

GPT can't generate text anywhere near fast enough to react to a real-time environment, and if it could then we'd be broke from the token costs of generating detailed actions 60 times a second.
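
To put rough numbers on that, here is a back-of-envelope sketch. Only the 60 actions per second comes from the comment above. Tokens per action, price per 1,000 tokens, and generation speed are placeholder assumptions, not measured or quoted figures, so substitute real values before drawing conclusions.

```python
def tokens_per_hour(actions_per_second: float, tokens_per_action: int) -> float:
    """Tokens generated in one hour of continuous control."""
    return actions_per_second * tokens_per_action * 3600


def cost_per_hour(actions_per_second: float, tokens_per_action: int,
                  price_per_1k_tokens: float) -> float:
    """Hourly cost, in whatever currency price_per_1k_tokens is expressed in."""
    return tokens_per_hour(actions_per_second, tokens_per_action) / 1000 * price_per_1k_tokens


def max_action_rate(tokens_per_action: int, generation_tokens_per_second: float) -> float:
    """Upper bound on actions per second if generation speed is the only limit."""
    return generation_tokens_per_second / tokens_per_action


if __name__ == "__main__":
    ACTIONS_PER_SECOND = 60      # from the comment above
    TOKENS_PER_ACTION = 50       # placeholder assumption
    PRICE_PER_1K_TOKENS = 0.01   # placeholder assumption, not a real price
    GENERATION_SPEED = 30        # tokens per second, placeholder assumption

    print(tokens_per_hour(ACTIONS_PER_SECOND, TOKENS_PER_ACTION))
    print(cost_per_hour(ACTIONS_PER_SECOND, TOKENS_PER_ACTION, PRICE_PER_1K_TOKENS))
    print(max_action_rate(TOKENS_PER_ACTION, GENERATION_SPEED))
```

With these made-up inputs the model would have to emit 10.8 million tokens per hour, while generation speed alone would cap it at well under one action per second.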

zachary-kaelan avatar Apr 27 '23 21:04 zachary-kaelan

This isn’t the worst idea. It could be done if the gradio tools plug-in is fixed up a bit.

ntindle avatar Apr 27 '23 21:04 ntindle

I have embeddings fever and it's exciting to see that we got a latent space for motions going, but multimodal, spatiotemporal autoencoding and decoding isn't cheap.

zachary-kaelan avatar Apr 27 '23 22:04 zachary-kaelan

This should probably be renamed to something like "multi-modality"?

Boostrix avatar May 01 '23 09:05 Boostrix

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

github-actions[bot] avatar Sep 06 '23 21:09 github-actions[bot]

This issue was closed automatically because it has been stale for 10 days with no activity.

github-actions[bot] avatar Sep 19 '23 01:09 github-actions[bot]