Open Platform for Embodied Agents

LEGENT

Online Documentation | Web Demo (Coming Soon) | Paper (Coming Soon)

Introduction

LEGENT is a platform for embodied agents with a scalable training pipeline.

In the future, robots will perceive the environment as we do, communicate with us through natural language, and help us with our tasks. The platform is dedicated to developing robots that can chat, see, and act, moving from virtual worlds to the real world. We aspire to facilitate research in this field for anyone interested. LEGENT is a pioneering solution that combines large models with embodied agents, prioritizing ease of use and scalability. The platform focuses on developing the following functionalities:

  1. An easy-to-use environment that simulates a physical world, where an agent can interact with humans through language, receive egocentric vision, and perform physical actions.

  2. Automated generation of training data, including the generation of scenes, tasks, and ground-truth actions. The platform is tailored to train large multimodal models into embodied models, using data generated at scale from simulated worlds.
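To make the two functionalities above more concrete, here is a minimal, self-contained sketch of what one generated training sample might look like: a task paired with a sequence of (observation, ground-truth action) steps. Every name in it (`Observation`, `Action`, `Trajectory`, `generate_trajectory`) is hypothetical and chosen for illustration only; none of it is the LEGENT API, and the real data format may differ.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Observation:
    """What an agent might receive each step (hypothetical structure)."""
    image: List[List[int]]  # placeholder for an egocentric frame (H x W)
    text: str               # latest human utterance, "" if none


@dataclass
class Action:
    """What an agent might emit each step (hypothetical structure)."""
    text: str = ""             # language reply to the human
    move_forward: float = 0.0  # physical action: forward displacement
    grab: bool = False         # physical action: grab an object


@dataclass
class Trajectory:
    """One training sample: a task plus its ground-truth steps."""
    task: str
    steps: List[Tuple[Observation, Action]] = field(default_factory=list)


def generate_trajectory(task: str, num_steps: int = 3) -> Trajectory:
    """Toy stand-in for a data generator (assumption, not LEGENT code).

    The agent walks forward on every step except the last, where it
    grabs the object and replies to the human.
    """
    traj = Trajectory(task=task)
    for i in range(num_steps):
        is_last = i == num_steps - 1
        obs = Observation(
            image=[[0] * 4 for _ in range(4)],  # dummy 4x4 frame
            text=task if i == 0 else "",        # instruction arrives first
        )
        act = Action(
            text="Done." if is_last else "",
            move_forward=0.0 if is_last else 1.0,
            grab=is_last,
        )
        traj.steps.append((obs, act))
    return traj


if __name__ == "__main__":
    sample = generate_trajectory("Bring me the apple.")
    print(sample.task, len(sample.steps))
```

In a sketch like this, a multimodal model would be trained to predict each `Action` from the preceding observations, which is why the generator records ground-truth actions alongside the scene and task.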

LEGENT serves as the data engine for embodied models in robotics and games, as well as for generative world simulators like Sora.

Demonstration

Interact with the embodied agent within realistic scenes.

https://github.com/thunlp/LEGENT/assets/50205889/20657124-e2e6-434f-9315-bcbdce26e1f3

Interact with the embodied agent within stylized scenes.

https://github.com/thunlp/LEGENT/assets/50205889/e667bf3d-1dc5-4ed7-95b7-b3bf6ab60fdf

Scene generation for scalable training.

https://github.com/thunlp/LEGENT/assets/50205889/fafaa02e-1050-4dab-a43f-701bca1477b7

Important Note

The LEGENT code and documentation are currently being organized, alongside necessary bug fixes and improvements to existing features. The platform will be more convenient to use once this process is complete. If you want a more stable version, please stay tuned!