# Awesome-Embodied-AI
## Scene Understanding

### Image

| Name | Description | Paper | Code |
|---|---|---|---|
| SAM | Segmentation | https://arxiv.org/abs/2304.02643 | https://github.com/facebookresearch/segment-anything |
| YOLO-World | Open-Vocabulary Detection | https://arxiv.org/abs/2401.17270 | https://github.com/AILab-CVC/YOLO-World |
### Point Cloud

| Name | Description | Paper | Code |
|---|---|---|---|
| SAM3D | Segmentation | https://arxiv.org/abs/2306.03908 | https://github.com/Pointcept/SegmentAnything3D |
| PointMixer | Understanding | https://arxiv.org/abs/2111.11187 | https://github.com/LifeBeyondExpectations/PointMixer |
## Multi-Modal Grounding

| Name | Description | Paper | Code |
|---|---|---|---|
| GPT-4V | MLM (Image+Language→Language) | https://arxiv.org/abs/2303.08774 | |
| Claude 3 Opus | MLM (Image+Language→Language) | https://www.anthropic.com/news/claude-3-family | |
| GLaMM | Pixel Grounding | https://arxiv.org/abs/2311.03356 | https://github.com/mbzuai-oryx/groundingLMM |
| All-Seeing | Pixel Grounding | https://arxiv.org/abs/2402.19474 | https://github.com/OpenGVLab/all-seeing |
| LEO | 3D Grounding | https://arxiv.org/abs/2311.12871 | https://github.com/embodied-generalist/embodied-generalist |
## Data Collection

### From Video

| Name | Paper | Code |
|---|---|---|
| Vid2Robot | https://vid2robot.github.io/vid2robot.pdf | |
| RT-Trajectory | https://arxiv.org/abs/2311.01977 | |
| MimicPlay | https://mimic-play.github.io/assets/MimicPlay.pdf | https://github.com/j96w/MimicPlay |
### Hardware

| Name | Description | Paper | Code |
|---|---|---|---|
| UMI | Two-Fingers | https://arxiv.org/abs/2402.10329 | https://github.com/real-stanford/universal_manipulation_interface |
| DexCap | Five-Fingers | https://dex-cap.github.io/assets/DexCap_paper.pdf | https://github.com/j96w/DexCap |
| HIRO Hand | Hand-over-Hand | https://sites.google.com/view/hiro-hand | |
### Generative Simulation

| Name | Paper | Code |
|---|---|---|
| MimicGen | https://arxiv.org/abs/2310.17596 | https://github.com/NVlabs/mimicgen_environments |
| RoboGen | https://arxiv.org/abs/2311.01455 | https://github.com/Genesis-Embodied-AI/RoboGen |
## Action Output

### Generative Imitation Learning

| Name | Paper | Code |
|---|---|---|
| Diffusion Policy | https://arxiv.org/abs/2303.04137 | https://github.com/real-stanford/diffusion_policy |
| ACT | https://arxiv.org/abs/2304.13705 | https://github.com/tonyzhaozh/act |
### Affordance Map

#### Question & Answer from LLM

| Name | Paper | Code |
|---|---|---|
| CoPa | https://arxiv.org/abs/2403.08248 | |
| ManipLLM | https://arxiv.org/abs/2312.16217 | |
| ManipVQA | https://arxiv.org/abs/2403.11289 | https://github.com/SiyuanHuang95/ManipVQA |
### Language Corrections

| Name | Paper | Code |
|---|---|---|
| OLAF | https://arxiv.org/abs/2310.17555 | |
| YAY Robot | https://arxiv.org/abs/2403.12910 | https://github.com/yay-robot/yay_robot |
### Planning from LLM

| Name | Description | Paper | Code |
|---|---|---|---|
| SayCan | API Level | https://arxiv.org/abs/2204.01691 | https://github.com/google-research/google-research/tree/master/saycan |
| VILA | Prompt Level | https://arxiv.org/abs/2311.17842 | |