# Awesome-Embodied-AI
## Scene Understanding

### Image

| Name | Description | Paper | Code |
|---|---|---|---|
| SAM | Segmentation | https://arxiv.org/abs/2304.02643 | https://github.com/facebookresearch/segment-anything |
| YOLO-World | Open-Vocabulary Detection | https://arxiv.org/abs/2401.17270 | https://github.com/AILab-CVC/YOLO-World |
### Point Cloud

| Name | Description | Paper | Code |
|---|---|---|---|
| SAM3D | Segmentation | https://arxiv.org/abs/2306.03908 | https://github.com/Pointcept/SegmentAnything3D |
| PointMixer | Understanding | https://arxiv.org/abs/2111.11187 | https://github.com/LifeBeyondExpectations/PointMixer |
## Multi-Modal Grounding

| Name | Description | Paper | Code |
|---|---|---|---|
| GPT-4V | MLM (Image+Language→Language) | https://arxiv.org/abs/2303.08774 | |
| Claude 3 Opus | MLM (Image+Language→Language) | https://www.anthropic.com/news/claude-3-family | |
| GLaMM | Pixel Grounding | https://arxiv.org/abs/2311.03356 | https://github.com/mbzuai-oryx/groundingLMM |
| All-Seeing | Pixel Grounding | https://arxiv.org/abs/2402.19474 | https://github.com/OpenGVLab/all-seeing |
| LEO | 3D Grounding | https://arxiv.org/abs/2311.12871 | https://github.com/embodied-generalist/embodied-generalist |
## Data Collection

### From Video

| Name | Paper | Code |
|---|---|---|
| Vid2Robot | https://vid2robot.github.io/vid2robot.pdf | |
| RT-Trajectory | https://arxiv.org/abs/2311.01977 | |
| MimicPlay | https://mimic-play.github.io/assets/MimicPlay.pdf | https://github.com/j96w/MimicPlay |
### Hardware

| Name | Description | Paper | Code |
|---|---|---|---|
| UMI | Two-Fingers | https://arxiv.org/abs/2402.10329 | https://github.com/real-stanford/universal_manipulation_interface |
| DexCap | Five-Fingers | https://dex-cap.github.io/assets/DexCap_paper.pdf | https://github.com/j96w/DexCap |
| HIRO Hand | Hand-over-Hand | https://sites.google.com/view/hiro-hand | |
### Generative Simulation

| Name | Paper | Code |
|---|---|---|
| MimicGen | https://arxiv.org/abs/2310.17596 | https://github.com/NVlabs/mimicgen_environments |
| RoboGen | https://arxiv.org/abs/2311.01455 | https://github.com/Genesis-Embodied-AI/RoboGen |
## Action Output

### Generative Imitation Learning

| Name | Paper | Code |
|---|---|---|
| Diffusion Policy | https://arxiv.org/abs/2303.04137 | https://github.com/real-stanford/diffusion_policy |
| ACT | https://arxiv.org/abs/2304.13705 | https://github.com/tonyzhaozh/act |
### Affordance Map

#### Question & Answer from LLM

| Name | Paper | Code |
|---|---|---|
| CoPa | https://arxiv.org/abs/2403.08248 | |
| ManipLLM | https://arxiv.org/abs/2312.16217 | |
| ManipVQA | https://arxiv.org/abs/2403.11289 | https://github.com/SiyuanHuang95/ManipVQA |
### Language Corrections

| Name | Paper | Code |
|---|---|---|
| OLAF | https://arxiv.org/abs/2310.17555 | |
| YAY Robot | https://arxiv.org/abs/2403.12910 | https://github.com/yay-robot/yay_robot |
### Planning from LLM

| Name | Description | Paper | Code |
|---|---|---|---|
| SayCan | API Level | https://arxiv.org/abs/2204.01691 | https://github.com/google-research/google-research/tree/master/saycan |
| VILA | Prompt Level | https://arxiv.org/abs/2311.17842 | |