HumanPoseMemo
Memo about 3D human pose estimation: a record of datasets, papers, and code.
Datasets
2D datasets
related works

3D datasets
SMPL datasets
Dressed datasets
Face, Hands and Feet
related works
Papers
Note: some papers without released code are not included.
- Before 2020
- CVPR 2020
- ECCV 2020
Before 2020
Monocular human pose estimation
2019
- Learning 3D Human Shape and Pose from Dense Body Parts
- CVPR, 19. Learning 3D Human Dynamics from Video
- ICCV, 19. TexturePose: Supervising Human Mesh Estimation with Texture Consistency
- ICCV, 19. SPIN - SMPL oPtimization IN the loop
- ICCV, 19. Delving Deep Into Hybrid Annotations for 3D Human Recovery in the Wild
- ICCV, 19. Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image
- CVPR, 19. Exploiting temporal context for 3D human pose estimation in the wild
- CVPR, 19. Learning Joint Reconstruction of Hands and Manipulated Objects - Demo, Training Code and Models
- ICCV, 19. MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation
- SIGGRAPH Asia, 18. Motion Reconstruction Code and Data for Skills from Videos (SFV)
- CVPR, 19. Monocular Total Capture: Posing Face, Body and Hands in the Wild
- CVPR, 19. Detailed Human Shape Estimation from a Single Image by Hierarchical Mesh Deformation
- CVPR, 19. Convolutional Mesh Regression for Single-Image Human Shape Reconstruction
- CVPR, 19. Self-Supervised Learning of 3D Human Pose using Multi-view Geometry
- CVPR, 19. 3D human pose estimation in video with temporal convolutions and semi-supervised training
2018
Multi-view human pose estimation
2019
2018
Detailed human shape reconstruction
2019
- ICCV 19, PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization:[code]
- Learning Nonparametric Human Mesh Reconstruction from a Single Image without Ground Truth Meshes:image => 2D pose + part seg ==Graph-CNN==> mesh
- PeelNet: Textured 3D reconstruction of human body using single view RGB image
- CVPR, 19. Dense Intrinsic Appearance Flow for Human Pose Transfer
- ICCV, 19. Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer and Novel View Synthesis
- CVPR, 19. Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision
- ICCV, 19. Multi-Garment Net: Learning to Dress 3D People from Images
2018
Multi-View Stereo
Other
2019
2018
CVPR2020
keywords: human, motion, tracking, person, pose
2D human pose
- Combining Detection and Tracking for Human Pose Estimation in Videos
- MetaFuse: A Pre-trained Fusion Model for Human Pose Estimation
- HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation
- The Devil Is in the Details: Delving Into Unbiased Data Processing for Human Pose Estimation
- Distribution-Aware Coordinate Representation for Human Pose Estimation
- CVPR 20, Hierarchical Human Parsing with Typed Part-Relation Reasoning:[code]
monocular 3D pose
- VIBE: Video Inference for Human Body Pose and Shape Estimation
- 3D Human Mesh Regression with Dense Correspondence [code]
- Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation:[code]
- Deep Kinematics Analysis for Monocular 3D Human Pose Estimation
- Attention Mechanism Exploits Temporal Contexts: Real-Time 3D Human Pose Reconstruction[oral, code]
- Weakly-Supervised 3D Human Pose Learning via Multi-View Images in the Wild
- Coherent Reconstruction of Multiple Humans From a Single Image
- Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis[oral, project]
- Cascaded Deep Monocular 3D Human Pose Estimation With Evolutionary Training Data[oral]
- GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models[oral]
- Generating 3D People in Scenes Without People[oral]
- Bodies at Rest: 3D Human Pose and Shape Estimation From a Pressure Image Using Synthetic Data[oral]
- Multiview-Consistent Semi-Supervised Learning for 3D Human Pose Estimation
- Optical Non-Line-of-Sight Physics-Based 3D Human Pose Estimation
- UniPose: Unified Human Pose Estimation in Single Images and Videos
- Three-Dimensional Reconstruction of Human Interactions
- Sequential 3D Human Pose and Shape Estimation From Point Clouds
- Object-Occluded Human Shape and Pose Estimation From a Single Color Image[oral]
- PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation
- Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data:[code]
multi view
- ActiveMoCap: Optimized Viewpoint Selection for Active Human Motion Capture
- Multi-View Neural Human Rendering
- Fusing Wearable IMUs With Multi-View Images for Human Pose Estimation: A Geometric Approach
- Cross-View Tracking for Multi-Human 3D Pose Estimation at Over 100 FPS
- 4D Association Graph for Realtime Multi-Person Motion Capture Using Multiple Video Cameras
- Deep 3D Capture: Geometry and Reflectance From Sparse Multi-View Images
- Lightweight Multi-View 3D Pose Estimation Through Camera-Disentangled Representation
depth, detailed, cloth
- PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization:[code]
- Self-Supervised Human Depth Estimation From Monocular Videos
- ARCH: Animatable Reconstruction of Clothed Humans
- DeepCap: Monocular Human Performance Capture Using Weak Supervision
- TetraTSDF: 3D Human Reconstruction From a Single Image With a Tetrahedral Outer Shell
- Learning to Transfer Texture From Clothing Images to 3D Humans
- TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style[oral]
- Novel View Synthesis of Dynamic Scenes With Globally Coherent Depths From a Monocular Camera
- 4D Visualization of Dynamic Events From Unconstrained Multi-View Videos
- Multi-View Neural Human Rendering
HH, HO interactions
- Discovering Human Interactions With Novel Objects via Zero-Shot Learning
- Mixture Dense Regression for Object Detection and Human Pose Estimation
- VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions
- PPDM: Parallel Point Detection and Matching for Real-Time Human-Object Interaction Detection
- Learning Human-Object Interaction Detection Using Interaction Points
- Cascaded Human-Object Interaction Recognition
- GanHand: Predicting Human Grasp Affordances in Multi-Object Scenes
- Detailed 2D-3D Joint Representation for Human-Object Interaction
action, tracking, trajectory, prediction
- Dynamic Multiscale Graph Neural Networks for 3D Skeleton Based Human Motion Prediction
- Active Vision for Early Recognition of Human Actions
- Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition
- Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction [code]
- Reciprocal Learning Networks for Human Trajectory Prediction
- PaStaNet: Toward Human Activity Knowledge Engine
- A Stochastic Conditioning Scheme for Diverse Human Motion Prediction
- Bayesian Adversarial Human Motion Synthesis[oral]
- Learning Dynamic Relationships for 3D Human Motion Prediction
- Context-Aware Human Motion Prediction
- Learning a Neural Solver for Multiple Object Tracking[oral]
- Skeleton-Based Action Recognition With Shift Graph Convolutional Network
face, hand
- Understanding Human Hands in Contact at Internet Scale
- AvatarMe: Realistically Renderable 3D Facial Reconstruction “In-the-Wild”
- Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild
- Deep Facial Non-Rigid Multi-View Stereo
- Can Facial Pose and Expression Be Separated With Weak Perspective Camera?
dataset
- HUMBI: A Large Multiview Dataset of Human Body Expressions
- PANDA: A Gigapixel-Level Human-Centric Video Dataset
- HOnnotate: A Method for 3D Annotation of Hand and Object Poses
some interesting works
- End-to-End Camera Calibration for Broadcast Videos
- Transferring Dense Pose to Proximal Animal Classes
- Dynamic Graph Message Passing Networks
- Self-Learning Video Rain Streak Removal: When Cyclic Consistency Meets Temporal Correspondence
- Learning to Optimize Non-Rigid Tracking
- SuperGlue: Learning Feature Matching With Graph Neural Networks
- Spatial-Temporal Graph Convolutional Network for Video-Based Person Re-Identification
- Minimal Solutions to Relative Pose Estimation From Two Views Sharing a Common Direction With Unknown Focal Length
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis: [code],[code-PyTorch]
- Learning Character-Agnostic Motion for Motion Retargeting in 2D: decomposes and recomposes the video; could be used for motion retrieval.
ECCV2020
2D human pose
- Peeking into occluded joints: A novel framework for crowd pose estimation [code]
- Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation
- Whole-Body Human Pose Estimation in the Wild
- Self-supervised Keypoint Correspondences for Multi-Person Pose Estimation and Tracking in Videos
- SimPose: Effectively Learning DensePose and Surface Normals of People from Simulated Data
3D human pose
- Contact and Human Dynamics from Monocular Video
- HDNet: Human Depth Estimation for Multi-Person Camera-Space Localization
- HMOR: Hierarchical Multi-person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation
- 3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning
- I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image [code]
- Full-Body Awareness from Partial Observations
- Towards Part-aware Monocular 3D Human Pose Estimation: An Architecture Search Approach
multi-person 3d
multi-view
- Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry
- End-to-End Estimation of Multi-Person 3D Poses from Multiple Cameras[code]
- Unsupervised Cross-Modal Alignment for Multi-Person 3D Pose Estimation [project]
action
- Decoupling GCN with DropGraph Module for Skeleton-Based Action Recognition
- Hidden Footprints: Learning Contextual Walkability from 3D Human Trails
- MotionSqueeze: Neural Motion Feature Learning for Video Understanding
- Structure-Aware Human-Action Generation
face, hand, detailed human
- Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency
- Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction
HO
Other
- Human Interaction Learning on 3D Skeleton Point Clouds for Video Violence Recognition
- Adaptive Computationally Efficient Network for Monocular 3D Hand Pose Estimation
- Long-term Human Motion Prediction with Scene Context
- Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video
- Appearance Consensus Driven Self-Supervised Human Mesh Recovery
- End-to-end Dynamic Matching Network for Multi-view Multi-person 3d Pose Estimation
- Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers
- Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion
- Aligning Videos in Space and Time
- Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking
- DeepSFM: Structure From Motion Via Deep Bundle Adjustment
- A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem
- Multi-View Optimization of Local Feature Geometry
- DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares
Submitted to CVPR21
- Human Mesh Recovery from Multiple Shots
- NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras
- Reconstructing Hand-Object Interactions in the Wild
CVPR2021
- oral, Reconstructing 3D Human Pose by Watching Humans in the Mirror | Project Page
- Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans | Project Page
2D Pose
- Deep Dual Consecutive Network for Human Pose Estimation
- Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
3D Pose
- oral, Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization
- Monocular Real-time Full Body Capture with Inter-part Correlations
- End-to-End Human Pose and Mesh Reconstruction with Transformers
- Probabilistic 3D Human Shape and Pose Estimation from Multiple Unconstrained Images in the Wild
- Graph Stacked Hourglass Networks for 3D Human Pose Estimation
- Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction
- Semi-supervised Synthesis of High-Resolution Editable Textures for 3D Humans
- oral, SimPoE: Simulated Character Control for 3D Human Pose Estimation
- PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation
Multi-person
- Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks | Code
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo
- Body Meshes as Points
- AGORA: Avatars in Geography Optimized for Regression Analysis
Reconstruction
- SMPLicit: Topology-aware Generative Model for Clothed People
- oral, POSEFusion: Pose-guided Selective Fusion for Single-view Human Volumetric Capture
- oral, SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks
- oral, Pixel Codec Avatars
- SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements
- Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration
- StylePeople: A Generative Model of Fullbody Human Avatars
- Temporal Consistency Loss for High Resolution Textured and Clothed 3D Human Reconstruction from Monocular Video
- Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors
- LASR: Learning Articulated Shape Reconstruction from a Monocular Video | Code
Human-object
Action
- We are More than Our Joints: Predicting how 3D Bodies Move
- Motion Representations for Articulated Animation | Code
- 3D Human Action Representation Learning via Cross-View Consistency Pursuit
other
- NeRD: Neural 3D Reflection Symmetry Detector
- Monocular Real-time Full Body Capture with Inter-part Correlations
- CVPR21, oral, Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos: self-supervised learning from TikTok videos to estimate high-fidelity depth of dressed humans from a single-view image.
- CVPR21, Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors: recovers a person's position and pose in the scene using IMUs and a head-mounted camera.
Resources
Other
- Mixamo
- fairmotion: Tools to load, process and visualize motion capture data
- Deep-motion-editing: contains code of visualization in blender
Contribute
You can contribute to this repo by forking it and opening a pull request.
See also: Awesome Human Pose Estimation and awesome-3d-human.