Videos-Publications-Collection
This is a collection of publications about videos.
Videos Publications Collection
Humans are born to see and to adapt to this visual world. After visual signals stimulate our neurons, we learn concepts. We associate one thing with another (seeing a waterfall, we think of the galaxy), we imagine, and finally we create, updating this visual world. Some of us are trying to give this ability to intelligent agents, leading an unprecedented scientific trend.
This is a collection of video publications I have recently read, including Action Recognition, Video Generation, Video Self-supervised Learning, and some classic papers.
This repo will be updated throughout my research.
Video Generation
DVDGAN
SV2P
SAVP
SVG-LP
Vid2Vid
TGAN
Generating Videos with Scene Dynamics
Generating the Future with Adversarial Transformers
Video Disentanglement
RecycleGAN
Deep Visual Analogy-Making
Unsupervised Learning of Disentangled Representations from Video
Future Prediction
Hierarchical Long-term Video Prediction without Supervision
Compositional Video Prediction
An Uncertain Future: Forecasting from Static Images using Variational Autoencoders
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
Memory In Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity from Spatiotemporal Dynamics
Video Self-supervised Learning
Learning Correspondence from the Cycle-consistency of Time
Learning and Using the Arrow of Time
Self-supervised Learning for Video Correspondence Flow
Temporal Cycle-Consistency Learning
Tracking Emerges by Colorizing Videos
Video Representation Learning by Dense Predictive Coding
Shuffle and Learn
Odd-One-Out
Action Recognition & Representation Learning
Two-Stream Fusion Network
Delving Deeper into Convolutional Networks for Learning Video Representations
Architecture
Spatio-temporal Video Autoencoder with Differentiable Memory
Temporal Consistency
Blind Video Temporal Consistency via Deep Video Prior
Blind Video Temporal Consistency
Learning Blind Video Temporal Consistency
Occlusion-aware Video Temporal Consistency
Video Inpainting
Copy-and-Paste
Deep Video Inpainting
Deep Flow-Guided Video Inpainting
Onion-Peel Network
Free-Form Video Inpainting with 3D Gated Convolution and Temporal PatchGAN
Learnable Gated Temporal Shift Module for Video Inpainting
Video Inpainting by Jointly Learning Temporal Structure and Spatial Details
Deep Blind Video Decaptioning by Temporal Aggregation and Recurrence
Learning Joint Spatial-Temporal Transformations for Video Inpainting
Spatio-Temporal Reasoning
Temporal Relational Reasoning in Videos
Videos as Space-Time Region Graphs
Structural-RNN: Deep Learning on Spatio-Temporal Graphs
Relational Action Forecasting
Learning Human-Object Interactions by Graph Parsing Neural Networks
Optical Flow
FlowNet
SfM
Unsupervised Learning of Depth and Ego-Motion from Video
Video Interpolation
All at Once: Temporally Adaptive Multi-Frame Interpolation with Advanced Motion Modeling
Deep Slow Motion Video Reconstruction with Hybrid Imaging System
Depth-Aware Video Frame Interpolation
Temporal Coherence
Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video
Learning Blind Video Temporal Consistency
Multi-modalities
Learning to Learn Words from Visual Scenes
Self-supervised Moving Vehicle Tracking with Stereo Sound
Music Gesture for Visual Sound Separation
Self-supervised Audio-visual Co-segmentation
Labelling Unlabelled Videos from Scratch With Multi-modal Self-supervision
Listen to Look: Action Recognition by Previewing Audio
Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization
Self-Supervised Learning by Cross-Modal Audio-Video Clustering
Sound2Sight: Generating Visual Dynamics from Sound and Context
Multimodal Speech Separation
Looking to Listen at the Cocktail Party
Blind Audio-Visual Source Separation based on Sparse Redundant Representations
Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks
Video Object Segmentation
Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks
Visual Dialog & Visual Question Answering
Reasoning Visual Dialogs with Structural and Partial Observations
Key-Point & Skeleton
Convolutional Sequence Generation for Skeleton-Based Action Synthesis
Unsupervised Keypoint Learning for Guiding Class-Conditional Video Prediction
Classic
Video Textures
Others
What Makes a Video a Video