
Related work on Temporal Sentence Grounding in Videos / Natural Language Video Localization / Video Moment Retrieval

TSGV-Learning-List

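Overview

This list summarizes work on TSGV from 2017 to the present. The task goes by several names: Temporal Sentence Grounding in Videos (TSGV), Natural Language Video Localization (NLVL), and Video Moment Retrieval (VMR). Given a natural-language description, the goal is to localize the segment that the description refers to within a long, untrimmed video.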

Contents

  • Datasets
  • Related Work
    • Survey
    • Sliding Window-based Method
    • Proposal Generated Method
    • Anchor-based Method
      • Standard Anchor-based Method
      • 2D-Map Anchor-based Method
    • Regression-based Method
    • Span-based Method
    • Reinforcement Learning-based Method
    • Other Supervised Method
    • Weakly-supervised TSGV Method
      • Multi-Instance Learning Method
      • Reconstruction-based Method
      • Other Weakly-supervised Method
  • References

Datasets

| Dataset | Video Source | Domain |
|---|---|---|
| TACoS | Kitchen | Cooking |
| Charades-STA | Homes | Indoor Activity |
| ActivityNet Captions | YouTube | Open |
| DiDeMo | Flickr | Open |
| MAD | Movie | Open |

Related Work

Survey

Sliding Window-based Method

Sliding window-based (SW) methods slide multi-scale windows over the video to generate proposal candidates.
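As a rough illustration of multi-scale window generation, the sketch below enumerates candidate segments for a video of known duration. The function name `sliding_window_proposals`, the window sizes, and the 50% overlap are assumptions chosen for the example, not settings taken from any listed paper.

```python
def sliding_window_proposals(duration, window_sizes, overlap=0.5):
    """Return (start, end) candidates in seconds for every window size.

    `overlap` is the fraction shared by consecutive windows of one size
    (an illustrative default, not a value from a specific method).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    proposals = []
    for size in window_sizes:
        if size > duration:
            continue  # this window is longer than the whole video
        stride = size * (1.0 - overlap)
        start = 0.0
        while start + size <= duration + 1e-9:
            proposals.append((start, start + size))
            start += stride
    return proposals


# A 10-second video scanned with 4-second and 8-second windows.
candidates = sliding_window_proposals(duration=10.0, window_sizes=[4.0, 8.0])
for start, end in candidates:
    print(f"{start:.1f}s to {end:.1f}s")
```

Each candidate would then be scored against the query; the number of candidates grows with the number of scales and the amount of overlap, and every candidate has to be scored.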

Proposal Generated Method

Proposal generated (PG) methods alleviate the computation burden of SW-based methods by generating proposals conditioned on the query.

Anchor-based Method

Anchor-based methods incorporate proposal generation into answer prediction and maintain the proposals with various learning modules.
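One way to picture this is a set of preset anchor lengths attached to every time step, with the model emitting one confidence per anchor. The sketch below assumes anchors that end at the current clip index and a hypothetical score table `scores[t][k]`; real models produce these scores with learned modules, and the anchor layout varies between methods.

```python
def best_anchor(scores, anchor_lengths):
    """Pick the highest-scoring anchor.

    scores[t][k] is a hypothetical confidence for the k-th anchor that
    ends at clip index t (inclusive) and spans anchor_lengths[k] clips.
    Returns ((start_idx, end_idx), score).
    """
    best, best_score = None, float("-inf")
    for t, row in enumerate(scores):
        for k, length in enumerate(anchor_lengths):
            start = t - length + 1
            if start < 0:
                continue  # anchor would begin before the video starts
            if row[k] > best_score:
                best, best_score = (start, t), row[k]
    return best, best_score


# Four clips, three anchor lengths (1, 2, and 4 clips); scores are made up.
anchor_lengths = [1, 2, 4]
scores = [
    [0.1, 0.9, 0.9],  # the 2- and 4-clip anchors are invalid at t=0
    [0.2, 0.3, 0.1],
    [0.1, 0.6, 0.2],
    [0.3, 0.4, 0.8],
]
moment, confidence = best_anchor(scores, anchor_lengths)
print(moment, confidence)
```

Because the candidates are scored in the same pass that reads the video, no separate proposal stage is needed.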

Standard Anchor-based Method

2D-Map Anchor-based Method

Regression-based Method

Regression-based methods compute a time pair ($t_s$, $t_e$) and compare it with the ground truth ($\tau_s$, $\tau_e$) for model optimization.
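The comparison between the predicted and ground-truth pairs can be sketched as a boundary loss. The L1 distance normalized by video duration used below is only one illustrative choice; the list does not state which loss any given method uses, and the name `boundary_l1_loss` is hypothetical.

```python
def boundary_l1_loss(pred, target, duration):
    """Normalized L1 distance between predicted and ground-truth boundaries.

    pred     = (t_s, t_e)      predicted start/end in seconds
    target   = (tau_s, tau_e)  annotated start/end in seconds
    duration = video length in seconds, used only for normalization
    """
    t_s, t_e = pred
    tau_s, tau_e = target
    return (abs(t_s - tau_s) + abs(t_e - tau_e)) / duration


# Prediction is 2 s late at the start and 5 s late at the end of a 100 s video.
loss = boundary_l1_loss(pred=(12.0, 30.0), target=(10.0, 25.0), duration=100.0)
print(loss)
```

In training, a loss of this kind would be minimized by gradient descent on the model that outputs `pred`.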

Span-based Method

Span-based methods predict, for each video snippet/frame, the probability that it is the start or the end position of the target moment.
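Given per-snippet start and end probabilities, a moment still has to be decoded from them. A minimal sketch, assuming the decoded span maximizes the product of start and end probability subject to start ≤ end (a common decoding rule, though not necessarily the one every listed method uses):

```python
def best_span(start_probs, end_probs):
    """Return ((start_idx, end_idx), score) maximizing
    start_probs[i] * end_probs[j] subject to i <= j, in one linear pass."""
    best, best_score = None, -1.0
    best_start = 0  # index of the largest start probability seen so far
    for j in range(len(end_probs)):
        if start_probs[j] > start_probs[best_start]:
            best_start = j
        score = start_probs[best_start] * end_probs[j]
        if score > best_score:
            best, best_score = (best_start, j), score
    return best, best_score


# Made-up probabilities over four snippets.
start_probs = [0.1, 0.6, 0.2, 0.1]
end_probs = [0.05, 0.1, 0.25, 0.6]
span, score = best_span(start_probs, end_probs)
print(span, score)
```

The returned indices refer to snippets; multiplying them by the snippet duration gives timestamps.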

Reinforcement Learning-based Method

RL-based methods formulate TSGV as a sequential decision-making problem and use deep reinforcement learning techniques to solve it.
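To make the sequential decision view concrete, the sketch below models only the environment: a window that is adjusted one action at a time and rewarded when its temporal IoU with the target improves. The action set, the ±1 reward, the 1-second step, and the greedy rollout are all assumptions for illustration. In particular, the greedy rollout peeks at the ground truth and merely stands in for a learned policy, which would choose actions from video and query features instead.

```python
def temporal_iou(a, b):
    """Intersection over union of two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if inter > 0 else 0.0


# Hypothetical action set: (start offset, end offset) in units of `delta`.
ACTIONS = {
    "shift_left": (-1.0, -1.0),
    "shift_right": (1.0, 1.0),
    "expand": (-1.0, 1.0),
    "shrink": (1.0, -1.0),
}


def step(window, action, target, duration, delta=1.0):
    """Apply one action; return (new_window, reward)."""
    ds, de = ACTIONS[action]
    start = min(max(window[0] + ds * delta, 0.0), duration)
    end = min(max(window[1] + de * delta, 0.0), duration)
    if end <= start:
        return window, -1.0  # invalid move: keep the window, penalize
    new_window = (start, end)
    improved = temporal_iou(new_window, target) > temporal_iou(window, target)
    return new_window, (1.0 if improved else -1.0)


def greedy_rollout(window, target, duration, max_steps=20):
    """Oracle stand-in for a policy: take the action with the best IoU,
    and stop as soon as no action improves the current window."""
    for _ in range(max_steps):
        outcomes = [step(window, a, target, duration) for a in ACTIONS]
        next_window, reward = max(outcomes, key=lambda o: temporal_iou(o[0], target))
        if reward <= 0:
            break  # plays the role of a "stop" action
        window = next_window
    return window


final_window = greedy_rollout(window=(0.0, 4.0), target=(3.0, 6.0), duration=10.0)
print(final_window, temporal_iou(final_window, (3.0, 6.0)))
```

With this coarse action set the rollout can stop short of the exact target, which is one reason the step size and action design matter in such formulations.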

Other Supervised Method

Weakly-supervised TSGV Method

Under the weakly-supervised setting, TSGV methods need only video-query pairs for training, not annotations of start/end times.

Multi-Instance Learning Method

Reconstruction-based Method

Other Weakly-supervised Method

References

Survey of Zhang et al.