CLIP4Clip

What do Pair, L, T stand for in the code?

Open nhw649 opened this issue 2 years ago • 3 comments

Hi, I'm a beginner and would like to ask a question. What do Pair, L, T stand for in the code? What do they mean?

# Pair x L x T x 3 x H x W
video = np.zeros((len(s), self.max_frames, 1,
                  3, self.rawVideoExtractor.size, self.rawVideoExtractor.size), dtype=np.float64)

nhw649, Jun 21 '23 16:06

As far as I understand:

  • Pair: number of samples (videos)
  • L: length of video in frames (self.max_frames in the snippet above)
  • T: time axis? It's always 1, rendering it practically unnecessary...
  • 3: RGB channels
  • H: height
  • W: width
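
To make the layout concrete, here is a minimal sketch that allocates an array with the same six axes and checks its shape. The names num_pairs, max_frames and frame_size are hypothetical stand-ins for len(s), self.max_frames and self.rawVideoExtractor.size, and the values are made up for illustration. It uses np.float64 because the np.float alias was removed in NumPy 1.24.

```python
import numpy as np

# Hypothetical stand-ins for len(s), self.max_frames and
# self.rawVideoExtractor.size; the values are illustrative only.
num_pairs = 2
max_frames = 12
frame_size = 224

# Pair x L x T x 3 x H x W, with T fixed to 1 as in the snippet above.
video = np.zeros(
    (num_pairs, max_frames, 1, 3, frame_size, frame_size),
    dtype=np.float64,
)
print(video.shape)   # (2, 12, 1, 3, 224, 224)

# Because the T axis has size 1, dropping it loses no data.
frames = video.squeeze(axis=2)
print(frames.shape)  # (2, 12, 3, 224, 224)
```

Whether the rest of the repo relies on the size-1 axis being present is not something this sketch checks, so squeezing it is only safe in your own downstream code.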

Happy to hear thoughts of other members who have been using this repo!

isabella-karabasz, Aug 16 '23 15:08

T is a scalar representing a set of frames extracted from the same one-second interval. With an fps of 30, each group of 30 frames will share the same T value. I guess this plays a role in the frame extraction strategy.
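
The arithmetic behind that reading can be sketched as follows. This only illustrates the interpretation given in this comment, not confirmed behavior of the repo; second_index is a hypothetical helper and 0-based frame indices are an assumption.

```python
# Assumption: T identifies the one-second interval a frame falls in.
fps = 30  # frame rate used in the comment above


def second_index(frame_index: int, fps: int) -> int:
    """Return the one-second interval that a 0-based frame index falls in."""
    return frame_index // fps


# Three seconds of video at 30 fps: frames 0-29, 30-59 and 60-89.
groups = [second_index(i, fps) for i in range(3 * fps)]
print(groups[0], groups[29], groups[30], groups[89])  # 0 0 1 2
```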

zsnoob, Nov 22 '23 14:11

Thank you very much!

nhw649, Nov 22 '23 14:11