CLIP4Clip What do Pair, L, T stand for in the code?

Open nhw649 opened this issue 2 years ago • 3 comments

Hi, I'm a beginner and would like to ask a question. What do Pair, L, T stand for in the code? What do they mean?

# Pair x L x T x 3 x H x W
video = np.zeros((len(s), self.max_frames, 1,
                  3, self.rawVideoExtractor.size, self.rawVideoExtractor.size), dtype=np.float)

Jun 21 '23 16:06 nhw649

As far as I understand:

Pair: number of samples (videos)
L: length of video in frames
T: time axis? It's always 1, rendering it practically unnecessary...
3: RGB channels
H: height
W: width
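
A minimal sketch of the axis layout listed above, in case it helps to see the shape concretely. The values of `num_pairs`, `max_frames`, and `size` are hypothetical stand-ins for `len(s)`, `self.max_frames`, and `self.rawVideoExtractor.size` from the snippet in the question; they are not taken from the repository's configuration.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
num_pairs = 2     # "Pair": number of samples (stands in for len(s))
max_frames = 12   # "L": frame slots per video (stands in for self.max_frames)
size = 224        # "H" and "W" (stands in for self.rawVideoExtractor.size)

# Pair x L x T x 3 x H x W, with the T axis fixed to length 1.
# np.float64 is used here because the np.float alias was removed in NumPy 1.24.
video = np.zeros((num_pairs, max_frames, 1, 3, size, size), dtype=np.float64)

# Map each axis name from the comment to its length in the array.
axis_names = ["Pair", "L", "T", "3 (RGB)", "H", "W"]
shape_by_axis = dict(zip(axis_names, video.shape))
for name, length in shape_by_axis.items():
    print(f"{name}: {length}")

# Because T has length 1, dropping that axis loses no data.
squeezed = video.squeeze(axis=2)  # Pair x L x 3 x H x W
```

Since the T axis has length 1 in this snippet, squeezing it changes only the number of dimensions, not the contents.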

Happy to hear thoughts of other members who have been using this repo!

Aug 16 '23 15:08 isabella-karabasz

T is a scalar index shared by the frames extracted from the same one-second interval. With an fps of 30, each group of 30 consecutive frames shares the same T value. I guess this plays a role in the frame extraction strategy.
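
The arithmetic of that description can be sketched as follows. This only illustrates the "one index per second" grouping as stated above; `second_index` is a hypothetical helper and is not a function from the CLIP4Clip repository.

```python
fps = 30  # frames per second, as in the example above


def second_index(frame_idx: int, fps: int) -> int:
    """Return the zero-based one-second interval that a frame falls in."""
    return frame_idx // fps


# Frames 0..29 fall in interval 0, frames 30..59 in interval 1, and so on.
frames = [0, 29, 30, 59, 60]
intervals = [second_index(i, fps) for i in frames]
print(intervals)  # [0, 0, 1, 1, 2]
```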

Nov 22 '23 14:11 zsnoob

thank you very much!

Nov 22 '23 14:11 nhw649
