
How to set up the correct combination of shift_div and frame_count?

Open · qwangku opened this issue 2 years ago · 1 comment

Thanks for sharing this great resource. I am trying to play with different frame rates for TSM. I noticed there are 3 important attributes here: frame_count, num_segments and shift_div.

For example, if I reduce frame_count from 8 to 4 (which means the video is split into 4 segments this time, so the equivalent frame rate is reduced), should I also adjust "shift_div" and "num_segments"? Am I right to say "shift_div" should always be less than or equal to "frame_count"?

qwangku, Aug 07 '22 23:08

I don't know if you are still looking for an answer. I was also looking through the code to answer this question, and I ended up at this part of the code, which I believe answers it.

    @staticmethod
    def shift(x, n_segment, fold_div=3, inplace=False):
        nt, c, h, w = x.size()
        n_batch = nt // n_segment
        x = x.view(n_batch, n_segment, c, h, w)

        fold = c // fold_div
        if inplace:
            # Due to some out of order error when performing parallel computing. 
            # May need to write a CUDA kernel.
            raise NotImplementedError  
            # out = InplaceShift.apply(x, fold)
        else:
            out = torch.zeros_like(x)
            out[:, :-1, :fold] = x[:, 1:, :fold]  # shift left
            out[:, 1:, fold: 2 * fold] = x[:, :-1, fold: 2 * fold]  # shift right
            out[:, :, 2 * fold:] = x[:, :, 2 * fold:]  # not shift

        return out.view(nt, c, h, w)

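fold_div is equal to shift_div. The function moves c // fold_div channels by one frame in one temporal direction and another c // fold_div channels by one frame in the opposite direction, so if fold_div is set to 3, about 2 / 3 of the channels will be shifted. If set to 8, then 2 / 8. Note that fold_div divides the channel dimension c, not the number of frames, so in this function it does not depend on n_segment. I am studying this code as well, so please take this with a grain of salt 😄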
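To make the arithmetic concrete, here is a minimal sketch that copies the out-of-place branch of the function above into a standalone function and counts how many channels actually change. The helpers `shifted_fraction` and `count_changed_channels`, and the tensor sizes, are my own illustrative choices and not part of the repository.

```python
import torch


def shift(x, n_segment, fold_div=3):
    """Out-of-place temporal shift, copied from the snippet above."""
    nt, c, h, w = x.size()
    n_batch = nt // n_segment
    x = x.view(n_batch, n_segment, c, h, w)

    fold = c // fold_div  # number of channels moved in each direction
    out = torch.zeros_like(x)
    out[:, :-1, :fold] = x[:, 1:, :fold]                  # shift left
    out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]  # shift right
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]             # not shifted
    return out.view(nt, c, h, w)


def shifted_fraction(c, fold_div):
    """Fraction of the c channels that the shift touches (hypothetical helper)."""
    fold = c // fold_div
    return 2 * fold / c


def count_changed_channels(n_segment, c=64, fold_div=8, n_batch=2, h=2, w=2):
    """Count channels whose values differ after the shift (hypothetical helper)."""
    nt = n_batch * n_segment
    # Distinct non-zero values, so every shifted position is guaranteed to differ.
    x = torch.arange(1, nt * c * h * w + 1, dtype=torch.float32).view(nt, c, h, w)
    out = shift(x, n_segment=n_segment, fold_div=fold_div)
    changed = (out != x).flatten(2).any(dim=2).any(dim=0)  # one flag per channel
    return int(changed.sum())


if __name__ == "__main__":
    for n_segment in (8, 4):
        print(n_segment, count_changed_channels(n_segment), shifted_fraction(64, 8))
```

With 64 channels and fold_div=8, 16 channels change for both n_segment=8 and n_segment=4, which is consistent with the reading that the shifted fraction is set by fold_div and the channel count only. Whether accuracy holds up with fewer segments is a separate question that this sketch does not address.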

yjang43, Sep 04 '22 17:09