
Question about TemporalRandomCrop

Open SimoLoca opened this issue 1 year ago • 6 comments

Hi, I have a question about the implementation of the "TemporalRandomCrop" transform defined in the pipelines module. I'm trying TALLFormer on another dataset, so the issue may be related to that, but I noticed that sometimes when TemporalRandomCrop is performed it finds no useful overlap between segments and thus produces an empty segment that is passed to the model. Also, decreasing iof_th (e.g. to 0.25) reduces the rate of empty segments, but I think lowering this threshold too much would not be beneficial (please correct me if I'm wrong). Is this behaviour correct, or would it be better if a useful segment were always found?
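To make the failure mode concrete, here is a minimal sketch (not the repo's actual code; the helper names `iof` and `kept_segments` are made up) of how an IoF filter can leave a random crop with no surviving ground-truth segments:

```python
# Sketch of intersection-over-foreground (IoF) filtering of gt segments
# against a temporal crop window. When no segment reaches iof_th inside
# the window, the crop ends up with an empty gt list.

def iof(segment, window):
    """Intersection of `segment` with `window`, normalized by the
    segment's own length, for 1-D (start, end) intervals."""
    inter = max(0.0, min(segment[1], window[1]) - max(segment[0], window[0]))
    length = segment[1] - segment[0]
    return inter / length if length > 0 else 0.0

def kept_segments(gt_segments, window, iof_th=0.75):
    """Return the gt segments whose IoF with the crop window >= iof_th."""
    return [s for s in gt_segments if iof(s, window) >= iof_th]

gt = [(10, 50), (200, 260)]
print(kept_segments(gt, (0, 64), iof_th=0.75))    # [(10, 50)] -> segment fully inside
print(kept_segments(gt, (40, 104), iof_th=0.75))  # [] -> crop has no usable gt
print(kept_segments(gt, (40, 104), iof_th=0.25))  # [(10, 50)] -> lower threshold keeps it
```

The third call shows why lowering iof_th reduces the rate of empty crops: a segment only 25% covered by the window now passes, at the cost of training on heavily truncated actions.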

Thank you.

SimoLoca avatar May 03 '23 10:05 SimoLoca

@klauscc

SimoLoca avatar May 25 '23 14:05 SimoLoca

Hi @SimoLoca , sorry for the late response. I took "TemporalRandomCrop" directly from DaoTAD, but I had a look at its implementation. One solution could be to add a check to ensure that at least some segments are non-empty, as commented out at L413:

if count < 20 and np.count_nonzero(mask) == 0:
    # no gt segment survived this crop: resample the crop window
    continue
elif count >= 20:
    # need to handle this: either increase the count limit or directly
    # sample a start near a gt segment.
    print(results["video_info"], results["ann_info"], results["gt_segments"])
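Put together, the check above could sit inside a resampling loop; here is a minimal self-contained sketch (the function names, the IoF filter, and the fallback policy are assumptions, not the repo's API):

```python
import random

def crop_gt_segments(gt_segments, start, crop_len, iof_th):
    """Gt segments whose IoF with the window [start, start+crop_len] >= iof_th."""
    end = start + crop_len
    kept = []
    for s, e in gt_segments:
        inter = max(0.0, min(e, end) - max(s, start))
        if (e - s) > 0 and inter / (e - s) >= iof_th:
            kept.append((s, e))
    return kept

def sample_nonempty_crop(video_len, crop_len, gt_segments,
                         iof_th=0.75, max_tries=20):
    """Resample a random crop start until at least one gt segment survives
    the IoF filter; after `max_tries` attempts, fall back to starting the
    crop at a randomly chosen gt segment."""
    for _ in range(max_tries):
        start = random.randint(0, max(0, video_len - crop_len))
        if crop_gt_segments(gt_segments, start, crop_len, iof_th):
            return start
    # Fallback: anchor the crop on a random gt segment. Note this still
    # cannot guarantee a non-empty crop when a segment is longer than
    # crop_len -- the case the comment above says needs handling.
    s, _ = random.choice(gt_segments)
    return max(0, min(int(s), video_len - crop_len))
```

For example, `sample_nonempty_crop(100, 50, [(10, 40)])` returns a start whose crop keeps the segment, while `sample_nonempty_crop(100, 50, [(0, 100)])` always falls through to the fallback because no 50-frame window can cover 75% of a 100-frame segment.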

klauscc avatar May 30 '23 06:05 klauscc

Hi @klauscc , thanks for the suggestion! For the moment I tried using a loop that resamples until a valid segment is found, and it seems to work. I have three other questions:

  1. How can I run the code without distributed computation, i.e. on a single GPU? I set launcher to none (here) and ran training with the command tools/dist_trainval.sh $config "0" --workdir $workdir, but I get this error: RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

  2. ~~Without the changes described in point 1, but with num_frames set to 64 or 128 (in the config file), when I run the command tools/dist_trainval.sh $config "0" --workdir $workdir I get this error: ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1]). From the error log it seems to be related to the fpn.py file at L181 and L182. Instead, if I choose 256 as num_frames the training proceeds. How can I solve this?~~ Solved by using a batch size > 1.

  3. I would like to try an anchor-free model on my dataset. Besides adding the code for the model itself, what else do I need to change?

Thank you so much!

SimoLoca avatar May 30 '23 15:05 SimoLoca

Hi @SimoLoca , sorry for the late reply. Hope you have already solved these issues!

  1. I think tools/dist_trainval.sh $config "0" --workdir $workdir should work. You don't need to change the code. It will still run in distributed mode but only on 1 GPU.
  2. It's weird but it's great that you solved it.
  3. Yes, you also need to change the dataloader that loads the features from the memory bank.

klauscc avatar Jun 27 '23 18:06 klauscc

Hi, thanks for the reply. Regarding question 3, can you please explain why I would need to change that part of the code?

SimoLoca avatar Jun 27 '23 21:06 SimoLoca

Hi brother, I have a similar problem to yours; do you know how to solve it?

2024-06-24 19:25:01,289 - vedatad - INFO - Loading weights from /media/TALLFormer-main/thumos14/pretrained_models/vswin/swin_base_patch244_window877_kinetics400_22k_keysfrom_backbone.pth
2024-06-24 19:25:04,457 - vedatad - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: cls_head.fc_cls.weight, cls_head.fc_cls.bias

missing keys in source state_dict: backbone.layers.0.blocks.0.dummy_tensor, backbone.layers.0.blocks.1.dummy_tensor, backbone.layers.1.blocks.0.dummy_tensor, backbone.layers.1.blocks.1.dummy_tensor, backbone.layers.2.blocks.0.dummy_tensor, backbone.layers.2.blocks.1.dummy_tensor, backbone.layers.2.blocks.2.dummy_tensor, backbone.layers.2.blocks.3.dummy_tensor, backbone.layers.2.blocks.4.dummy_tensor, backbone.layers.2.blocks.5.dummy_tensor, backbone.layers.2.blocks.6.dummy_tensor, backbone.layers.2.blocks.7.dummy_tensor, backbone.layers.2.blocks.8.dummy_tensor, backbone.layers.2.blocks.9.dummy_tensor, backbone.layers.2.blocks.10.dummy_tensor, backbone.layers.2.blocks.11.dummy_tensor, backbone.layers.2.blocks.12.dummy_tensor, backbone.layers.2.blocks.13.dummy_tensor, backbone.layers.2.blocks.14.dummy_tensor, backbone.layers.2.blocks.15.dummy_tensor, backbone.layers.2.blocks.16.dummy_tensor, backbone.layers.2.blocks.17.dummy_tensor, backbone.layers.3.blocks.0.dummy_tensor, backbone.layers.3.blocks.1.dummy_tensor, neck.0.conv1.weight, neck.0.conv1.bias, neck.0.conv2.weight, neck.0.conv2.bias, neck.1.encoder.layers.0.norm1.weight, neck.1.encoder.layers.0.norm1.bias, neck.1.encoder.layers.0.attn.relative_position_bias_table, neck.1.encoder.layers.0.attn.relative_position_index, neck.1.encoder.layers.0.attn.qkv.weight, neck.1.encoder.layers.0.attn.qkv.bias, neck.1.encoder.layers.0.attn.proj.weight, neck.1.encoder.layers.0.attn.proj.bias, neck.1.encoder.layers.0.norm2.weight, neck.1.encoder.layers.0.norm2.bias, neck.1.encoder.layers.0.mlp.fc1.weight, neck.1.encoder.layers.0.mlp.fc1.bias, neck.1.encoder.layers.0.mlp.fc2.weight, neck.1.encoder.layers.0.mlp.fc2.bias, neck.1.encoder.layers.1.norm1.weight, neck.1.encoder.layers.1.norm1.bias, neck.1.encoder.layers.1.attn.relative_position_bias_table, neck.1.encoder.layers.1.attn.relative_position_index, neck.1.encoder.layers.1.attn.qkv.weight, neck.1.encoder.layers.1.attn.qkv.bias, neck.1.encoder.layers.1.attn.proj.weight, 
neck.1.encoder.layers.1.attn.proj.bias, neck.1.encoder.layers.1.norm2.weight, neck.1.encoder.layers.1.norm2.bias, neck.1.encoder.layers.1.mlp.fc1.weight, neck.1.encoder.layers.1.mlp.fc1.bias, neck.1.encoder.layers.1.mlp.fc2.weight, neck.1.encoder.layers.1.mlp.fc2.bias, neck.1.encoder.layers.2.norm1.weight, neck.1.encoder.layers.2.norm1.bias, neck.1.encoder.layers.2.attn.relative_position_bias_table, neck.1.encoder.layers.2.attn.relative_position_index, neck.1.encoder.layers.2.attn.qkv.weight, neck.1.encoder.layers.2.attn.qkv.bias, neck.1.encoder.layers.2.attn.proj.weight, neck.1.encoder.layers.2.attn.proj.bias, neck.1.encoder.layers.2.norm2.weight, neck.1.encoder.layers.2.norm2.bias, neck.1.encoder.layers.2.mlp.fc1.weight, neck.1.encoder.layers.2.mlp.fc1.bias, neck.1.encoder.layers.2.mlp.fc2.weight, neck.1.encoder.layers.2.mlp.fc2.bias, neck.2.pe.pe, neck.2.reductions.0.0.weight, neck.2.reductions.0.1.weight, neck.2.reductions.0.1.bias, neck.2.reductions.1.0.weight, neck.2.reductions.1.1.weight, neck.2.reductions.1.1.bias, neck.2.reductions.2.0.weight, neck.2.reductions.2.1.weight, neck.2.reductions.2.1.bias, neck.2.reductions.3.0.weight, neck.2.reductions.3.1.weight, neck.2.reductions.3.1.bias, neck.2.trans_layers.0.self_attn.in_proj_weight, neck.2.trans_layers.0.self_attn.in_proj_bias, neck.2.trans_layers.0.self_attn.out_proj.weight, neck.2.trans_layers.0.self_attn.out_proj.bias, neck.2.trans_layers.0.linear1.weight, neck.2.trans_layers.0.linear1.bias, neck.2.trans_layers.0.linear2.weight, neck.2.trans_layers.0.linear2.bias, neck.2.trans_layers.0.norm1.weight, neck.2.trans_layers.0.norm1.bias, neck.2.trans_layers.0.norm2.weight, neck.2.trans_layers.0.norm2.bias, neck.2.trans_layers.1.self_attn.in_proj_weight, neck.2.trans_layers.1.self_attn.in_proj_bias, neck.2.trans_layers.1.self_attn.out_proj.weight, neck.2.trans_layers.1.self_attn.out_proj.bias, neck.2.trans_layers.1.linear1.weight, neck.2.trans_layers.1.linear1.bias, neck.2.trans_layers.1.linear2.weight, 
neck.2.trans_layers.1.linear2.bias, neck.2.trans_layers.1.norm1.weight, neck.2.trans_layers.1.norm1.bias, neck.2.trans_layers.1.norm2.weight, neck.2.trans_layers.1.norm2.bias, neck.2.trans_layers.2.self_attn.in_proj_weight, neck.2.trans_layers.2.self_attn.in_proj_bias, neck.2.trans_layers.2.self_attn.out_proj.weight, neck.2.trans_layers.2.self_attn.out_proj.bias, neck.2.trans_layers.2.linear1.weight, neck.2.trans_layers.2.linear1.bias, neck.2.trans_layers.2.linear2.weight, neck.2.trans_layers.2.linear2.bias, neck.2.trans_layers.2.norm1.weight, neck.2.trans_layers.2.norm1.bias, neck.2.trans_layers.2.norm2.weight, neck.2.trans_layers.2.norm2.bias, neck.2.trans_layers.3.self_attn.in_proj_weight, neck.2.trans_layers.3.self_attn.in_proj_bias, neck.2.trans_layers.3.self_attn.out_proj.weight, neck.2.trans_layers.3.self_attn.out_proj.bias, neck.2.trans_layers.3.linear1.weight, neck.2.trans_layers.3.linear1.bias, neck.2.trans_layers.3.linear2.weight, neck.2.trans_layers.3.linear2.bias, neck.2.trans_layers.3.norm1.weight, neck.2.trans_layers.3.norm1.bias, neck.2.trans_layers.3.norm2.weight, neck.2.trans_layers.3.norm2.bias, neck.3.lateral_convs.0.conv.weight, neck.3.lateral_convs.0.bn.weight, neck.3.lateral_convs.0.bn.bias, neck.3.lateral_convs.0.bn.running_mean, neck.3.lateral_convs.0.bn.running_var, neck.3.lateral_convs.1.conv.weight, neck.3.lateral_convs.1.bn.weight, neck.3.lateral_convs.1.bn.bias, neck.3.lateral_convs.1.bn.running_mean, neck.3.lateral_convs.1.bn.running_var, neck.3.lateral_convs.2.conv.weight, neck.3.lateral_convs.2.bn.weight, neck.3.lateral_convs.2.bn.bias, neck.3.lateral_convs.2.bn.running_mean, neck.3.lateral_convs.2.bn.running_var, neck.3.lateral_convs.3.conv.weight, neck.3.lateral_convs.3.bn.weight, neck.3.lateral_convs.3.bn.bias, neck.3.lateral_convs.3.bn.running_mean, neck.3.lateral_convs.3.bn.running_var, neck.3.lateral_convs.4.conv.weight, neck.3.lateral_convs.4.bn.weight, neck.3.lateral_convs.4.bn.bias, neck.3.lateral_convs.4.bn.running_mean, 
neck.3.lateral_convs.4.bn.running_var, neck.3.fpn_convs.0.conv.weight, neck.3.fpn_convs.0.bn.weight, neck.3.fpn_convs.0.bn.bias, neck.3.fpn_convs.0.bn.running_mean, neck.3.fpn_convs.0.bn.running_var, neck.3.fpn_convs.1.conv.weight, neck.3.fpn_convs.1.bn.weight, neck.3.fpn_convs.1.bn.bias, neck.3.fpn_convs.1.bn.running_mean, neck.3.fpn_convs.1.bn.running_var, neck.3.fpn_convs.2.conv.weight, neck.3.fpn_convs.2.bn.weight, neck.3.fpn_convs.2.bn.bias, neck.3.fpn_convs.2.bn.running_mean, neck.3.fpn_convs.2.bn.running_var, neck.3.fpn_convs.3.conv.weight, neck.3.fpn_convs.3.bn.weight, neck.3.fpn_convs.3.bn.bias, neck.3.fpn_convs.3.bn.running_mean, neck.3.fpn_convs.3.bn.running_var, neck.3.fpn_convs.4.conv.weight, neck.3.fpn_convs.4.bn.weight, neck.3.fpn_convs.4.bn.bias, neck.3.fpn_convs.4.bn.running_mean, neck.3.fpn_convs.4.bn.running_var, head.cls_convs.0.conv.weight, head.cls_convs.0.bn.weight, head.cls_convs.0.bn.bias, head.cls_convs.0.bn.running_mean, head.cls_convs.0.bn.running_var, head.cls_convs.1.conv.weight, head.cls_convs.1.bn.weight, head.cls_convs.1.bn.bias, head.cls_convs.1.bn.running_mean, head.cls_convs.1.bn.running_var, head.cls_convs.2.conv.weight, head.cls_convs.2.bn.weight, head.cls_convs.2.bn.bias, head.cls_convs.2.bn.running_mean, head.cls_convs.2.bn.running_var, head.cls_convs.3.conv.weight, head.cls_convs.3.bn.weight, head.cls_convs.3.bn.bias, head.cls_convs.3.bn.running_mean, head.cls_convs.3.bn.running_var, head.reg_convs.0.conv.weight, head.reg_convs.0.bn.weight, head.reg_convs.0.bn.bias, head.reg_convs.0.bn.running_mean, head.reg_convs.0.bn.running_var, head.reg_convs.1.conv.weight, head.reg_convs.1.bn.weight, head.reg_convs.1.bn.bias, head.reg_convs.1.bn.running_mean, head.reg_convs.1.bn.running_var, head.reg_convs.2.conv.weight, head.reg_convs.2.bn.weight, head.reg_convs.2.bn.bias, head.reg_convs.2.bn.running_mean, head.reg_convs.2.bn.running_var, head.reg_convs.3.conv.weight, head.reg_convs.3.bn.weight, head.reg_convs.3.bn.bias, 
head.reg_convs.3.bn.running_mean, head.reg_convs.3.bn.running_var, head.retina_cls.weight, head.retina_cls.bias, head.retina_reg.weight, head.retina_reg.bias
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

Gi-gigi avatar Jun 24 '24 11:06 Gi-gigi