[Docs] How to fine-tune BMN when using ActivityNet data prep method 2

Open valentin-fngr opened this issue 1 year ago • 2 comments

The doc issue

Hi, I am trying to fine-tune BMN on my custom dataset. I know this has already been raised in other issues, but none of them helped me solve my problem.

Data Prep

I have prepared my custom dataset following ActivityNet data preparation method 2 (Option 2).

At the end, I obtain exactly this structure:

(if Option 2 used)
│   │   ├── anet_train_video.txt
│   │   ├── anet_val_video.txt
│   │   ├── anet_train_clip.txt
│   │   ├── anet_val_clip.txt
│   │   ├── activity_net.v1-3.min.json
│   │   ├── mmaction_feat
│   │   │   ├── v___c8enCfzqw.csv
│   │   │   ├── v___dXUJsj3yo.csv
│   │   │   ├── ..
│   │   ├── rawframes
│   │   │   ├── v___c8enCfzqw
│   │   │   │   ├── img_00000.jpg
│   │   │   │   ├── flow_x_00000.jpg
│   │   │   │   ├── flow_y_00000.jpg
│   │   │   │   ├── ..
│   │   │   ├── ..
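Before touching the config, it can help to confirm that every video folder under rawframes has a matching feature file under mmaction_feat. The following is a minimal sketch that assumes only the folder names shown in the tree above; the helper name `find_unmatched` and the `root` argument are hypothetical and are not part of mmaction2.

```python
from pathlib import Path
from typing import List, Tuple


def find_unmatched(root: str) -> Tuple[List[str], List[str]]:
    """Compare rawframes/<video>/ folders with mmaction_feat/<video>.csv files.

    Returns two sorted lists:
      1. videos that have raw frames but no feature CSV
      2. videos that have a feature CSV but no raw frames
    `root` is the directory that contains both sub-folders (assumed layout).
    """
    base = Path(root)
    feat_dir = base / "mmaction_feat"
    frame_dir = base / "rawframes"

    # "v___c8enCfzqw.csv" -> "v___c8enCfzqw"
    feat_names = {p.stem for p in feat_dir.glob("*.csv")}
    # Each video is expected to be a sub-directory of rawframes
    frame_names = {p.name for p in frame_dir.iterdir() if p.is_dir()}

    return sorted(frame_names - feat_names), sorted(feat_names - frame_names)
```

Calling `find_unmatched("path/to/dataset")` with both lists empty suggests the two folders are consistent; it does not check the contents of the CSV files or the annotation JSON.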

For fine-tuning BMN:

  • I need to modify the config file. I use RawframeDataset instead of ActivityNetDataset. Is that correct? (If not, please explain.)
  • When using RawframeDataset, what should my pipeline look like? The current pipeline looks like this:
train_pipeline = [
    dict(type='LoadLocalizationFeature'),
    dict(type='GenerateLocalizationLabels'),
    dict(
        type='PackLocalizationInputs',
        keys=('gt_bbox', ),
        meta_keys=('video_name', ))
]

which cannot work with RawframeDataset. How can I replicate that pipeline with RawframeDataset?

Please provide any information that can help me successfully fine-tune the model using data preparation method 2 for ActivityNet. This is very confusing, and I would love to contribute an end-to-end tutorial once I succeed.

Best,

Valentin

Suggest a potential alternative/fix

No response

valentin-fngr, Feb 19 '24 16:02

Hello @valentin-fngr, may I ask how you created a custom dataset with the same structure as ActivityNet? I am working on a temporal action localization project, but I cannot get my own data into the ActivityNet structure. Thank you so much!

Perceval-Wilhelm, Mar 20 '24 04:03

Hi @valentin-fngr, can you share the details of your data preparation? I ran into an issue when extracting features for my own dataset.

PopGreen69, May 08 '24 09:05

@sirrtt @PopGreen69 Hi both. I gave up on this because it was far too complex to set up. Instead, I used a standard TSN recognition network with a sliding-window pipeline. See demo/long_video_demo.py, which demonstrates how to detect actions in a long video; the setup there is much easier.
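
For anyone taking the same route, here is a minimal sketch of the sliding-window idea in plain Python. It is not the code from demo/long_video_demo.py: the names `sliding_windows`, `detect_segments`, and `score_fn` are hypothetical, and `score_fn` stands in for whatever recognizer (for example a TSN model) returns a confidence for the target action on a range of frames.

```python
from typing import Callable, List, Tuple


def sliding_windows(num_frames: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return [start, end) frame ranges that cover the whole video."""
    if num_frames <= 0:
        return []
    if num_frames <= window:
        return [(0, num_frames)]
    starts = list(range(0, num_frames - window + 1, stride))
    # Add one last window so the tail of the video is not skipped
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)
    return [(s, s + window) for s in starts]


def detect_segments(
    num_frames: int,
    window: int,
    stride: int,
    score_fn: Callable[[int, int], float],  # hypothetical: confidence for frames [start, end)
    threshold: float,
) -> List[Tuple[int, int, float]]:
    """Score each window and merge overlapping or touching positive windows."""
    segments: List[Tuple[int, int, float]] = []
    for start, end in sliding_windows(num_frames, window, stride):
        score = score_fn(start, end)
        if score < threshold:
            continue
        if segments and start <= segments[-1][1]:
            # Window overlaps or touches the previous segment: extend it
            prev_start, prev_end, prev_score = segments[-1]
            segments[-1] = (prev_start, max(prev_end, end), max(prev_score, score))
        else:
            segments.append((start, end, score))
    return segments
```

Each returned tuple is (start_frame, end_frame, best_score). The window length and stride bound the temporal precision, so this is coarser than a proposal network such as BMN, but it only needs a trained clip classifier.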

valentin-fngr avatar May 12 '24 09:05 valentin-fngr