mmaction2
[Docs] How to fine-tune BMN when using ActivityNet data prep method 2
The doc issue
Hi, I am trying to fine-tune BMN on my custom dataset. I know this has already come up in other issues, but none of them helped me solve my problem.
Data Prep
I have prepared my custom dataset following ActivityNet data preparation method 2.
At the end, I obtain exactly this structure:
(if Option 2 used)
│ │ ├── anet_train_video.txt
│ │ ├── anet_val_video.txt
│ │ ├── anet_train_clip.txt
│ │ ├── anet_val_clip.txt
│ │ ├── activity_net.v1-3.min.json
│ │ ├── mmaction_feat
│ │ │ ├── v___c8enCfzqw.csv
│ │ │ ├── v___dXUJsj3yo.csv
│ │ │ ├── ..
│ │ ├── rawframes
│ │ │ ├── v___c8enCfzqw
│ │ │ │ ├── img_00000.jpg
│ │ │ │ ├── flow_x_00000.jpg
│ │ │ │ ├── flow_y_00000.jpg
│ │ │ │ ├── ..
│ │ │ ├── ..
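A quick way to sanity-check a layout like the one above is to confirm that every folder under rawframes has a matching CSV under mmaction_feat. The following is a minimal sketch, not part of mmaction2: the function name check_feature_coverage and the data_root argument are hypothetical, and it assumes only the two folder names shown in the tree.

```python
from pathlib import Path
from typing import List, Tuple


def check_feature_coverage(data_root: str) -> Tuple[List[str], List[str]]:
    """Compare video IDs found in rawframes/ and mmaction_feat/.

    Hypothetical helper (not an mmaction2 API). Returns two sorted lists:
    videos that have frames but no feature CSV, and videos that have a
    feature CSV but no frame folder.
    """
    root = Path(data_root)
    # Each sub-directory of rawframes/ is named after one video.
    frame_ids = {p.name for p in (root / "rawframes").iterdir() if p.is_dir()}
    # Each CSV in mmaction_feat/ is named <video_id>.csv.
    feature_ids = {p.stem for p in (root / "mmaction_feat").glob("*.csv")}
    missing_features = sorted(frame_ids - feature_ids)
    missing_frames = sorted(feature_ids - frame_ids)
    return missing_features, missing_frames
```

Calling check_feature_coverage on the dataset root and getting two empty lists would suggest the two folders are consistent with each other; it says nothing about whether the CSV contents are valid.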
For fine-tuning BMN:
- I need to modify the config file. I use RawframeDataset instead of ActivityNetDataset. Am I correct? (If not, please explain.)
- When using RawframeDataset, what should my pipeline look like? The current pipeline looks like this:
train_pipeline = [
    dict(type='LoadLocalizationFeature'),
    dict(type='GenerateLocalizationLabels'),
    dict(
        type='PackLocalizationInputs',
        keys=('gt_bbox', ),
        meta_keys=('video_name', ))
]
which cannot work with RawframeDataset. How can I replicate that pipeline with RawframeDataset?
Please provide any information that can help me successfully fine-tune the model using data preparation method 2 for ActivityNet. This is very confusing, and I would love to propose an overall tutorial once I succeed.
best,
Valentin
Suggest a potential alternative/fix
No response
Hello @valentin-fngr, can I ask how you created a custom dataset with the same structure as ActivityNet? I am working on a Temporal Action Localization project, but I cannot get my own custom data into the same structure as ActivityNet. Thank you so much!
Hi @valentin-fngr, can you share the details of your data preparation? I ran into an issue when extracting features from my own dataset.
@sirrtt @PopGreen69 Hi both. I gave up on that because it was far too complex to set up. Instead, I used a classic TSN recognition network with a sliding-window pipeline. See demo/long_video_demo.py, which demonstrates how to detect actions in a long video. The setup there is much easier.
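To illustrate the sliding-window idea, here is a minimal sketch in plain Python. It is not the code from demo/long_video_demo.py. The classify callable is a hypothetical stand-in for a recognizer such as TSN, and the window, stride, and threshold defaults are assumptions chosen only for the example. Windows are scored one by one, and overlapping or adjacent windows with the same label are merged into one segment.

```python
from typing import Callable, List, Tuple

Segment = Tuple[int, int, str]  # (start_frame, end_frame_exclusive, label)


def sliding_window_segments(
    num_frames: int,
    classify: Callable[[int, int], Tuple[str, float]],
    window: int = 32,
    stride: int = 16,
    threshold: float = 0.5,
) -> List[Segment]:
    """Score fixed-size windows and merge same-label neighbours.

    `classify(start, end)` is a hypothetical callable returning
    (label, score) for frames [start, end).
    """
    segments: List[Segment] = []
    last_start = max(num_frames - window, 0)
    for start in range(0, last_start + 1, stride):
        end = min(start + window, num_frames)
        label, score = classify(start, end)
        if score < threshold:
            continue  # skip low-confidence windows
        if segments and segments[-1][2] == label and segments[-1][1] >= start:
            # Same label and touching/overlapping: extend the last segment.
            segments[-1] = (segments[-1][0], end, label)
        else:
            segments.append((start, end, label))
    return segments


def toy_classify(start: int, end: int) -> Tuple[str, float]:
    """Toy recognizer: 'run' before frame 64, 'jump' afterwards."""
    center = (start + end) // 2
    return ("run" if center < 64 else "jump", 0.9)


if __name__ == "__main__":
    print(sliding_window_segments(128, toy_classify))
```

With a real model, classify would sample frames from the window, run the recognizer, and return its top label and score. Because windows overlap, neighbouring segments with different labels can overlap as well, so boundaries are only as precise as the stride.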