Possible to train in MultiGPU (using DataParallel)?

Open cesarandreslopez opened this issue 4 years ago • 2 comments

I was attempting to train this on multiple GPUs by changing the model setup to:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleModel(num_classes=dls.c, seq_len=seq_len)
model = nn.DataParallel(model)
model.to(device)

This fails with the error 'DataParallel' object has no attribute 'encoder' when setting up the Learner.
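
For context, the error is consistent with how `nn.DataParallel` behaves: it wraps the model and keeps the original under `.module`, so any code that reads `model.encoder` on the wrapper fails. The sketch below reproduces this with a hypothetical `ToyModel` (a stand-in, not the repo's `SimpleModel`) and a small hypothetical `unwrap` helper; it illustrates the cause, not a confirmed fix for this repo.

```python
import torch
from torch import nn


class ToyModel(nn.Module):
    """Hypothetical stand-in for SimpleModel: an encoder followed by a head."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.encoder = nn.Linear(8, 4)
        self.head = nn.Linear(4, num_classes)

    def forward(self, x):
        return self.head(torch.relu(self.encoder(x)))


model = ToyModel()
wrapped = nn.DataParallel(model)

# The wrapper does not forward attribute lookups to the wrapped model,
# so reading `wrapped.encoder` raises AttributeError.
try:
    wrapped.encoder
    lookup_failed = False
except AttributeError:
    lookup_failed = True

# The original model stays reachable through `.module`.
same_encoder = wrapped.module.encoder is model.encoder


def unwrap(m: nn.Module) -> nn.Module:
    """Return the underlying model whether or not `m` is wrapped (hypothetical helper)."""
    return m.module if isinstance(m, nn.DataParallel) else m
```

Under that assumption, one option is to build the Learner with the unwrapped model and only wrap it for training, so that setup code still sees `encoder`.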

Does anyone here have a sample for training on multiple GPUs?

cesarandreslopez avatar Jun 27 '21 11:06 cesarandreslopez

Let me look into this tomorrow. In the meantime, take a look here: https://docs.fast.ai/distributed.html

tcapelle avatar Jun 27 '21 20:06 tcapelle

Wow, that's pretty low. It depends on the split of your train/valid sets. Are you using the default random split? Set a seed so you are sure to be comparing apples to apples. Also, the TimesFormer is tricky to train; try a linear schedule.
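
To illustrate the seed advice, here is a minimal sketch of a seeded random train/valid split in plain Python. The function name `seeded_split` and its defaults are hypothetical and not taken from the repo; the point is only that a fixed seed gives both models the same validation set.

```python
import random


def seeded_split(n_items: int, valid_pct: float = 0.2, seed: int = 42):
    """Shuffle indices with a fixed seed and return (train_idx, valid_idx)."""
    rng = random.Random(seed)  # local generator, leaves global random state untouched
    idxs = list(range(n_items))
    rng.shuffle(idxs)
    n_valid = int(n_items * valid_pct)
    return idxs[n_valid:], idxs[:n_valid]


# Two calls with the same seed produce the same split, so two models
# evaluated on `valid_a` and `valid_b` are scored on identical items.
train_a, valid_a = seeded_split(100)
train_b, valid_b = seeded_split(100)
```

In fastai, `RandomSplitter` accepts a `seed` argument that serves the same purpose.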

On Fri, Jul 30, 2021, 5:13 PM wenjun90 wrote:

Hi @tcapelle, can you explain why the score of the TimesFormer model (acc: 77%) is lower than that of the Baseline model (91%), please?

Thank you.

tcapelle avatar Jul 31 '21 09:07 tcapelle