Thilina Rajapakse

57 comments by Thilina Rajapakse

Should be pretty similar to adding custom losses. You can freeze all the layers by setting `requires_grad = False` on their parameters in your subclassed model. You can add...
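A minimal sketch of the freezing approach described above, using a stand-in encoder (the class name, sizes, and `nn.Linear` encoder are illustrative, not the actual model from the thread):

```python
import torch
import torch.nn as nn

# Hypothetical subclassed model: a frozen encoder plus a trainable head.
class FrozenEncoderClassifier(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_size: int, num_labels: int):
        super().__init__()
        self.encoder = encoder
        # Freeze every encoder parameter so only the head is updated.
        for param in self.encoder.parameters():
            param.requires_grad = False
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, x):
        # no_grad skips building the autograd graph for the frozen part.
        with torch.no_grad():
            features = self.encoder(x)
        return self.classifier(features)

model = FrozenEncoderClassifier(nn.Linear(16, 32), hidden_size=32, num_labels=2)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the classifier weights remain trainable
```

Remember to pass only the trainable parameters (or rely on the optimizer skipping `requires_grad=False` ones) when constructing the optimizer.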

If it doesn't work, you can always decouple BERT and the CNN and just feed the BERT outputs to the CNN. I'm no expert myself, but you seem to be...
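The decoupled setup above can be sketched as follows; the BERT hidden states are assumed to have been computed separately, and the CNN shape/kernel choices here are illustrative guesses, not the thread's actual architecture:

```python
import torch
import torch.nn as nn

# Stand-in for precomputed BERT outputs: the last hidden state,
# shape (batch, seq_len, hidden_size). 768 matches bert-base.
batch, seq_len, hidden_size = 4, 128, 768
bert_outputs = torch.randn(batch, seq_len, hidden_size)

# A small 1D CNN over the token dimension (filter count and kernel
# size are arbitrary for the sketch).
cnn = nn.Sequential(
    nn.Conv1d(hidden_size, 100, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveMaxPool1d(1),  # max-pool over the sequence dimension
)

# Conv1d expects (batch, channels, length), so swap the last two dims.
features = cnn(bert_outputs.transpose(1, 2)).squeeze(-1)
print(features.shape)  # torch.Size([4, 100])
```

Since the two parts are decoupled, the BERT features can even be cached to disk once and the CNN trained on its own.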

Great! I use the Apex version with C++ extensions. The pure python version is lacking a few features. I don't see any reason not to use the C++ version.

Odd. I never had issues with any Ubuntu-based distros. Welcome to PyTorch!

I don't think I changed batchnorm. Doesn't it get set when you change the opt level? I used opt level O1. O2 was giving me NaN losses.

Yeah, I just kept the defaults there.

You don't need to run `utils.py`. The `readme` tells you which notebooks to run.

No problem. The stuff in `utils` is used in the next notebook.

Those changes should be sufficient to enable multi-gpu training in my experience. Is there any other difference (e.g. batch size) between the two runs?

This is probably a silly question, but did you try this multiple times and receive the same results?