Thilina Rajapakse comments

Results: 57 comments of Thilina Rajapakse

Model performance degrades when moved to Multi-GPU

Sorry, I am not sure why this is happening. I recommend trying the Simple Transformers library, as it supports multi-GPU training by default and I have used multi-GPU...

How to avoid CUDA out of memory error for large batch sizes?

The code in this repo was not written to support multi-GPU training (mainly because I only have one GPU). However, the code that it is [based on](https://github.com/huggingface/transformers/blob/master/examples/run_glue.py) does support multiple GPUs....

Validating the model

You can get the training loss without any changes. You can use `tensorboardX` to get a graph of the training loss. The loss information is being written to the 'runs'...

Validating the model

There should be a subdirectory inside runs for every training run. So your command would look like `tensorboard --logdir=runs/subdirectory`. To visualize the last run, you can use the line below....
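Independently of the command mentioned in the comment, here is a minimal sketch for locating the most recently modified subdirectory of `runs` and printing the matching `tensorboard` command. It assumes each training run writes to its own subdirectory, as described above. `latest_run_dir` is a hypothetical helper written for illustration, not something from the repo.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(runs_root: str = "runs") -> Optional[Path]:
    """Return the most recently modified subdirectory of runs_root, or None."""
    root = Path(runs_root)
    if not root.is_dir():
        return None
    # Only directories count as runs; stray files are ignored.
    subdirs = [p for p in root.iterdir() if p.is_dir()]
    if not subdirs:
        return None
    # Modification time is used as a proxy for "last run" (an assumption).
    return max(subdirs, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    run = latest_run_dir()
    if run is None:
        print("No run directories found under 'runs'")
    else:
        print(f"tensorboard --logdir={run}")
```

Modification time is only a heuristic. If the subdirectory names include timestamps, sorting by name may be more reliable.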

Validating the model

Great to see you got it to work. I didn't realize you were on Colab!

ERROR:pytorch_transformers.tokenization_utils:Couldn't reach server to download vocabulary.

This repo is no longer actively maintained. Please use [Simple Transformers](https://github.com/ThilinaRajapakse/) instead.

SummaryWriter Import Missing from Gist

The recommended method is to get the code from the GitHub repo since the code in the article is for demonstration only. I do understand that sometimes you just want...

SummaryWriter Import Missing from Gist

Hope it goes well!

AttributeError: module 'torch.nn.functional' has no attribute 'one_hot'

It looks like your PyTorch is out of date. Can you update it and try again?
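To confirm whether the installed PyTorch is the problem, a quick check along these lines may help. It is only a sketch: `check_one_hot` is a hypothetical helper, and it just reports the installed version and whether `torch.nn.functional` exposes `one_hot`.

```python
import importlib


def check_one_hot():
    """Return (torch_version, has_one_hot); (None, False) if torch is not installed."""
    try:
        torch = importlib.import_module("torch")
        functional = importlib.import_module("torch.nn.functional")
    except ImportError:
        return None, False
    return str(torch.__version__), hasattr(functional, "one_hot")


if __name__ == "__main__":
    version, available = check_one_hot()
    if version is None:
        print("PyTorch is not installed in this environment")
    elif available:
        print(f"PyTorch {version} provides torch.nn.functional.one_hot")
    else:
        print(f"PyTorch {version} lacks one_hot; try upgrading PyTorch")
```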

AttributeError: module 'torch.nn.functional' has no attribute 'one_hot'

Unfortunately, with no GPU your training speed will be _slow_. I can't remember the total number of steps, but it should be there in the output right before training starts....