Aditya Malte

Results: 43 comments of Aditya Malte

Did you call from_pretrained on a BertTokenizer object or a PreTrainedTokenizer object?
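For reference, a quick way to see the relationship between the two classes (a minimal sketch, assuming a recent transformers install; the checkpoint name in the comment is only an example):

```python
from transformers import AutoTokenizer, BertTokenizer, PreTrainedTokenizer

# BertTokenizer is a concrete subclass of the abstract PreTrainedTokenizer base,
# so from_pretrained should be called on the concrete class (or on AutoTokenizer,
# which resolves the right class from the checkpoint's config).
print(issubclass(BertTokenizer, PreTrainedTokenizer))  # True

# e.g. tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# or   tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
```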

Hmm, that’s strange. What are your versions of Transformers and Tokenizers? Also, why use a cache_dir, if you’re not downloading from S3?
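For debugging reports like this, a quick sketch of how to print the installed versions (assuming both packages are installed):

```python
import transformers
import tokenizers

# Report the installed versions when filing or debugging an issue.
# cache_dir only controls where downloaded files are stored, so it is
# unnecessary when loading a model from a local directory.
print(transformers.__version__)
print(tokenizers.__version__)
```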

Wait, so this happens even when you’re not running it on Colab, with all other things remaining the same? Then I think it might be an issue with your environment. Also, just yesterday (or...

I strongly agree with you, and I too feel that the community should go in an OOP direction (rather than the CLI way; we’re all using abstractions now). Do...

Hi, just change the config variable in this Colab notebook to adjust the number of layers: https://gist.github.com/aditya-malte/2d4f896f471be9c38eb4d723a710768b#file-smallberta_pretraining-ipynb Thanks
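The idea in the notebook can be sketched roughly as follows; the exact values below are illustrative assumptions, not the notebook's own numbers:

```python
from transformers import RobertaConfig, RobertaForMaskedLM

# A small RoBERTa built from scratch: shrinking num_hidden_layers (and the
# other size knobs) is what makes the model fit on modest hardware.
config = RobertaConfig(
    vocab_size=52_000,
    num_hidden_layers=4,       # reduce this to shrink the model
    num_attention_heads=8,     # must divide hidden_size evenly
    hidden_size=512,
    intermediate_size=2048,
    max_position_embeddings=514,
)
model = RobertaForMaskedLM(config)
print(model.config.num_hidden_layers)  # 4
```

The model here is initialized with random weights; the notebook then pretrains it with the usual Trainer loop.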

Hello, I have uploaded the final working version after making MODEL/OUTPUT_DIR separate. Please check. Thank you

Hello @kimiyoung, Yes, I'll make the changes shortly and update you on it.

Hello @kimiyoung, I have made the requisite changes that you mentioned and also added IMDB as an example. It's running successfully; please check. Thank you

Working perfectly for the IMDB dataset with max_seq=128 and batch_size=64. Currently testing how far I can push the Colab TPU by increasing max_seq and/or batch_size.