Aditya Malte

Results: 43 comments of Aditya Malte

Did you call from_pretrained on a BertTokenizer object or a PreTrainedTokenizer object?
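For reference, a quick way to see the relationship between the two classes (a minimal sketch, assuming a recent transformers install; the checkpoint name in the comment is only an example):

```python
from transformers import AutoTokenizer, BertTokenizer, PreTrainedTokenizer

# BertTokenizer is a concrete subclass of the abstract PreTrainedTokenizer base,
# so from_pretrained should be called on the concrete class (or on AutoTokenizer,
# which resolves the right class from the checkpoint's config).
print(issubclass(BertTokenizer, PreTrainedTokenizer))  # True

# e.g. tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# or   tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
```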

Hmm, that’s strange. What are your versions of Transformers and Tokenizers? Also, why use a cache_dir, if you’re not downloading from S3?
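For debugging reports like this, a quick sketch of how to print the installed versions (assuming both packages are installed):

```python
import transformers
import tokenizers

# Report the installed versions when filing or debugging an issue.
# cache_dir only controls where downloaded files are stored, so it is
# unnecessary when loading a model from a local directory.
print(transformers.__version__)
print(tokenizers.__version__)
```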

Wait, so this happens even when you’re not running it on Colab, with all other things remaining the same? Then I think it might be an issue with your environment. Also, just yesterday (or...

I strongly agree with you, and I too feel that the community should go in an OOP direction (rather than the CLI way; we’re all using abstractions now). Do...

Hi, just change the config variable in this Colab notebook to adjust the number of layers: https://gist.github.com/aditya-malte/2d4f896f471be9c38eb4d723a710768b#file-smallberta_pretraining-ipynb Thanks
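The idea in the notebook can be sketched roughly as follows; the exact values below are illustrative assumptions, not the notebook's own numbers:

```python
from transformers import RobertaConfig, RobertaForMaskedLM

# A small RoBERTa built from scratch: shrinking num_hidden_layers (and the
# other size knobs) is what makes the model fit on modest hardware.
config = RobertaConfig(
    vocab_size=52_000,
    num_hidden_layers=4,       # reduce this to shrink the model
    num_attention_heads=8,     # must divide hidden_size evenly
    hidden_size=512,
    intermediate_size=2048,
    max_position_embeddings=514,
)
model = RobertaForMaskedLM(config)
print(model.config.num_hidden_layers)  # 4
```

The model here is initialized with random weights; the notebook then pretrains it with the usual Trainer loop.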

Hello, I have uploaded the final working version after making MODEL/OUTPUT_DIR separate. Please check. Thank you

Hello @kimiyoung, Yes, I'll make the changes shortly and update you on it.

Hello @kimiyoung, I have made the requisite changes that you mentioned and also added IMDB as an example. It's running successfully; please check. Thank you

Working perfectly for the IMDB dataset with max_seq=128 and batch_size=64. Currently testing how far I can push the Colab TPU by increasing max_seq and/or batch_size.