Ahmed Elnaggar
Hello. First, congratulations on your work. Second, from what I have seen so far, you only allow BERT-like training and not RoBERTa training. Even if the NSP is set...
Hello, I am trying to run the Jupyter notebook example "introduction.ipynb" in the "codelabs" folder using Google Colab. However, it gives me the following error: ``` 2021-05-12 03:12:03.347414: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully...
**Describe the bug** I am debugging an application in an isolated Docker container and monitoring its memory usage with htop. Unfortunately, Scalene is reporting much less memory...
**Describe the bug** I am trying to debug a Python script, but it crashes during the profile generation step. **To Reproduce** Steps to reproduce the behavior: `python -m scalene --cli...
### Description Hello, I am trying to train a Reformer model using Trax and JAX. The training seems to be fine on Google Colab, but when I run it on Google...
TPU deadlock
Hello, I am trying to train a Reformer model using Trax and JAX. The training fails on Google Colab because of memory limitations. When I run it on a Google Cloud server...
Hello, are there any plans to provide a script for training XLNet on distributed GPUs? Maybe with Horovod or MultiWorkerMirroredStrategy?
Thanks a lot for your great work. It would be great if you could provide an Overleaf template. I tried to import it into Overleaf but a lot of errors...
**Is your feature request related to a problem? Please describe.** Not related to a problem **Describe the solution you'd like** I have a model that has a Bilinear layer and...
In the PyTorch implementation, para_model and model are used interchangeably. Shouldn't you use only para_model? For example, in the training function at line 422 you call "model.zero_grad()" but afterward...