Xiaoming
This repo doesn't support adversarial training yet. Normally, for AT, perturbations are added to the input embeddings, and depending on whether it's AT or VAT, a different perturbation update strategy will...
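To illustrate the AT case only, here is a minimal NumPy sketch of one common choice: an L2-normalized step along the loss gradient with respect to the input embeddings. This is an assumption about what such support could look like, not something this repo implements. The function name `at_perturbation`, the `epsilon` value, and the random stand-in gradient are all hypothetical, and VAT is not covered here.

```python
import numpy as np


def at_perturbation(grad: np.ndarray, epsilon: float = 1.0) -> np.ndarray:
    """Return an L2-normalized perturbation along the gradient, one per example.

    grad: gradient of the loss w.r.t. the input embeddings,
          shape (batch, seq_len, dim). Hypothetical helper, not a repo API.
    """
    # Per-example L2 norm over all non-batch dimensions.
    norm = np.linalg.norm(grad.reshape(grad.shape[0], -1), axis=1)
    # Reshape to (batch, 1, 1, ...) so it broadcasts against grad.
    norm = norm.reshape(-1, *([1] * (grad.ndim - 1)))
    # Small constant avoids division by zero for an all-zero gradient.
    return epsilon * grad / (norm + 1e-12)


# Toy usage: random values stand in for real embeddings and gradients.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(2, 4, 8))
grad = rng.normal(size=(2, 4, 8))
perturbed_embeddings = embeddings + at_perturbation(grad, epsilon=0.5)
```

In a real training loop the gradient would come from the framework's autodiff, and the model would be run a second time on `perturbed_embeddings` to compute the adversarial loss.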
@Yuqi92 Could you share the environment settings (e.g. OS, versions) of your experiment so I can reproduce the issue? Meanwhile, please also check whether the submodule code under...
I have reproduced the issue when the xlnet submodule is using the latest commit instead of commit [8f17cf](https://github.com/zihangdai/xlnet/commit/8f17cfd59963a2c68fc76b00ac8f1d66e50b09ee). The latest change has updated the xlnet_extension_tf repo to use the newer commit [4c83f2f](https://github.com/zihangdai/xlnet/commit/4c83f2f688a59576eb3b479228a647e5ed3315a2)...
This might be related to the issue you're discussing in #79.
> > this might be related to issue you're discussing in #79 > > Have you also...
Could you reproduce the issue locally? It might not be specific to Colab. On Thu, Sep 10, 2020 at 11:37 AM imagine3D-ai wrote: > > > > Have you...
You can also try training the model on CPU and see if it still crashes. On Thu, Sep 10, 2020 at 11:47 AM imagine3D-ai wrote: > > > >...
Normally, 4 x 32G of GPU memory. You can try the base model with a small batch size and see how it works. On Fri, Sep 11, 2020 at 9:09 AM imagine3D-ai...
> > this might be related to issue you're discussing in #79 > > Is max_sequence_length measured in characters or in words? It should be measured in subwords.
Thanks for the debugging! Yes, max_seq_length should be larger than max_query_length + 3. Let me add the parameter constraints later. On Thu, Apr 22, 2021 at 3:34 PM Stefan...
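Until that constraint is added, a check along these lines could be run before training. This is only a sketch: the function name `check_seq_lengths` is hypothetical and not part of the repo, and the comment about what the 3 extra positions are for is an assumption (presumably the special tokens added around the query and context).

```python
def check_seq_lengths(max_seq_length: int, max_query_length: int) -> None:
    """Raise ValueError unless max_seq_length > max_query_length + 3.

    Hypothetical helper. The "+ 3" is taken from the comment above;
    the assumption is that it reserves room for special tokens.
    """
    min_required = max_query_length + 3
    if max_seq_length <= min_required:
        raise ValueError(
            f"max_seq_length ({max_seq_length}) must be larger than "
            f"max_query_length + 3 ({min_required})"
        )


# Passes silently: 512 > 64 + 3.
check_seq_lengths(max_seq_length=512, max_query_length=64)
```

Failing early with a clear message is easier to debug than a crash deep inside preprocessing.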