shizhediao
# 🌟 New adapter setup

## Model description

The ACL 2021 paper [Taming Pre-trained Language Models with N-gram Representations for Low-Resource Domain Adaptation](https://aclanthology.org/2021.acl-long.259.pdf) introduces a **T**ransformer-based **D**omain-aware **N**-gram **A**daptor, T-DNA,...
Hi authors, thanks for your great work! I am trying to reproduce the results, but I found the irtr testing too slow: it seems to need 38 hours...
Fixed a typo
Hi, may I ask how long it takes to train Vicuna-7B and Vicuna-13B? Also, what is the price of the GPUs you are using? Thank you!
Added new features:
1. Encoder-decoder architecture fine-tuning (e.g., T5-based models)
2. ChatGLM inference
3. Vicuna inference
1. Update DeepSpeed inference
2. Only the main process requires input
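The "only the main process requires input" change is commonly implemented by gating user input on the distributed rank and broadcasting the result to the other ranks. A minimal sketch of that pattern, assuming a `torch.distributed`-style launcher that sets the `RANK` environment variable (the `read_prompt` helper and the hard-coded prompt are illustrative stand-ins, not the repo's actual code):

```python
import os

def is_main_process() -> bool:
    # In a typical distributed launch, the launcher sets RANK (or LOCAL_RANK);
    # rank 0 is conventionally treated as the main process.
    return int(os.environ.get("RANK", "0")) == 0

def read_prompt() -> str:
    # Only the main process reads user input; the other ranks would receive
    # the text via a broadcast (e.g., torch.distributed.broadcast_object_list)
    # instead of blocking on their own input() call.
    if is_main_process():
        return "example prompt"  # stand-in for input("User: ")
    return ""  # non-main ranks wait for the broadcasted prompt
```

Without this gate, every rank blocks on `input()` and the job hangs, which is why the fix restricts input handling to rank 0.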
Hi, I found that training Transformers with Adam is three times slower than with Adafactor. Here is the command I am using for Adam:

```
t2t-trainer \
  --data_dir=./t2t/t2t_data \
  --problem=translate_ende_wmt32k...
```
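A speed gap in this direction is plausible: one well-documented difference is that Adam keeps two full-size moment tensors per parameter, while Adafactor stores a factored second moment (a row vector plus a column vector) and, in its default configuration, no first moment, so its per-step optimizer-state work is far smaller. A back-of-the-envelope sketch of the state sizes (pure arithmetic, not a benchmark; this alone does not prove the 3x wall-clock gap):

```python
def adam_state_elems(rows: int, cols: int) -> int:
    # Adam keeps two full moment estimates (m and v) per parameter matrix.
    return 2 * rows * cols

def adafactor_state_elems(rows: int, cols: int) -> int:
    # Adafactor factors the second moment of an (rows x cols) matrix into
    # a row statistic and a column statistic, and by default keeps no
    # first-moment tensor.
    return rows + cols

# For a single 512 x 2048 weight matrix:
#   Adam:      2 * 512 * 2048 = 2,097,152 extra elements
#   Adafactor:     512 + 2048 =     2,560 extra elements
```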
Hi, when I am trying to reproduce the Adafactor experiments on the en-de translation task, I encounter the following issue: `AttributeError: 'AdafactorOptimizer' object has no attribute 'get_gradients'`. Could anyone tell...
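One guess at the cause (an illustration only, not a confirmed diagnosis of the t2t code): `get_gradients` is a Keras-optimizer method, while `tf.train.Optimizer` subclasses expose `compute_gradients` instead, so code written against the Keras interface fails with exactly this `AttributeError` when handed a TF-style optimizer. A minimal stand-in reproducing the interface mismatch, with all class names hypothetical:

```python
class KerasStyleOptimizer:
    # Keras optimizers expose get_gradients(loss, params).
    def get_gradients(self, loss, params):
        return [0.0 for _ in params]  # placeholder gradients

class TFStyleOptimizer:
    # tf.train.Optimizer subclasses expose compute_gradients(loss) instead,
    # so callers expecting the Keras interface hit an AttributeError.
    def compute_gradients(self, loss):
        return [(0.0, p) for p in ["w", "b"]]

def call_keras_interface(opt):
    # Mimics calling code that assumes the Keras-style method exists.
    try:
        return opt.get_gradients(None, ["w", "b"])
    except AttributeError:
        return "AttributeError: no get_gradients"
```

If that is what is happening here, the fix would be to call the optimizer through the interface it actually implements, or wrap it accordingly.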
Hi, I am a little confused about why we should set `REFERENCE_TEST_TRANSLATE_DIR=t2t_local_exp_runs_dir_master/t2t_datagen/dev/newstest2014-deen-ref.en.sgm`, because in my mind the reference should be `de.sgm`. Do you have any idea? Thanks!