shizhediao
# 🌟 New adapter setup

## Model description

The ACL 2021 paper [Taming Pre-trained Language Models with N-gram Representations for Low-Resource Domain Adaptation](https://aclanthology.org/2021.acl-long.259.pdf) introduces a **T**ransformer-based **D**omain-aware **N**-gram **A**daptor, T-DNA,...
Hi authors, thanks for your great work! I am trying to reproduce the results, but I found the irtr testing too slow: it seems to need 38 hours...
Fixed a typo
Hi, may I ask how long it takes to train Vicuna-7B and Vicuna-13B? Also, what is the price of the GPUs you are using? Thank you!
Added new features:
1. Encoder-decoder architecture fine-tuning (e.g., T5-based models)
2. ChatGLM inference
3. Vicuna inference
1. Update DeepSpeed inference
2. Only the main process requires input
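The "only the main process requires input" change is commonly implemented by gating user input on the distributed rank and broadcasting the result to the other ranks. A minimal sketch of that pattern, assuming a `torch.distributed`-style launcher that sets the `RANK` environment variable (the `read_prompt` helper and the hard-coded prompt are illustrative stand-ins, not the repo's actual code):

```python
import os

def is_main_process() -> bool:
    # In a typical distributed launch, the launcher sets RANK (or LOCAL_RANK);
    # rank 0 is conventionally treated as the main process.
    return int(os.environ.get("RANK", "0")) == 0

def read_prompt() -> str:
    # Only the main process reads user input; the other ranks would receive
    # the text via a broadcast (e.g., torch.distributed.broadcast_object_list)
    # instead of blocking on their own input() call.
    if is_main_process():
        return "example prompt"  # stand-in for input("User: ")
    return ""  # non-main ranks wait for the broadcasted prompt
```

Without this gate, every rank blocks on `input()` and the job hangs, which is why the fix restricts input handling to rank 0.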
Hi, I found that training Transformers with Adam is three times slower than with Adafactor. Here is the command I am using for Adam:

```
t2t-trainer \
  --data_dir=./t2t/t2t_data \
  --problem=translate_ende_wmt32k...
```
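A speed gap in this direction is plausible: one well-documented difference is that Adam keeps two full-size moment tensors per parameter, while Adafactor stores a factored second moment (a row vector plus a column vector) and, in its default configuration, no first moment, so its per-step optimizer-state work is far smaller. A back-of-the-envelope sketch of the state sizes (pure arithmetic, not a benchmark; this alone does not prove the 3x wall-clock gap):

```python
def adam_state_elems(rows: int, cols: int) -> int:
    # Adam keeps two full moment estimates (m and v) per parameter matrix.
    return 2 * rows * cols

def adafactor_state_elems(rows: int, cols: int) -> int:
    # Adafactor factors the second moment of an (rows x cols) matrix into
    # a row statistic and a column statistic, and by default keeps no
    # first-moment tensor.
    return rows + cols

# For a single 512 x 2048 weight matrix:
#   Adam:      2 * 512 * 2048 = 2,097,152 extra elements
#   Adafactor:     512 + 2048 =     2,560 extra elements
```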
Hi, when I am trying to reproduce the Adafactor experiments on the en-de translation task, I encounter the following issue: `AttributeError: 'AdafactorOptimizer' object has no attribute 'get_gradients'`. Could anyone tell...
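One guess at the cause (an illustration only, not a confirmed diagnosis of the t2t code): `get_gradients` is a Keras-optimizer method, while `tf.train.Optimizer` subclasses expose `compute_gradients` instead, so code written against the Keras interface fails with exactly this `AttributeError` when handed a TF-style optimizer. A minimal stand-in reproducing the interface mismatch, with all class names hypothetical:

```python
class KerasStyleOptimizer:
    # Keras optimizers expose get_gradients(loss, params).
    def get_gradients(self, loss, params):
        return [0.0 for _ in params]  # placeholder gradients

class TFStyleOptimizer:
    # tf.train.Optimizer subclasses expose compute_gradients(loss) instead,
    # so callers expecting the Keras interface hit an AttributeError.
    def compute_gradients(self, loss):
        return [(0.0, p) for p in ["w", "b"]]

def call_keras_interface(opt):
    # Mimics calling code that assumes the Keras-style method exists.
    try:
        return opt.get_gradients(None, ["w", "b"])
    except AttributeError:
        return "AttributeError: no get_gradients"
```

If that is what is happening here, the fix would be to call the optimizer through the interface it actually implements, or wrap it accordingly.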
Hi, I am a little confused about why we should set `REFERENCE_TEST_TRANSLATE_DIR=t2t_local_exp_runs_dir_master/t2t_datagen/dev/newstest2014-deen-ref.en.sgm`, because in my mind the reference should be `de.sgm`. Do you have any idea? Thanks!