Allen

Results 23 issues of Allen

When I modify the original finetune.py script to run full fine-tuning, it returns an error like the one below: ![image](https://user-images.githubusercontent.com/33925232/232767709-ece24c4e-78c6-47ba-a41b-461d4ac3e24b.png) I commented out everything related to peft except `model=prepare_model_for_int8_training(model)` and ```...

![image](https://user-images.githubusercontent.com/33925232/233402405-fe532a35-61bb-419a-b112-43a465f15118.png) `python generate.py` ![image](https://user-images.githubusercontent.com/33925232/233402505-b4f94b4f-20bc-47fa-b80f-65322f1fa693.png) transformers: 4.28.0

There seems to be a bug in the evaluate function, as shown in the following: since it only calculates the metric on the last batch of the evaluation set, it may need to be altered to...
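The kind of fix this issue points at can be sketched as follows. This is only a minimal illustration, not the repository's actual evaluate function: the name `evaluate`, the `predict` callable, and the choice of accuracy as the metric are all assumptions made for the example. The idea is to accumulate counts over every batch and compute the metric once at the end.

```python
def evaluate(batches, predict):
    """Accuracy over the whole evaluation set, not only the final batch.

    `batches` yields (inputs, labels) pairs; `predict` maps a list of
    inputs to a list of predicted labels. Both are hypothetical stand-ins
    for the real data loader and model.
    """
    correct = 0
    total = 0
    for inputs, labels in batches:
        preds = predict(inputs)
        # Accumulate across batches instead of overwriting per batch.
        correct += sum(int(p == y) for p, y in zip(preds, labels))
        total += len(labels)
    return correct / total if total else 0.0


# Toy usage: the last batch alone scores 0.0, the full set scores 0.75.
toy_batches = [([1, 2, 3], [1, 0, 1]), ([4], [1])]
toy_predict = lambda xs: [x % 2 for x in xs]
print(evaluate(toy_batches, toy_predict))
```

Summing raw counts, instead of averaging per-batch scores, keeps the result correct when the final batch is smaller than the others.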

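Hello, and thank you for contributing this code. I ran the GSM code on cnews10 and have a question: the KL divergence vanishes (it is essentially 0) and the topic discovery quality is very poor. Is this an implementation problem? I used the default parameters. ![image](https://user-images.githubusercontent.com/33925232/143544764-6169c782-1b4d-4e35-a71e-36806b5262a6.png) topic diversity:0.03866666666666667 c_v:0.7579875287637481, c_w2v:None, c_uci:-18.122450398623315, c_npmi:-0.6600369278689214 mimno topic coherence:-326.14847513585073 The TD and NPMI values show that something is wrong with the model.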
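For context on why a low TD value signals a problem, here is a minimal sketch of one common definition of topic diversity: the fraction of unique words among the top-k words of all topics. It assumes this is the definition behind the number reported above, which the issue does not confirm; the function name and the `topk` default are hypothetical.

```python
def topic_diversity(topics, topk=25):
    """Fraction of unique words among the top-`topk` words of all topics.

    `topics` is a list of word lists, each ordered by descending weight.
    A value near 0 means the topics share almost all of their top words.
    """
    top_words = [word for topic in topics for word in topic[:topk]]
    if not top_words:
        return 0.0
    return len(set(top_words)) / len(top_words)


# Toy usage: three topics with heavily overlapping top words.
toy_topics = [["a", "b"], ["a", "b"], ["a", "c"]]
print(topic_diversity(toy_topics, topk=2))  # 3 unique words out of 6
```

Under this definition, a score close to 0 means nearly every topic lists the same top words, which is what one would expect to see if the latent topic distribution has collapsed.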

How can I implement a mini-batch version of GVAE?

Wonderful job, thanks! Could you please share the code for building the dependency tree?

**Is your feature request related to a problem? Please describe.** The GPT implementation by Hugging Face differs from T5 and RoBERTa because it implements a self-attention calculator in a...

Hi, thanks for your great work. I have reviewed your paper and code. You said in the paper that STUNT uses early stopping to achieve better performance by...