Allen
When I modify the original finetune.py script to do full fine-tuning, it returns an error like the one below: [image] I commented out everything related to peft except `model=prepare_model_for_int8_training(model)` and ```...
[image] `python generate.py` [image] transformers: 4.28.0
There seems to be a bug in the evaluate function, as shown in the following: Since it only calculates the metric of the last batch in the evaluation set, it may be altered to...
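To illustrate the kind of bug described here, the following is a minimal sketch, not the repository's actual evaluate function and not the fix proposed in the truncated text above. It assumes a hypothetical batch layout of `(predictions, labels)` lists and uses accuracy as the metric. The function names are made up for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical batch layout: (predictions, labels). Not taken from the repository.
Batch = Tuple[List[int], List[int]]


def accuracy_last_batch_only(batches: Iterable[Batch]) -> float:
    """Buggy pattern: the metric is overwritten on every iteration."""
    acc = 0.0
    for preds, labels in batches:
        correct = sum(p == y for p, y in zip(preds, labels))
        acc = correct / len(labels)  # overwritten, so only the last batch survives
    return acc


def accuracy_all_batches(batches: Iterable[Batch]) -> float:
    """Accumulate counts over every batch, then divide once at the end."""
    correct = 0
    total = 0
    for preds, labels in batches:
        correct += sum(p == y for p, y in zip(preds, labels))
        total += len(labels)
    return correct / total if total else 0.0


# Two batches of different sizes: 2 of 4 correct, then 2 of 2 correct.
batches = [([1, 0, 1, 1], [1, 0, 0, 0]), ([1, 1], [1, 1])]
print(accuracy_last_batch_only(batches))  # reflects the last batch only
print(accuracy_all_batches(batches))      # reflects all 6 examples
```

Accumulating raw counts rather than averaging per-batch scores keeps the result correct when the final batch is smaller than the others.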
Hello, and thank you for contributing the code. I ran the GSM code on cnews10 and have a question: the KL divergence vanishes (it is essentially 0) and the topic discovery results are very poor. Is this an implementation problem? I am using the default parameters. [image] topic diversity:0.03866666666666667 c_v:0.7579875287637481, c_w2v:None, c_uci:-18.122450398623315, c_npmi:-0.6600369278689214 mimno topic coherence:-326.14847513585073 The TD and NPMI values show that something is wrong with the model.
How can I implement a mini-batch version of GVAE?
Wonderful job, thanks! Could you please share the code for building the dependency tree?
**Is your feature request related to a problem? Please describe.** The GPT implementation by Hugging Face differs from T5 and RoBERTa because it implements a self-attention calculator in a...
Hi, thanks for your great work. I have reviewed your paper and code. You said in the paper that STUNT uses early stopping to achieve better performance by...