Results 8 issues of wutaiqiang

Keeping this updated in real time is really not easy. Kudos to you, and thanks!!!

Great work. I would like to introduce two papers: Name: Weight-Inherited Distillation for Task-Agnostic BERT Compression paper: code: https://github.com/wutaiqiang/WID-NAACL2024 Blog: https://zhuanlan.zhihu.com/p/687294843 TL;DR: It uses weight inheritance to compress the model, directly learning a mapping from the teacher model's weights to the student model's weights. Name: Rethinking Kullback-Leibler Divergence in...

Hey, I noticed that you did not use positional encoding in the model, but the original Transformer model used sinusoidal positional encoding. Why didn't you use that?...

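After connecting an external monitor, the tool does not seem to cover the secondary screen, and the mouse and keyboard can still be used. This could be improved a bit. A great tool, thanks~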

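Can't pickle local object '__init__..' When LoRA is used with DDP, the lambda function cannot be pickled. I suggest adding the `if dropout > 0` check directly in the forward function instead of at initialization, which avoids the lambda function altogether.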
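The suggestion above can be sketched in plain Python. This is only a minimal illustration, not the actual LoRA code: the class names `LambdaInInit` and `BranchInForward`, the `drop` helper, and `can_pickle` are all hypothetical stand-ins, and a list of floats replaces a real tensor. It shows why a lambda created inside `__init__` breaks pickling (which multi-process launchers may rely on) and how moving the `dropout > 0` check into `forward` avoids it.

```python
import pickle
import random


def drop(xs, p):
    """Inverted dropout on a list of floats (stand-in for a real dropout layer)."""
    return [0.0 if random.random() < p else x / (1.0 - p) for x in xs]


class LambdaInInit:
    """Hypothetical layer: chooses its dropout callable at initialization."""

    def __init__(self, p=0.0):
        if p > 0:
            self.dropout = lambda xs: drop(xs, p)
        else:
            # Identity lambda: a local object of __init__, which pickle cannot serialize.
            self.dropout = lambda xs: xs

    def forward(self, xs):
        return self.dropout(xs)


class BranchInForward:
    """Hypothetical layer: stores only the rate and branches inside forward."""

    def __init__(self, p=0.0):
        self.p = p  # plain data, picklable

    def forward(self, xs):
        if self.p > 0:
            return drop(xs, self.p)
        return xs


def can_pickle(obj):
    """Return True if pickle.dumps succeeds on obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print("lambda in __init__ picklable:", can_pickle(LambdaInInit(0.0)))
    print("branch in forward picklable:", can_pickle(BranchInForward(0.1)))
```

Both variants behave the same when called; the only difference is that the second keeps no function object on the instance, so there is nothing unpicklable to serialize.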

Very nice work. Why not continue it?

How about listing the package versions in requirements.txt? For example, torch==1.10.1