Zhibin Gou

5 comments by Zhibin Gou

Thanks for following our work! The DuLeMon dataset has already been open-sourced; please see [this PR](https://github.com/PaddlePaddle/Research/pull/252). The code is coming soon :)

> Hi there, thanks for writing such an interesting paper. I was wondering if you had any plan to create an English version of the DuLeMon dataset. I also look...

Thanks for your interest! The open-sourced data is the complete dataset; the "raw data of the first conversation" you mention does not exist, because the first interaction is **assumed** to have taken place, with the extracted personas of both parties already stored in memory.

Same issue: with DeepSpeed ZeRO Stage 3 + the Transformers `Trainer`, we can't correctly save the **final model** weights after training with `trainer.save_model()`. However, the checkpoints saved during training work fine.

Sure. Simply use the official ZeRO Stage 3 config with `stage3_gather_16bit_weights_on_model_save` set to `true`, following [this guide](https://huggingface.co/docs/transformers/main/main_classes/deepspeed#getting-the-model-weights-out). Then use the Hugging Face Trainer to train GPT-2 or LLaMA (or any model) with...
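For reference, a minimal sketch of the relevant part of the DeepSpeed config (other fields such as optimizer, scheduler, and batch-size settings are omitted here and depend on your setup):

```json
{
  "zero_optimization": {
    "stage": 3,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "bf16": {
    "enabled": "auto"
  },
  "train_batch_size": "auto"
}
```

With this flag enabled, DeepSpeed gathers the partitioned 16-bit weights onto a single process when the model is saved, so `trainer.save_model()` writes a consolidated checkpoint instead of sharded ZeRO-3 state. Note that gathering adds memory and time overhead at save time; if you leave it `false`, you can still reconstruct full fp32 weights afterwards from a training checkpoint with the `zero_to_fp32.py` script that DeepSpeed places in the checkpoint directory.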