Chujie Zheng

Results 9 comments of Chujie Zheng

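Thanks for the suggestion~ Following online tutorials, I successfully upgraded my desktop to Monterey. The main steps were:
- Customize the USB installer
- Update the drivers
- Update OpenCore and the config.plist file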

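The HDMI+DVI dual-monitor setup doesn't feel very stable, so I'm planning to pick up a driver-free graphics card and connect both monitors over DP.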

Because GPT is a unidirectional language model, it does not need an attention mask.
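One way to read the point above is sketched below; the thread's full context is not shown here, so treat it as an illustration rather than the code under discussion. It assumes padding is appended on the right, and `causal_self_attention` is a hypothetical toy function, not a library API. Because each position attends only to itself and earlier positions, trailing pad tokens never influence the outputs of the real tokens.

```python
import math

def causal_self_attention(x):
    """Toy single-head self-attention with a causal mask (hypothetical helper).

    x is a list of token vectors; queries, keys, and values are all x itself.
    """
    dim = len(x[0])
    outputs = []
    for i, query in enumerate(x):
        # Causal mask: position i only sees positions 0..i.
        visible = x[: i + 1]
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in visible
        ]
        # Numerically stable softmax over the visible positions.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the visible value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, visible)) for d in range(dim)]
        )
    return outputs

# Three "real" tokens, then the same sequence with two right-side pad vectors.
tokens = [[0.1, -0.4, 0.3], [0.7, 0.2, -0.5], [-0.3, 0.9, 0.0]]
padding = [[0.0, 0.0, 0.0]] * 2

out_plain = causal_self_attention(tokens)
out_padded = causal_self_attention(tokens + padding)[: len(tokens)]

# The real-token outputs are unchanged by the trailing padding.
matches = out_plain == out_padded
print(matches)
```

The comparison only covers the real-token positions; the outputs at the pad positions themselves are discarded, which mirrors ignoring them when computing a loss.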

Same here. Is there any solution?

I guess Mistral before SFT is just a base language model; the chat template is meant for the instruction-tuned version (mistral-instruct). On the other hand, the LLM leaderboard does not...

Check the path. It looks like there is no space after the no_repeat_ngram_size argument in your bash script.

I also found that tensor parallelism (TP) across 4 GPUs is much slower than TP across 2 GPUs, while the latter is still a bit faster than pipeline parallelism (PP) across 2 GPUs.