2 comments by YouYouCoding
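@Yimi81 Hi, thanks for the explanation. One question: for the Yi-200K series, after increasing the base, what sequence length was actually used for continued pretraining on long data? The technical report says: “To adapt the base model to longer context, we continue pretrain the model on 10B tokens from our pretraining data mixture with slightly upsampled long sequences, mostly from...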
@Yimi81 Thanks for the reply!