Guofeng Yi comments

Results: 51 comments of Guofeng Yi

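Hello, I'm not sure of the specific details myself and would need to ask colleagues on the pretraining team, but the paper's section on ICL is described fairly clearly. My understanding is shown in the figure below: ![image](https://github.com/01-ai/Yi/assets/66633207/9de1c5c9-928c-4ed9-8847-6195e108f9a4) The conclusion is that Yi-34B performs well on ICL, and that scaling the model up further would let it infer more complex functions through ICL. My understanding may also be wrong; if you have more questions, please raise them and I can ask on your behalf.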

How exactly is ICL ability evaluated in the paper?

Sorry, not at the moment.

alpha and dropout cannot be set when fine-tuning the model with LoRA, which affects training results

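Thanks for pointing this out. We will consider adding alpha and dropout hyperparameters to the fine-tuning script. In the meantime, we recommend using one of the existing frameworks that support fine-tuning Yi models ([llama-factory](https://github.com/hiyouga/LLaMA-Factory), [firefly](https://github.com/yangjianxin1/Firefly), [swift](https://github.com/modelscope/swift)); these frameworks offer more features than our SFT code.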

poor text handling

The current version of the model does not include enhancements for OCR capabilities.

I'd like to ask: the chat model currently supports a default length of 4k tokens. How can I enable extrapolation to get a longer token length?

Taking Dynamic-NTK extrapolation as an example, you can set "rope_scaling": {"type": "dynamic", "factor": 4.0} in the config.json file. See the code here: https://github.com/huggingface/transformers/blob/b382a09e28c7e59129246ccdf4b00f2cac4547a4/src/transformers/models/llama/modeling_llama.py#L293 . You can also test the result with [LEval](https://github.com/OpenLMLab/LEval).
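
For anyone who prefers to script the config.json change rather than edit the file by hand, here is a minimal sketch using only the Python standard library. The helper name `enable_dynamic_ntk` and the example path are hypothetical, not part of the Yi repository or of transformers; the only thing taken from the reply above is the `rope_scaling` entry itself. The factor is written as a float, on the assumption that the config validation in the linked transformers version expects one.

```python
import json
from pathlib import Path


def enable_dynamic_ntk(config_path, factor=4.0):
    """Write a Dynamic-NTK rope_scaling entry into a model's config.json.

    Hypothetical helper for illustration: it only edits the JSON file and
    does not load or validate the model.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Same setting as in the reply above; every other key is left untouched.
    config["rope_scaling"] = {"type": "dynamic", "factor": float(factor)}

    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return config


# Example usage (the path is a placeholder for your local model directory):
# enable_dynamic_ntk("path/to/model/config.json", factor=4.0)
```

After the file is updated, reloading the model should pick up the new setting; whether quality holds at the longer length is worth checking with a long-context benchmark such as LEval.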

Vocabulary size and embedding layer size do not match?

Training of Yi-VL models is supported by the SWIFT framework from the ModelScope community.

Unexpected output when playing with Yi-34B on huggingface chat

supporting local deployment of OpenAI-Compatible Server