Guofeng Yi comments

Results: 51 comments of Guofeng Yi

Hi, I'd like to ask: what if I want to do incremental pre-training?

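1. The official repository currently supports only SFT, but many excellent [fine-tuning frameworks](https://github.com/hiyouga/LLaMA-Factory) support incremental pre-training for Yi.
2. Incremental pre-training uses unsupervised data, which can be fed in directly. See how the data is formatted in the fine-tuning framework recommended above.
3. As above.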
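To illustrate what "unsupervised data that can be fed in directly" might look like, here is a minimal sketch that turns raw documents into one JSON object per line with a single `text` field. The `text` field name, the helper names, and the output file name are assumptions for illustration; check the data documentation of whichever framework you use for the exact schema it expects.

```python
import json
from pathlib import Path


def to_pretrain_records(documents):
    """Turn raw strings into {"text": ...} records, skipping empty ones.

    The "text" key is an assumed field name, not a confirmed schema.
    """
    records = []
    for doc in documents:
        doc = doc.strip()
        if doc:
            records.append({"text": doc})
    return records


def write_jsonl(records, out_path):
    """Write one JSON object per line (UTF-8) and return the record count."""
    out_path = Path(out_path)
    with out_path.open("w", encoding="utf-8") as f:
        for record in records:
            # ensure_ascii=False keeps Chinese text readable in the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(records)
```

Usage would be along the lines of `write_jsonl(to_pretrain_records(my_documents), "domain_corpus.jsonl")`, where `my_documents` is a list of plain-text strings with no prompt or response structure.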

[feat] Add finetune code for Yi-VL model

@minlik Thank you for your PR, I will test it

[feat] Add finetune code for Yi-VL model

Can you provide your environment? Both the official LLaVA configuration and the requirements.txt you provided reported errors. It would be great if you could provide a step in...

[feat] Add finetune code for Yi-VL model

my sh script:
```
PYTHONPATH=../../:$PYTHONPATH \
deepspeed --include localhost:0,1 --master_port 1234 llava/train/train_mem.py \
    --deepspeed ./scripts/zero2.json \
    --lora_enable True \
    --model_name_or_path /ML-A100/public/tmp/pretrain_weights/Yi-VL-6B \
    --data_path /ML-A100/public/tmp/yiguofeng/contribute/Yi/data.json \
    --image_folder /ML-A100/public/tmp/yiguofeng/contribute/Yi/VL/images \
    --vision_tower /ML-A100/public/tmp/pretrain_weights/Yi-VL-34B/vit/clip-vit-H-14-laion2B-s32B-b79K-yi-vl-34B-448...
```

Unreal 5.3 support

Does anyone know how to establish a connection between UMG and JS? #342 I hope the design of the UMG interface is completed in the blueprint, but the corresponding logic...

Title: GPU out-of-memory when training Yi-34B with 32K and 200K context using DeepSpeed

We trained Yi-34B-200K with Megatron on 6 nodes × 8 GPUs per node (A800). We will not provide ready-made code.

Title: GPU out-of-memory when training Yi-34B with 32K and 200K context using DeepSpeed

You could also try DeepSpeed ZeRO-3, which partitions the model parameters across the GPUs, and see whether that gets it running.
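As a rough sketch of the ZeRO-3 suggestion, the snippet below builds a DeepSpeed config with `zero_optimization.stage` set to 3 and optional CPU offload. The batch size, accumulation steps, bf16 choice, and the helper name `build_zero3_config` are illustrative assumptions, not the settings used for Yi. Note that ZeRO-3 shards parameters, gradients, and optimizer states; it does not by itself shrink activation memory, which grows with context length.

```python
import json


def build_zero3_config(offload_to_cpu=False):
    """Return a minimal DeepSpeed ZeRO stage 3 config as a dict.

    All numeric values here are placeholders to be tuned per setup.
    """
    zero = {
        "stage": 3,  # partition params, grads and optimizer states
        "overlap_comm": True,
        "contiguous_gradients": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    }
    if offload_to_cpu:
        # Optional: trade speed for GPU memory by offloading to host RAM.
        zero["offload_optimizer"] = {"device": "cpu", "pin_memory": True}
        zero["offload_param"] = {"device": "cpu", "pin_memory": True}
    return {
        "bf16": {"enabled": True},
        "zero_optimization": zero,
        "train_micro_batch_size_per_gpu": 1,
        "gradient_accumulation_steps": 1,
    }


if __name__ == "__main__":
    # Print JSON that could be saved as e.g. zero3.json (hypothetical name).
    print(json.dumps(build_zero3_config(offload_to_cpu=True), indent=2))
```

The printed JSON can be saved to a file and passed to the launcher through its `--deepspeed` argument in place of a ZeRO-2 config.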

Title: GPU out-of-memory when training Yi-34B with 32K and 200K context using DeepSpeed

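You can check our technical report to see whether it has any useful details. The minimum requirement is 6 nodes × 8 GPUs per node (A800), 3840 GB of GPU memory in total. The specific parameter settings are confidential and will not be provided.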
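The 3840 GB figure is consistent with 48 GPUs at 80 GB each, as the short calculation below shows. The 80 GB per-GPU value is inferred from that total rather than stated in the reply. The second function is only a common rule of thumb (about 16 bytes per parameter for weights, gradients, and Adam states under mixed precision) applied to a nominal 34B parameters; it ignores activations, which dominate at long context, and is not a figure from the Yi team.

```python
def cluster_memory_gb(nodes, gpus_per_node, gb_per_gpu):
    """Total GPU memory across the cluster, in GB."""
    return nodes * gpus_per_node * gb_per_gpu


def model_state_gb(num_params, bytes_per_param=16):
    """Rule-of-thumb memory for weights + grads + Adam states, in GB.

    bytes_per_param=16 is an assumption (mixed-precision Adam);
    activation memory is not included.
    """
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # 80 GB per GPU is an assumption inferred from the quoted total.
    print(cluster_memory_gb(nodes=6, gpus_per_node=8, gb_per_gpu=80))
    # 34e9 is a nominal parameter count for a "34B" model.
    print(model_state_gb(34e9))
```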

How should hyperparameters be set so that domain-specific incremental pre-training gives better results?

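The technical report is still being finalized and will be published later. As for how to set hyperparameters so that domain-specific incremental pre-training works better: first, you should check whether the model has actually learned your domain knowledge after incremental pre-training, and whether it has forgotten general knowledge. Public test sets reflect only one aspect; what matters more is real-world testing, and you can also build a test set for your own domain to evaluate the model after each training run. Second, there are many practical write-ups online on hyperparameter settings for incremental pre-training that you can [refer to](https://www.cnblogs.com/Revelation/p/17787079.html). In my view, what matters most is actually the quality of your data.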
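One way to track both concerns raised above (domain learning and general forgetting) is to score each checkpoint on a domain test set and a general test set and compare. The sketch below uses perplexity as the metric, which is an assumption chosen for illustration; the reply does not prescribe a metric. It expects per-token negative log-likelihoods that you have already computed with your own model, and the function and set names are hypothetical.

```python
import math


def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (natural log)."""
    if not token_nlls:
        raise ValueError("need at least one token NLL")
    return math.exp(sum(token_nlls) / len(token_nlls))


def compare_checkpoints(base, tuned):
    """Compare two checkpoints on the same named test sets.

    base / tuned map a test-set name (e.g. "domain", "general") to a
    list of per-token NLLs. A negative relative_change means the tuned
    model improved on that set; a positive one on the general set is a
    sign of forgetting.
    """
    report = {}
    for name in base:
        before = perplexity(base[name])
        after = perplexity(tuned[name])
        report[name] = {
            "before": before,
            "after": after,
            "relative_change": (after - before) / before,
        }
    return report
```

Running this after every training run, with the same held-out sets each time, gives a simple trend line to weigh alongside hands-on testing of the model.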

VL 6B web_demo gradio api

web_demo is for display purposes only and does not support calling Yi-VL-6B through the Gradio API.