
ms-swift3 Suggestion Box

Open: Jintao-Huang opened this issue on Oct 10 '24 · 44 comments

  • [x] De-emphasize the concept of model_type, supporting automatic detection of model_type using only <model_id_or_path> (config.json).
  • [x] The template module and dataset module embrace the messages dataset format.
    • [x] Remove the concept of generation-template. Use the use_generate_template parameter to select the template the base model needs, so that CPT is supported for all multimodal models.
    • [x] Make the preprocessor module smarter. Introduce AutoPreprocessor.
  • [x] Support customization of training functionalities with a plugin design, such as loss_type, loss_scale, trainer, optimizer, callback, metric.
  • [x] Enhance code readability with a hierarchical design, allowing users with different needs to utilize and redevelop ms-swift through code, command line, and web UI.
  • [x] Refactor documentation and examples.
  • [x] Unified inference and deployment interface, using a class-based design to support vllm/lmdeploy/pt/client.
    • [x] pt supports batch inference.
    • [x] pt supports multi-GPU/DeepSpeed.
    • [x] Improve the multi-LoRA inference experience.
  • [x] Optimize the encode/post_encode training mechanism for multimodal models.
  • [x] Improve training robustness during large-scale pre-training.
  • [x] Streamline continued fine-tuning, inference, quantization, and deployment of models that were full-parameter fine-tuned with other training frameworks.
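
For reference, the "messages" dataset format mentioned above is the widely used OpenAI-style chat schema; a minimal sample row, serialized to JSON Lines, might look like the sketch below (the exact field names ms-swift 3.0 expects should be checked against its documentation):

```python
import json

# One training sample in the messages format: a list of role/content turns.
sample = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
    ]
}

# Datasets in this format are usually stored as JSON Lines: one sample per line.
line = json.dumps(sample, ensure_ascii=False)
print(line)
```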

Jintao-Huang · Oct 10 '24

Thank you very much for your team's hard work! Regarding the third point, have you considered adding a channel-loss-style monitoring feature, i.e., tracking the loss trend separately for each downstream task's dataset? I see that version 2.5 already supports PT for MLLMs, and I think this feature is quite important for MLLM post pre-training. Hope you'll adopt it :>
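
For readers unfamiliar with the term, "channel loss" here simply means tracking the loss per data source; the bookkeeping could be sketched like this (plain Python, not ms-swift code — the channel tags are illustrative):

```python
from collections import defaultdict

def channel_loss(losses, channels):
    """Average per-sample losses grouped by the channel (dataset) each sample came from."""
    totals, counts = defaultdict(float), defaultdict(int)
    for loss, channel in zip(losses, channels):
        totals[channel] += loss
        counts[channel] += 1
    return {channel: totals[channel] / counts[channel] for channel in totals}

# Per-sample losses tagged with the downstream dataset they belong to.
print(channel_loss([1.0, 2.0, 0.5], ["ocr", "ocr", "vqa"]))  # → {'ocr': 1.5, 'vqa': 0.5}
```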

bonre · Oct 10 '24

Hello, thank you very much for your open-source work! Are there any plans to support RAG in the future?

EdisonLeeeee · Oct 10 '24

Hello, thank you very much for your open-source work! Are there any plans to support RAG in the future?

Yes, but it probably won't make it into 3.0; it will most likely land around 3.1/3.2.

Jintao-Huang · Oct 10 '24

Hello, could you add an interface for custom evaluation metrics?

Betty-J · Oct 10 '24

Hello, could you add an interface for custom evaluation metrics?

Yes, this is an important feature.
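
To illustrate what such an interface could look like, here is a generic plugin-style metric registry (all names here are hypothetical sketches, not ms-swift's actual API):

```python
# Hypothetical plugin-style registry: user code registers a metric by name,
# and the evaluation loop looks it up from the same dict.
METRICS = {}

def register_metric(name):
    def decorator(fn):
        METRICS[name] = fn
        return fn
    return decorator

@register_metric("exact_match")
def exact_match(predictions, references):
    """Fraction of predictions that match their reference exactly."""
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)

print(METRICS["exact_match"](["a", "b"], ["a", "c"]))  # → 0.5
```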

Jintao-Huang · Oct 10 '24

Will 3.0 include a complete demo for multi-card NPU setups?

liujiachang · Oct 10 '24

Hello, will the training pipeline support TP, PP, and so on?

firefighter-eric · Oct 10 '24

Will 3.0 include a complete demo for multi-card NPU setups?

That depends on whether we can borrow the hardware 😊

Jintao-Huang · Oct 11 '24

Allow changing dataset column names from Hugging Face/ModelScope to a swift-supported format. Currently you have to download the dataset from Hugging Face, change the column names, and re-upload it in order to use it with swift.
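
As a local workaround until that is supported, rows can be remapped on the fly before the data reaches swift; a minimal sketch (the column names below are just examples, not swift's required schema):

```python
import json

# Example mapping from a dataset's column names to the ones the trainer expects.
COLUMN_MAP = {"instruction": "query", "output": "response"}

def remap_row(row, column_map):
    """Rename the keys of one dataset row per column_map; other keys pass through unchanged."""
    return {column_map.get(key, key): value for key, value in row.items()}

row = {"instruction": "Translate to French: hello", "output": "bonjour"}
print(json.dumps(remap_row(row, COLUMN_MAP), ensure_ascii=False))
```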

Aunali321 · Oct 15 '24

Also, swift's dataset preparation is very strict despite using --check_dataset_strategy none. It logs cryptic errors that do not explain what went wrong.

For example, it will not accept a dataset whose last message is a User message. Another example: it does not accept a dataset with repeating roles, such as Assistant -> Assistant -> User. It also complained about KeyError: 'conversations' for a dataset that did not have a conversations column at all.

In a large dataset it is impossible to check and fix every row. There should be an option to continue despite these issues.
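
The kind of lenient pre-filtering being requested could be sketched as follows (assuming the messages format; the exact rules swift enforces may differ) — invalid rows are dropped instead of aborting the run:

```python
def is_valid_conversation(messages):
    """Accept only conversations with alternating roles that end on an assistant turn."""
    if not messages or messages[-1]["role"] != "assistant":
        return False  # e.g. a trailing user message with no reply
    for prev, cur in zip(messages, messages[1:]):
        if prev["role"] == cur["role"]:
            return False  # repeated roles such as assistant -> assistant
    return True

rows = [
    [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}],
    [{"role": "user", "content": "a dangling question?"}],  # ends with user: dropped
]
kept = [r for r in rows if is_valid_conversation(r)]
print(len(kept))  # → 1
```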

Aunali321 · Oct 15 '24

Also, swift's dataset preparation is very strict despite using --check_dataset_strategy none. It logs cryptic errors that do not explain what went wrong.

For example, it will not accept a dataset whose last message is a User message. Another example: it does not accept a dataset with repeating roles, such as Assistant -> Assistant -> User. It also complained about KeyError: 'conversations' for a dataset that did not have a conversations column at all.

In a large dataset it is impossible to check and fix every row. There should be an option to continue despite these issues.

Great suggestion, thank you!

Jintao-Huang · Oct 15 '24

Hello, will the training pipeline support TP, PP, and so on?

Megatron-related optimizations will come after the ms-swift 3.0 refactor, in roughly 1-2 months.

Jintao-Huang · Oct 15 '24

Thank you very much for your team's hard work! Regarding the third point, have you considered adding a channel-loss-style monitoring feature, i.e., tracking the loss trend separately for each downstream task's dataset?

TengboWang · Oct 18 '24

Thank you very much for your team's hard work! Regarding the third point, have you considered adding a channel-loss-style monitoring feature, i.e., tracking the loss trend separately for each downstream task's dataset?

Sure, we will add this; it is a very common request. Thanks to you both.

Jintao-Huang · Oct 18 '24

Please include code for end-to-end fine-tuning / pre-training of audio language models (e.g., Llama 3.1 Omni) in your existing pipeline. If possible, please also integrate the moshi audio language model.

satheeshKOLA532 · Oct 19 '24

Channel loss: related issue: https://github.com/modelscope/ms-swift/issues/2220

Jintao-Huang · Oct 23 '24

Could you build an official NPU image of swift?

liujiachang · Oct 30 '24

Can multimodal models support distributing GPU memory evenly across multiple cards?

verigle · Nov 04 '24

Can multimodal models support distributing GPU memory evenly across multiple cards?

DeepSpeed ZeRO-2/ZeRO-3 distributes it evenly.

Jintao-Huang · Nov 04 '24

After fine-tuning GOT-OCR2.0, using the saved fine-tuned model raises a model_type error. How can this be resolved?

Charimanhua · Nov 06 '24

After fine-tuning GOT-OCR2.0, using the saved fine-tuned model raises a model_type error. How can this be resolved?

You need to merge the LoRA first; only then will a config.json file be present.

Jintao-Huang · Nov 06 '24

After fine-tuning GOT-OCR2.0, using the saved fine-tuned model raises a model_type error. How can this be resolved?

You need to merge the LoRA first; only then will a config.json file be present.

Solved, thanks! cd into the fine-tuned model directory and run: swift merge-lora --ckpt_dir xxx

Charimanhua · Nov 06 '24

Hello, when fine-tuning a multimodal LLM with multi-turn dialogues, the .jsonl file saved after running infer only contains the response of the final turn, while the history field holds the label (ground-truth) turns. Could infer later support saving the complete results of all turns?
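
Until that is supported, the full dialogue could be written out client-side; a sketch of that bookkeeping (the turn/query/response field names are illustrative, not a confirmed swift schema):

```python
import json

def save_full_dialogue(path, turns):
    """Write every (query, response) turn of a dialogue to a .jsonl file, one turn per line."""
    with open(path, "w", encoding="utf-8") as f:
        for i, (query, response) in enumerate(turns):
            record = {"turn": i, "query": query, "response": response}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

save_full_dialogue("dialogue.jsonl",
                   [("Describe the image.", "A cat."), ("What color is it?", "Black.")])
```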

Betty-J · Nov 11 '24

Looking forward to training and fine-tuning support for audio models like CosyVoice.

Ash-one · Nov 29 '24

Hello, will the training pipeline support TP, PP, and so on?

Megatron-related optimizations will come after the ms-swift 3.0 refactor, in roughly 1-2 months.

Any hope of supporting it this year? 😂

shiningliang · Dec 02 '24

https://github.com/modelscope/ms-swift/pull/2030/

Jintao-Huang · Dec 02 '24

The default version installed via pip is 2.6.1, and pinning a version can't find a 3.0 release either; the source master branch is also at 2.6.1. How do I install ms-swift 3.0?

Nevermore2099 · Dec 13 '24

How do I download the 3.0 version of the swift package? The version downloaded now is 2.6.1.

Fanxhion · Dec 19 '24

pip install git+https://github.com/modelscope/ms-swift.git

Jintao-Huang · Dec 19 '24

Hello, will the training pipeline support TP, PP, and so on?

Megatron-related optimizations will come after the ms-swift 3.0 refactor, in roughly 1-2 months.

Any hope of supporting it this year? 😂

@Jintao-Huang Any chance this year?

Xu-Chen · Dec 23 '24