ColossalAI

Can the mistral model be supported?

Open cy565025164 opened this issue 1 year ago • 7 comments

Describe the feature

You are using a model of type mistral to instantiate a model of type llama. This is not supported for all configurations of models and can yield errors

cy565025164 avatar Nov 25 '23 12:11 cy565025164

Hi, the mistral model is already supported. You can follow this PR: https://github.com/hpcaitech/ColossalAI/pull/5103

flybird11111 avatar Nov 26 '23 07:11 flybird11111

Hi, the mistral model is already supported. You can follow this PR: #5103

My command is as follows:

torchrun --standalone --nproc_per_node=8 train_sft.py \
    --pretrain ./zephyr-7b-beta \
    --tokenizer ./zephyr-7b-beta \
    --model 'llama' \
    --strategy colossalai_zero2_cpu \
    --save_path zephyr-7b-beta-sft \
    --dataset train_sft.json \
    --batch_size 1 \
    --accumulation_steps 8 \
    --lr 2e-5 \
    --max_len 4096 \
    --max_epochs 3 \
    --grad_checkpoint

You are using a model of type mistral to instantiate a model of type llama. This is not supported for all configurations of models and can yield errors

cy565025164 avatar Nov 26 '23 09:11 cy565025164
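
The warning quoted above is the one transformers prints when the `model_type` recorded in a checkpoint's config.json differs from the model class being instantiated, which is what happens here when `--model 'llama'` is pointed at a checkpoint whose config declares `mistral`. The following is a minimal sketch of checking this before launching training, assuming a Hugging Face style checkpoint directory that contains a config.json. The helper names `read_model_type` and `check_model_type` are hypothetical and are not part of ColossalAI or coati.

```python
import json
from pathlib import Path


def read_model_type(checkpoint_dir):
    """Return the `model_type` field from <checkpoint_dir>/config.json, or None if absent."""
    config_path = Path(checkpoint_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        return json.load(f).get("model_type")


def check_model_type(checkpoint_dir, requested):
    """Compare the checkpoint's declared model_type with the requested one.

    Returns a (matches, found) tuple so the caller can decide how to report a mismatch.
    """
    found = read_model_type(checkpoint_dir)
    return found == requested, found


if __name__ == "__main__":
    # Path and model name taken from the command in this thread.
    matches, found = check_model_type("./zephyr-7b-beta", "llama")
    if not matches:
        print(f"checkpoint declares model_type={found!r}, but 'llama' was requested")
```

A check like this only detects the mismatch; it does not make the training script support the architecture.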

Sorry, shardformer already supports mistral, but coati does not support it yet. I will pass this on to the relevant colleagues.

flybird11111 avatar Nov 26 '23 09:11 flybird11111

@flybird11111

Sorry, shardformer already supports mistral, but coati does not support it yet. I will pass this on to the relevant colleagues.

When will shardformer support QWEN?

imgaojun avatar Nov 28 '23 13:11 imgaojun

@flybird11111

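Sorry, shardformer already supports mistral, but coati does not support it yet. I will pass this on to the relevant colleagues.

When will shardformer support QWEN?

Hi, qianwen will be supported as soon as possible. Chat will also support mistral as soon as possible.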

flybird11111 avatar Dec 11 '23 06:12 flybird11111