
ms-swift: Use PEFT or Full-parameter to finetune 250+ LLMs or 25+ MLLMs

Results 206 swift issues

**Describe the bug** How much VRAM is needed to finetune the 7b VL model? ```bash # Experimental Environment: A100 # GPU Memory Requirement: 80GB # Runtime: 2.5 hours CUDA_VISIBLE_DEVICES=0 \...

**Describe the feature** Please describe the feature requested here **Paste any useful information** Paste any useful information, including papers, github links, etc. **Additional context** Add any other context or information...

**Describe the bug** I used this script for SFT. The model loads correctly, but loading the data fails with “TypeError: Value.__init__() missing 1 required positional argument: 'dtype'”. The traceback is: sft_main() File "/data/anaconda3/envs/cuda12.1/lib/python3.10/site-packages/swift/utils/run_utils.py", line 31, in x_main result = llm_x(args, **kwargs) File "/data/anaconda3/envs/cuda12.1/lib/python3.10/site-packages/swift/llm/sft.py",...

**Describe the bug** What the bug is, and how to reproduce, better with screenshots. Quantizing the fine-tuned llama3-8B runs out of memory (OOM) even with 4×24 GB of VRAM? I have already reduced quant_n_samples and quant_seqlen to 32/128. CUDA_VISIBLE_DEVICES=0,1,2,3 swift export \ --model_type llama3-8b-instruct \ --ckpt_dir /home/greatwall/app/edison/output/llama3-8b-instruct/v2-20240427-073919/checkpoint-438-merged \ --quant_bits 4 \ --quant_method awq \...

question

# PR type - [ ] Bug Fix - [ ] New Feature - [ ] Document Updates - [x] More Models or Datasets Support # PR information Write the...

`import os os.environ['CUDA_VISIBLE_DEVICES'] = '3' from modelscope import Model, AutoModelForSequenceClassification, AutoTokenizer, MsDataset from swift import Swift, LoRAConfig, AdapterConfig, Trainer, TrainingArguments, PromptConfig import torch from transformers import default_data_collator model = Model.from_pretrained('/tf/model/Llama-2-7b-chat-ms/',...

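On Windows 11, the command line reports the following error: 'CUDA_VISIBLE_DEVICES' is not recognized as an internal or external command, operable program or batch file. Suggestion: add an operating-system check at line 251 of swift\swift\ui\llm_train\llm_train.py when setting CUDA_VISIBLE_DEVICES.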
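The operating-system check suggested in this issue could look roughly like the sketch below. It is not the actual code in llm_train.py: the helper names `with_visible_gpus` and `child_env` are hypothetical, and the command string is only a placeholder. The first helper picks the shell syntax per platform, since cmd.exe does not accept the POSIX `VAR=value command` prefix. The second avoids shell syntax entirely by passing the variable through the child process environment.

```python
import os
import platform
import subprocess
import sys
from typing import Dict, Optional


def with_visible_gpus(command: str, gpu_ids: str, system: Optional[str] = None) -> str:
    """Prefix a shell command so CUDA_VISIBLE_DEVICES is set on the given OS.

    Hypothetical helper for illustration; `system` defaults to the current platform.
    """
    system = system or platform.system()
    if system == "Windows":
        # cmd.exe has no VAR=value prefix; use `set` and chain with &&.
        # The quotes keep a trailing space out of the variable's value.
        return f'set "CUDA_VISIBLE_DEVICES={gpu_ids}" && {command}'
    # POSIX shells accept a VAR=value prefix that applies to one command.
    return f"CUDA_VISIBLE_DEVICES={gpu_ids} {command}"


def child_env(gpu_ids: str) -> Dict[str, str]:
    """Alternative: copy the current environment and set the variable in it."""
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = gpu_ids
    return env


if __name__ == "__main__":
    # Placeholder command; substitute the real training command line.
    print(with_visible_gpus("swift sft", "0", system="Windows"))
    print(with_visible_gpus("swift sft", "0", system="Linux"))

    # The env-based variant behaves the same on every platform.
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
        env=child_env("0"),
        capture_output=True,
        text=True,
        check=True,
    )
    print(result.stdout.strip())
```

Passing `env=` to `subprocess` is usually the simpler choice when the caller launches the process itself. The string-prefix variant only matters when a command line has to be shown to the user or handed to a shell.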

# PR type - [x] Bug Fix - [x] New Feature - [x] Document Updates - [ ] More Models or Datasets Support # PR information New quantization algorithms: -...

# PR type - [ ] Bug Fix - [Y ] New Feature - [ ] Document Updates - [ ] More Models or Datasets Support # PR information 1,DeepseekVL...