
ms-swift: Use PEFT or Full-parameter to finetune 250+ LLMs or 25+ MLLMs

Results: 206 issues, sorted by most recently updated

Hello, we want to fine-tune the model with our own dataset (images + text) on 2 RTX 4090s. The following setting leads to the error below; does that mean the...

bug

Model: Qwen1.5-110B-Chat-AWQ. Command: CUDA_VISIBLE_DEVICES=1,6 swift infer --model_type qwen1half-110b-chat-awq --infer_backend vllm --max_model_len 8192 --model_id_or_path /share/models/Qwen1.5-110B-Chat-AWQ/ Error: [INFO:swift] Start time of running main: 2024-04-29 14:33:02.893537 [INFO:swift] ckpt_dir: None [INFO:swift] Due to `ckpt_dir`...

question

**Describe the bug** CUDA_VISIBLE_DEVICES=0,1,2,3 swift export \ --ckpt_dir finetune_output/checkpoint-478 --load_dataset_config true \ --quant_method awq --quant_bits 4 \ --merge_lora true \ Traceback (most recent call last): File "/home/jianc/miniconda3/envs/benchmark-llm/lib/python3.10/site-packages/swift/cli/export.py", line 5, in...

bug

**Describe the feature** After fine-tuning, the model is unrecognizable: Hugging Face can no longer load it, and acceleration frameworks such as vllm cannot use it either. **Paste any useful information** Fine-tuning produced an adapter_model.safetensors file; I forcibly renamed it to model.safetensors and copied it over the original base model's folder, but when loaded, inference is completely wrong. **Additional context** Looking for a way to convert it back.

question
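Renaming adapter_model.safetensors cannot work, because that file holds only the LoRA delta weights, not a full model. A minimal sketch of the usual approach, reusing the `swift export` flags and checkpoint path shown in the export issue above (the path is illustrative), is to merge the adapter into the base model first and then load the merged directory:

```
# A minimal sketch (checkpoint path is illustrative): merge the LoRA adapter
# into the base weights instead of renaming adapter_model.safetensors.
CUDA_VISIBLE_DEVICES=0 swift export \
    --ckpt_dir finetune_output/checkpoint-478 \
    --merge_lora true
# The merged directory that swift export writes out can then be loaded with
# transformers or served with vllm like any ordinary checkpoint.
```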

from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration import torch from PIL import Image import requests from modelscope import snapshot_download from transformers import AutoModelForCausalLM, AutoTokenizer from peft import AutoPeftModelForCausalLM device_count = torch.cuda.device_count() if...

question

# PR type - [ ] Bug Fix - [x] New Feature - [ ] Document Updates - [ ] More Models or Datasets Support

In the Qwen1.5 fine-tuning script I used the `--dataset new_data.jsonl` option and training succeeded, but the documentation also mentions `--custom_train_dataset_path`. What is the difference between the two? Is passing my own generated dataset via `--dataset new_data.jsonl` actually wrong? If so, why did training still succeed (the model clearly learned the knowledge in the training data)? ``` # Experimental environment: A100 # 2*40GB GPU memory CUDA_VISIBLE_DEVICES=0 \ swift sft \ --model_type qwen1half-32b \ --sft_type lora \ --tuner_backend peft...
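For reference, a minimal sketch of the same command passing the local JSONL file through the dedicated flag mentioned in the question (flag names are taken from the question itself; the file path is illustrative):

```
# A minimal sketch, assuming new_data.jsonl sits in the working directory.
# Same fine-tuning setup as above, but the local file is passed via
# --custom_train_dataset_path instead of --dataset.
CUDA_VISIBLE_DEVICES=0 \
swift sft \
    --model_type qwen1half-32b \
    --sft_type lora \
    --custom_train_dataset_path new_data.jsonl
```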

Currently the inference service can only be accessed from the local machine by default. Please add an option to listen on other interfaces so that other machines can reach it. One can add the startup parameter `--host 0.0.0.0` manually, or edit swift/ui/llm_infer/llm_infer.py and change: params += f'--port "{deploy_args.port}" ' to: params += f'--port "{deploy_args.port}" --host "0.0.0.0" '
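A minimal sketch of the manual workaround described above, assuming the deploy command forwards a `--host` argument as the request suggests (model name and backend are illustrative, taken from the deployment report below):

```
# A minimal sketch: bind the server to 0.0.0.0 so other machines on the
# network can reach it, assuming --host is accepted as suggested above.
CUDA_VISIBLE_DEVICES=0 swift deploy \
    --model_type qwen1half-7b-chat \
    --infer_backend vllm \
    --host 0.0.0.0
```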

**Describe the bug** What the bug is, and how to reproduce, better with screenshots. After upgrading vllm from 0.3.1 to 0.4.0 and deploying the model with swift, with the same model and the same prompt, request latency increases noticeably (more than 2x); no server deployment parameters were changed. CUDA_VISIBLE_DEVICES=1 swift deploy --model_type qwen1half-7b-chat \ --model_cache_dir /data/ssd/LLM_models/qwen/Qwen1.5-7B-Chat \ --infer_backend vllm \ --use_flash_attn true \...

bug