
ms-swift: Use PEFT or Full-parameter to finetune 250+ LLMs or 25+ MLLMs

Results: 206 swift issues, sorted by recently updated

> [INFO:swift] InternVLChatModel: 25514.1861M Params (25514.1861M Trainable [100.0000%]), 402.6563M Buffers.
> [INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
> [INFO:swift] Input `exit` or `quit` to exit the...

The repo lists the LLMs, VLLM support, and so on that swift covers. Does swift also support fine-tuning CLIP/Chinese-CLIP (since these are all Alibaba projects, I assume they would be adapted)? If so, is there a corresponding md/ipynb file to reference?
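Outside of swift's built-in model list, Chinese-CLIP can in principle be fine-tuned with the same PEFT approach the repo description mentions, directly through transformers. The sketch below is not a swift recipe: the checkpoint name `OFA-Sys/chinese-clip-vit-base-patch16`, the LoRA target module names, and the toy image-text batch are assumptions for illustration only.

```python
from PIL import Image
import torch
import torch.nn.functional as F
from transformers import ChineseCLIPModel, ChineseCLIPProcessor
from peft import LoraConfig, get_peft_model

ckpt = "OFA-Sys/chinese-clip-vit-base-patch16"   # assumed checkpoint name
model = ChineseCLIPModel.from_pretrained(ckpt)
processor = ChineseCLIPProcessor.from_pretrained(ckpt)

# Attach LoRA adapters to the attention projections of both towers.
# Module names are assumptions based on the transformers implementation
# (CLIP-style ViT tower: q_proj/v_proj; BERT-style text tower: query/value).
lora_cfg = LoraConfig(r=8, lora_alpha=16,
                      target_modules=["q_proj", "v_proj", "query", "value"])
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()

# One contrastive step on a toy image-text batch (replace with a real dataloader).
images = [Image.new("RGB", (224, 224)) for _ in range(2)]
texts = ["一只猫", "一只狗"]
batch = processor(text=texts, images=images, return_tensors="pt", padding=True)

outputs = model(**batch)
logits = outputs.logits_per_image                  # [batch, batch] similarity matrix
labels = torch.arange(logits.size(0))
loss = (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2
loss.backward()
```

Whether swift exposes Chinese-CLIP as a built-in `--model_type` with a ready-made md/ipynb example is a question for the maintainers; the sketch only shows that the generic PEFT path is mechanically straightforward.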

# PR type
- [x] Bug Fix
- [ ] New Feature
- [ ] Document Updates
- [ ] More Models or Datasets Support

# PR information
fix list...

CUDA_VISIBLE_DEVICES=0 \
python3 llm_sft.py \
  --model_type qwen1half-moe-a2_7b-chat \
  --model_id_or_path /root/yovole/qwen/Qwen1.5-MoE-A2.7B-Chat \
  --sft_type lora \
  --tuner_backend swift \
  --dtype AUTO \
  --output_dir output \
  --dataset dureader-robust-zh \
  --train_dataset_sample 10000 \
  --num_train_epochs...

**Describe the feature**
When evaluating a model on an eval dataset, also report an estimate of how likely it is that the model under test has already been trained on that same dataset.

**Paste any useful information**
https://github.com/swj0419/detect-pretrain-code-contamination/

**Additional context**
This helps confirm that gains from tuning do not come from data leakage. Similar to https://github.com/abacusai/smaug, it would report a before/after leakage comparison.

enhancement
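One common approach to this kind of check is Min-K% Prob scoring. The sketch below illustrates that general idea, not the linked project's exact pipeline, and assumes a generic causal LM from transformers; the model name is only a placeholder.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-0.5B"   # placeholder model; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

@torch.no_grad()
def min_k_prob(text: str, k: float = 0.2) -> float:
    """Mean log-probability of the k% least likely tokens of `text` under the model.

    Unusually high scores compared to known-unseen text suggest the sample may
    have been part of the model's training data.
    """
    ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
    logits = model(ids).logits[:, :-1, :]              # prediction for token t+1
    targets = ids[:, 1:]
    logprobs = torch.log_softmax(logits.float(), dim=-1)
    token_lp = logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)[0]
    n = max(1, int(token_lp.numel() * k))
    lowest = torch.topk(token_lp, n, largest=False).values
    return lowest.mean().item()

# Compare scores on eval samples against scores on text the model cannot have seen.
print(min_k_prob("An evaluation sample suspected of appearing in the training data."))
```

A before/after comparison in the spirit of the smaug report would amount to running this scoring once with the base model and again with the fine-tuned checkpoint.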

nproc_per_node=4
CUDA_VISIBLE_DEVICES=0,1,2,3 \
NPROC_PER_NODE=$nproc_per_node \
swift sft \
  --model_id_or_path "AI-ModelScope/llava-v1.6-mistral-7b" \
  --template_type "llava-mistral-instruct" \
  --custom_train_dataset_path train_swift.json \
  --custom_val_dataset_path test_swift.json \
  --dataset_test_ratio "0.15" \
  --save_steps "20" \
  --lora_target_modules q_proj k_proj v_proj...

**Describe the feature**
Please describe the feature requested here.

**Paste any useful information**
Paste any useful information, including papers, github links, etc.

**Additional context**
Add any other context or information...

# PR type
- [ ] Bug Fix
- [x] New Feature
- [ ] Document Updates
- [ ] More Models or Datasets Support

# PR information ``` dataset_info =...

**Describe the bug**
Running the finetuned `internlm-xcomposer2-7b-chat` leads to an error.
``` token len:history:113, query:1706 Traceback (most recent call last): File...

**Describe the bug**
Finetuned the model `ModelType.qwen_vl_chat` with `max_length=4096`, but at inference with the checkpoint got `exceeds the model max_length: 2048`...