swift issues

Results 206 swift issues

VRAM requirement for full sft deepseek VL 7B

**Describe the bug** How much VRAM is needed to finetune the 7b VL model? ```bash # Experimental Environment: A100 # GPU Memory Requirement: 80GB # Runtime: 2.5 hours CUDA_VISIBLE_DEVICES=0 \...

SinanAkkoyun

Support QLoRA with EETQ quantization

**Describe the feature** Please describe the feature requested here **Paste any useful information** Paste any useful information, including papers, github links, etc. **Additional context** Add any other context or information...

thincal

Problem when fine-tuning with the examples/pytorch/llm/scripts/qwen1half_32b_chat/lora_mp/sft.sh script

**Describe the bug** I ran SFT with this exact script. The model loads correctly, but loading the data fails with “TypeError: Value.__init__() missing 1 required positional argument: 'dtype'”. The traceback is: sft_main() File "/data/anaconda3/envs/cuda12.1/lib/python3.10/site-packages/swift/utils/run_utils.py", line 31, in x_main result = llm_x(args, **kwargs) File "/data/anaconda3/envs/cuda12.1/lib/python3.10/site-packages/swift/llm/sft.py",...

yezhongxiuchan

llama3-8b-instruct AWQ quantization OOM

**Describe the bug** What the bug is, and how to reproduce, better with screenshots. Quantizing the fine-tuned llama3-8B runs out of memory even with 4×24 GB of VRAM? I have already reduced quant_n_samples and quant_seqlen to 32/128. CUDA_VISIBLE_DEVICES=0,1,2,3 swift export \ --model_type llama3-8b-instruct \ --ckpt_dir /home/greatwall/app/edison/output/llama3-8b-instruct/v2-20240427-073919/checkpoint-438-merged \ --quant_bits 4 \ --quant_method awq \...

Edisonwei54

question

[WIP]Support new datasets

# PR type - [ ] Bug Fix - [ ] New Feature - [ ] Document Updates - [x] More Models or Datasets Support # PR information Write the...

tastelikefeet

Do swift's LoRA fine-tuning and adapter fine-tuning currently use the same token embedding method?

PowerDispatch

Error when calling p-tuning

`import os os.environ['CUDA_VISIBLE_DEVICES'] = '3' from modelscope import Model, AutoModelForSequenceClassification, AutoTokenizer, MsDataset from swift import Swift, LoRAConfig, AdapterConfig, Trainer, TrainingArguments, PromptConfig import torch from transformers import default_data_collator model = Model.from_pretrained('/tf/model/Llama-2-7b-chat-ms/',...

linguoqi

Problem with the CUDA_VISIBLE_DEVICES command not being recognized on Windows 11

[WIP] Support Hqq and Eetq quantization

DeepseekVL add local_repo_path argument AND infer support delete truncation_strategy
