
ms-swift: Use PEFT or Full-parameter to finetune 250+ LLMs or 25+ MLLMs

Results: 206 swift issues, sorted by recently updated

> [INFO:swift] InternVLChatModel: 25514.1861M Params (25514.1861M Trainable [100.0000%]), 402.6563M Buffers.
> [INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
> [INFO:swift] Input `exit` or `quit` to exit the...

The repo lists the LLMs, VLLM support, and so on that swift covers. Does swift also support fine-tuning CLIP/Chinese-CLIP (since these are all Alibaba projects, I assume they would be adapted)? If so, is there a corresponding md/ipynb file to reference?
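Outside of swift's built-in model list, Chinese-CLIP can in principle be fine-tuned with the same PEFT approach the repo description mentions, directly through transformers. The sketch below is not a swift recipe: the checkpoint name `OFA-Sys/chinese-clip-vit-base-patch16`, the LoRA target module names, and the toy image-text batch are assumptions for illustration only.

```python
from PIL import Image
import torch
import torch.nn.functional as F
from transformers import ChineseCLIPModel, ChineseCLIPProcessor
from peft import LoraConfig, get_peft_model

ckpt = "OFA-Sys/chinese-clip-vit-base-patch16"   # assumed checkpoint name
model = ChineseCLIPModel.from_pretrained(ckpt)
processor = ChineseCLIPProcessor.from_pretrained(ckpt)

# Attach LoRA adapters to the attention projections of both towers.
# Module names are assumptions based on the transformers implementation
# (CLIP-style ViT tower: q_proj/v_proj; BERT-style text tower: query/value).
lora_cfg = LoraConfig(r=8, lora_alpha=16,
                      target_modules=["q_proj", "v_proj", "query", "value"])
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()

# One contrastive step on a toy image-text batch (replace with a real dataloader).
images = [Image.new("RGB", (224, 224)) for _ in range(2)]
texts = ["一只猫", "一只狗"]
batch = processor(text=texts, images=images, return_tensors="pt", padding=True)

outputs = model(**batch)
logits = outputs.logits_per_image                  # [batch, batch] similarity matrix
labels = torch.arange(logits.size(0))
loss = (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2
loss.backward()
```

Whether swift exposes Chinese-CLIP as a built-in `--model_type` with a ready-made md/ipynb example is a question for the maintainers; the sketch only shows that the generic PEFT path is mechanically straightforward.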

# PR type
- [x] Bug Fix
- [ ] New Feature
- [ ] Document Updates
- [ ] More Models or Datasets Support

# PR information
fix list...

CUDA_VISIBLE_DEVICES=0 \
python3 llm_sft.py \
  --model_type qwen1half-moe-a2_7b-chat \
  --model_id_or_path /root/yovole/qwen/Qwen1.5-MoE-A2.7B-Chat \
  --sft_type lora \
  --tuner_backend swift \
  --dtype AUTO \
  --output_dir output \
  --dataset dureader-robust-zh \
  --train_dataset_sample 10000 \
  --num_train_epochs...

**Describe the feature**
When evaluating a model on an eval dataset, also report an estimate of how likely it is that the model under test has already been trained on that same dataset.

**Paste any useful information**
https://github.com/swj0419/detect-pretrain-code-contamination/

**Additional context**
This helps confirm that gains from tuning do not come from data leakage. Similar to https://github.com/abacusai/smaug, it would report a before/after leakage comparison.

enhancement
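One common approach to this kind of check is Min-K% Prob scoring. The sketch below illustrates that general idea, not the linked project's exact pipeline, and assumes a generic causal LM from transformers; the model name is only a placeholder.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-0.5B"   # placeholder model; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

@torch.no_grad()
def min_k_prob(text: str, k: float = 0.2) -> float:
    """Mean log-probability of the k% least likely tokens of `text` under the model.

    Unusually high scores compared to known-unseen text suggest the sample may
    have been part of the model's training data.
    """
    ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
    logits = model(ids).logits[:, :-1, :]              # prediction for token t+1
    targets = ids[:, 1:]
    logprobs = torch.log_softmax(logits.float(), dim=-1)
    token_lp = logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)[0]
    n = max(1, int(token_lp.numel() * k))
    lowest = torch.topk(token_lp, n, largest=False).values
    return lowest.mean().item()

# Compare scores on eval samples against scores on text the model cannot have seen.
print(min_k_prob("An evaluation sample suspected of appearing in the training data."))
```

A before/after comparison in the spirit of the smaug report would amount to running this scoring once with the base model and again with the fine-tuned checkpoint.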

nproc_per_node=4
CUDA_VISIBLE_DEVICES=0,1,2,3 \
NPROC_PER_NODE=$nproc_per_node \
swift sft \
  --model_id_or_path "AI-ModelScope/llava-v1.6-mistral-7b" \
  --template_type "llava-mistral-instruct" \
  --custom_train_dataset_path train_swift.json \
  --custom_val_dataset_path test_swift.json \
  --dataset_test_ratio "0.15" \
  --save_steps "20" \
  --lora_target_modules q_proj k_proj v_proj...

**Describe the feature**
Please describe the feature requested here.

**Paste any useful information**
Paste any useful information, including papers, github links, etc.

**Additional context**
Add any other context or information...

# PR type
- [ ] Bug Fix
- [x] New Feature
- [ ] Document Updates
- [ ] More Models or Datasets Support

# PR information ``` dataset_info =...

**Describe the bug**
Running the finetuned `internlm-xcomposer2-7b-chat` leads to an error.
``` token len:history:113, query:1706 Traceback (most recent call last): File...

**Describe the bug**
Finetuned the model `ModelType.qwen_vl_chat` with `max_length=4096`, but at inference with the checkpoint got `exceeds the model max_length: 2048`...