
ms-swift: Use PEFT or Full-parameter to finetune 250+ LLMs or 25+ MLLMs

206 issues, sorted by most recently updated

**Describe the feature** Support fine-tuning embedding models on domain-specific data to improve RAG retrieval quality. **Paste any useful information** List of models we would like supported (https://github.com/chatchat-space/Langchain-Chatchat/wiki/%E6%94%AF%E6%8C%81%E5%88%97%E8%A1%A8#embedding-%E6%A8%A1%E5%9E%8B%E6%94%AF%E6%8C%81%E5%88%97%E8%A1%A8): MokaAI embedding series: moka-ai/m3e-small moka-ai/m3e-base moka-ai/m3e-large; BAAI embedding series: BAAI/bge-small-zh BAAI/bge-base-zh BAAI/bge-large-zh BAAI/bge-small-zh-v1.5 BAAI/bge-base-zh-v1.5 BAAI/bge-large-zh-v1.5 BAAI/bge-large-zh-noinstruct BAAI/bge-reranker-large BAAI/bge-reranker-base; text2vec embedding series: shibing624/text2vec-base-chinese-sentence shibing624/text2vec-base-chinese-paraphrase shibing624/text2vec-base-multilingual shibing624/text2vec-base-chinese shibing624/text2vec-bge-large-chinese GanymedeNil/text2vec-large-chinese...

enhancement
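Until this lands in ms-swift, domain embedding fine-tuning can be done outside the framework. A minimal sketch using the sentence-transformers library with one of the requested BAAI models; the query/passage pairs are placeholders:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Load one of the embedding models requested above.
model = SentenceTransformer("BAAI/bge-base-zh-v1.5")

# Placeholder in-domain (query, relevant passage) pairs.
train_examples = [
    InputExample(texts=["example query", "relevant passage text"]),
    InputExample(texts=["another query", "another relevant passage"]),
]
loader = DataLoader(train_examples, shuffle=True, batch_size=16)

# In-batch negatives: the other passages in each batch act as negatives.
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
model.save("bge-base-zh-v1.5-domain")
```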

**Describe the feature** Hoping for support for PiSSA fine-tuning. **Paste any useful information** https://arxiv.org/abs/2404.02948 **Additional context** Add any other context...

enhancement
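For reference, recent releases of the peft library expose PiSSA as a LoRA initialization option; a minimal sketch (model choice and hyperparameters are illustrative):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-7B-Chat")

# PiSSA (https://arxiv.org/abs/2404.02948) initializes the LoRA adapters
# from the principal singular values/vectors of the base weights.
config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    init_lora_weights="pissa",  # requires a peft version that ships PiSSA
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```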

CogVLM inference raises an error when num_beams is greater than 1. With num_beams set to 2, the error is: hin a limit of 100 words. Answer:[OUTPUT]Traceback (most recent call last): File "/root/swift/swift/cli/infer.py", line 5, in infer_main() File "/root/swift/swift/utils/run_utils.py", line 31, in x_main result = llm_x(args, **kwargs) File "/root/swift/swift/llm/infer.py",...
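For comparison, this is what beam search looks like through plain transformers generate; a hedged sketch with a small stand-in text model, since the open question is whether CogVLM's custom generate path accepts num_beams > 1 at all:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in text model; plain causal LMs handle beam search as below.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B-Chat")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B-Chat")

inputs = tok("Describe the image within a limit of 100 words. Answer:",
             return_tensors="pt")
out = model.generate(**inputs, num_beams=2, do_sample=False,
                     max_new_tokens=100)
print(tok.decode(out[0], skip_special_tokens=True))
```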

```
CUDA_VISIBLE_DEVICES=0,1 \
swift sft \
    --model_type cogvlm-17b-instruct \
    --sft_type lora \
    --tuner_backend swift \
    --dtype bf16 \
    --output_dir output \
    --dataset coco-mini-en-2 \
    --train_dataset_sample -1 \
    --num_train_epochs 6 \
    --max_length...
```

These are the arguments from my fine-tuning run:

```
SftArguments(model_type='qwen1half-7b-chat', model_id_or_path='../models/models/qwen/Qwen1___5-7B-Chat', model_revision='master', sft_type='lora', freeze_parameters=0.0, additional_trainable_parameters=[], tuner_backend='swift', template_type='qwen', output_dir='/home/centos/xiaolv/太安模型微调/swift_qwen/output/qwen1half-7b-chat-swift/qwen1half-7b-chat/v1-20240327-083203', add_output_dir_suffix=True, ddp_backend='nccl', ddp_find_unused_parameters=None, ddp_broadcast_buffers=None, seed=42, resume_from_checkpoint=None, dtype='bf16', dataset=['_custom_dataset'], dataset_seed=42, dataset_test_ratio=0.01, train_dataset_sample=-1, train_dataset_mix_ratio=None, train_dataset_mix_ds=['ms-bench'], val_dataset_sample=None, use_loss_scale=False, system='You are a...
```

question
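The same run can also be launched from Python instead of the CLI; a minimal sketch following the swift.llm entry points shown in the repo docs (dataset and output_dir here are placeholders, other fields keep their defaults):

```python
from swift.llm import SftArguments, sft_main

# Mirrors the key arguments from the dump above.
args = SftArguments(
    model_type='qwen1half-7b-chat',
    sft_type='lora',
    dtype='bf16',
    dataset=['ms-bench'],
    train_dataset_sample=-1,
    output_dir='output',
)
result = sft_main(args)
```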

**Describe the feature** Hoping for an option to enable RoPE extrapolation in training and inference, e.g. extending yi-6b-chat from 4K to 16K/32K, with --max_length setting the ceiling and the corresponding scaling factor computed automatically. **Paste any useful information** https://github.com/01-ai/Yi/issues/453 https://github.com/01-ai/Yi/issues/282 https://kaiokendev.github.io/til#extending-context-to-8k https://arxiv.org/pdf/2306.15595.pdf https://github.com/01-ai/Yi/issues/37 https://arxiv.org/pdf/2311.04879.pdf **Additional context** Other models that use RoPE should presumably work roughly the same way? I haven't checked carefully; this is a guess.

enhancement
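Outside ms-swift, transformers already exposes a manual knob for this on llama-family configs (Yi included); a hedged sketch of linear RoPE scaling, where the factor would be max_length divided by the trained context length:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Linear ("position interpolation") RoPE scaling: factor 4.0 stretches a
# 4K-trained context toward 16K. Dynamic NTK scaling uses type "dynamic".
config = AutoConfig.from_pretrained("01-ai/Yi-6B-Chat")
config.rope_scaling = {"type": "linear", "factor": 4.0}  # 16384 / 4096

model = AutoModelForCausalLM.from_pretrained("01-ai/Yi-6B-Chat",
                                             config=config)
```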

RuntimeError: Event loop is closed Traceback (most recent call last): File "/home/zli/miniconda3/lib/python3.8/site-packages/gradio/queueing.py", line 501, in call_prediction output = await route_utils.call_process_api( File "/home/zli/miniconda3/lib/python3.8/site-packages/gradio/route_utils.py", line 253, in call_process_api output = await app.get_blocks().process_api(...

**Describe the feature** The original implementation does support this, and there is sometimes a need to fine-tune only the LLM for textual knowledge. Right now the repo says it only support...

Thanks for your work and the repo! As I understand it, inference for multimodal LLMs (e.g. llava, qwen-vl) can only be run in batch via the provided scripts here: https://github.com/modelscope/swift/blob/main/docs/source/Multi-Modal/llava最佳实践.md#微调后推理...

question
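Single-sample (non-batch) inference is also possible through the Python API; a hedged sketch following the swift.llm pattern from the repo docs, with the qwen-vl image-tag convention assumed:

```python
from swift.llm import (get_default_template_type, get_model_tokenizer,
                       get_template, inference)

model_type = 'qwen-vl-chat'
template_type = get_default_template_type(model_type)
model, tokenizer = get_model_tokenizer(model_type,
                                       model_kwargs={'device_map': 'auto'})
template = get_template(template_type, tokenizer)

# qwen-vl embeds the image path in the query; the path is a placeholder.
query = 'Picture 1:<img>/path/to/image.jpg</img>\nDescribe the picture.'
response, history = inference(model, template, query)
print(response)
```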

The command-line parameters doc says the default sampling ratio is 0.01. I have a dataset of 200k samples and train on 4 GPUs with NPROC_PER_NODE=2 \ CUDA_VISIBLE_DEVICES=0,1,2,3 \. During validation, is it normal that the validation count is 1000? Or does 1000 not actually mean 1000 samples? Thanks. ![image](https://github.com/modelscope/swift/assets/89635780/9d7c23f5-00da-4959-989c-9f0e79125400)

question
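A quick arithmetic check on the numbers in the question above (values copied from the question, not measured):

```python
# With the documented default test ratio, a 200k-row dataset would yield
# about 2,000 validation rows, so a reported 1,000 suggests either a cap
# elsewhere or a different effective ratio.
dataset_size = 200_000
dataset_test_ratio = 0.01  # documented default
print(int(dataset_size * dataset_test_ratio))  # -> 2000
```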