
Unify Efficient Fine-Tuning of 100+ LLMs

Results 548 LLaMA-Factory issues
Sort by recently updated

https://arxiv.org/abs/2405.12130 MoRA, a PEFT technique that uses a square matrix instead of low-rank matrices. The main idea behind MoRA is to use trainable parameters in a way that achieves the...

enhancement
pending
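The MoRA idea referenced above can be sketched roughly as follows. This is a hypothetical illustration only, not code from the paper or from LLaMA-Factory: instead of a low-rank update `W + B @ A`, a small square matrix `M` is trained, and the input is mapped into and out of its dimension with non-trainable compress/decompress operators. Truncation and zero-padding are used here purely for simplicity; the paper explores several such operators.

```python
import numpy as np

class MoRALayer:
    """Toy sketch of a MoRA-style update (arXiv:2405.12130).

    The only trainable parameters are in the square matrix M
    (r_hat x r_hat); compress/decompress are fixed operators.
    """

    def __init__(self, in_dim: int, out_dim: int, r_hat: int):
        self.in_dim, self.out_dim, self.r_hat = in_dim, out_dim, r_hat
        # Trainable square matrix, initialized to zero so the update
        # starts as a no-op (as with LoRA's zero-initialized B).
        self.M = np.zeros((r_hat, r_hat))

    def delta(self, x: np.ndarray) -> np.ndarray:
        # Compress: keep the first r_hat input features (truncation).
        compressed = x[..., : self.r_hat]
        h = compressed @ self.M
        # Decompress: zero-pad back up to the output dimension.
        out = np.zeros(x.shape[:-1] + (self.out_dim,))
        out[..., : self.r_hat] = h
        return out

layer = MoRALayer(in_dim=16, out_dim=16, r_hat=4)
x = np.ones((2, 16))
print(layer.delta(x).shape)  # (2, 16)
```

For the same parameter budget, a square matrix of side `r_hat` can have a much higher rank than a rank-`r` pair `B @ A`, which is the motivation the snippet above alludes to.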

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction python src/api.py --model_name_or_path /data/models/LLM_models/qwen/Qwen-72B-Chat-Int4 --template qwen --infer_backend vllm --vllm_gpu_util 0.9 --vllm_maxlen 8000 The configuration above sets the maximum token count to 8000. When the input exceeds 8000 tokens, the streaming API still returns two JSON payloads with empty content, and vLLM logs a warning underneath that the maximum token count was exceeded. Could the code raise an exception instead, so the returned content is easier to interpret? ###...

pending
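The behavior requested in that issue can be sketched with a hypothetical pre-flight check: fail fast with a clear exception when the prompt exceeds the configured context length, rather than letting the backend stream back empty chunks. All names here (`PromptTooLongError`, `check_prompt_length`) are illustrative and not part of LLaMA-Factory's actual API.

```python
class PromptTooLongError(ValueError):
    """Raised when a prompt exceeds the configured context length."""

    def __init__(self, n_tokens: int, max_len: int):
        super().__init__(
            f"prompt has {n_tokens} tokens, exceeding the configured "
            f"maximum of {max_len} (see --vllm_maxlen)"
        )
        self.n_tokens = n_tokens
        self.max_len = max_len

def check_prompt_length(token_ids: list, max_len: int) -> None:
    # Run this before dispatching to the inference backend, so the
    # client gets an explicit error instead of empty streamed content.
    if len(token_ids) > max_len:
        raise PromptTooLongError(len(token_ids), max_len)

check_prompt_length(list(range(100)), max_len=8000)  # passes silently
try:
    check_prompt_length(list(range(8001)), max_len=8000)
except PromptTooLongError as e:
    print(e)
```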

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction - ### Expected behavior After pulling the latest code, I get this message: pytorch allocator cache flushes since last step. this...

pending

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction Following the instructions, I installed llama-factory in a conda virtual environment per the required steps and the Windows user guide, then ran the multi-GPU LoRA fine-tuning command CUDA_VISIBLE_DEVICES=0,1 llamafactory-cli train examples/lora_multi_gpu/llama3_lora_sft.yaml and got the error llamafactory.cli - Initializing distributed tasks...

pending

You are welcome to fill out the Ascend x LLaMA-Factory user survey; we look forward to your feedback to help improve the LLaMA-Factory experience for Ascend users 🤗 ![image](https://github.com/hiyouga/LLaMA-Factory/assets/167732245/8860eb42-86d2-4a0d-b7f8-40c583bb9990)

pending

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction In the official code, something like this works: generation_output = model.generate( input_ids=model_input['input_ids'], return_dict_in_generate=True, output_scores=True, max_new_tokens=100, ) logits = generation_output.scores How can this be done in LLaMA-Factory? result...

enhancement
pending
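The `scores` attribute that issue refers to is, in the standard transformers API, a tuple with one logits entry per generated step. A minimal sketch of how such scores can be post-processed into per-token log-probabilities is below; plain Python lists stand in for real tensors, and the toy 4-token vocabulary is invented for illustration. This does not answer how LLaMA-Factory exposes scores (that is what the issue asks), only what the raw structure looks like.

```python
import math

def log_softmax(row):
    # Numerically stable log-softmax over one logits row.
    m = max(row)
    z = math.log(sum(math.exp(v - m) for v in row))
    return [v - m - z for v in row]

def greedy_logprobs(scores):
    """Return (token_id, log_prob) for the argmax token of each step.

    `scores` mirrors generate(..., return_dict_in_generate=True,
    output_scores=True): one [batch][vocab] entry per generated token.
    Batch size 1 is assumed here.
    """
    out = []
    for step in scores:
        logits = step[0]
        logprobs = log_softmax(logits)
        tok = max(range(len(logits)), key=logits.__getitem__)
        out.append((tok, logprobs[tok]))
    return out

toy_scores = (
    [[0.1, 2.0, -1.0, 0.0]],  # step 1: argmax is token 1
    [[3.0, 0.0, 0.0, 0.0]],   # step 2: argmax is token 0
)
for tok, lp in greedy_logprobs(toy_scores):
    print(tok, round(lp, 3))
```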

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction Hello, I have two more questions. First, on the WebUI page, why do the loss plot and the results area show an error, with the loss plot failing to load? I remember an earlier error, roughly saying the web plot was not a supported mode, but it no longer appears and the plot has never loaded. Second: while training on the dataset, every few rounds this message appears: Could not find a config file in /mnt/workspace/.cache/modelscope/LLM-Research/Llama3-8B-Chinese-Chat - will assume that the...

pending

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction The training framework is **LLaMA-Factory-0.7.0** ```shell export NCCL_DEBUG=INFO export NCCL_IB_DISABLE=0 export NCCL_SOCKET_IFNAME=eth10 model_path=codeqwen1.5-7B dataset=codeqwen_0305 outputdir=codeqwen-pt-0527-new0305dataset gradient_accumulation_steps=2 per_device_batchsize=2 epoch_num=2...

pending