LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Hello, if I want to train Vicuna on multi-turn dialogue, how should I format the dataset? Can you give me an example? Thanks
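For reference, here is a hedged sketch of what one multi-turn training entry could look like, assuming LMFlow's `conversation` dataset type; the field names and output path are illustrative and should be checked against the official data format documentation.

```python
import json

# Hypothetical multi-turn dialogue dataset, assuming LMFlow's "conversation" type.
dataset = {
    "type": "conversation",
    "instances": [
        {
            "system": "You are a helpful assistant.",
            "messages": [
                {"role": "user", "content": "What is LoRA?"},
                {"role": "assistant", "content": "LoRA is a parameter-efficient fine-tuning method."},
                {"role": "user", "content": "Can I use it with Vicuna?"},
                {"role": "assistant", "content": "Yes, Vicuna can be fine-tuned with LoRA adapters."},
            ],
        }
    ],
}

# Placeholder path; point the finetuning script's --dataset_path at this directory.
with open("data/train/multi_turn_example.json", "w") as f:
    json.dump(dataset, f, ensure_ascii=False, indent=2)
```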
I was wondering why the gpt2-large model I downloaded from Hugging Face was 3.1 GB, but after running run_finetune_with_lora_save_aggregated_weights.sh it was only 1.5 GB. This may just be a gap in my knowledge. Sorry...
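A hedged way to check one likely cause: a roughly 2x size difference is consistent with the aggregated model being saved in fp16 rather than fp32. The checkpoint path and filename below are placeholders.

```python
import torch

# Inspect the on-disk parameter dtypes of the merged checkpoint (hypothetical path).
state_dict = torch.load("output_models/merged_gpt2_large/pytorch_model.bin", map_location="cpu")

# {torch.float16} would explain ~1.5 GB; {torch.float32} would be ~3.1 GB.
print({p.dtype for p in state_dict.values()})
```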
Running /scripts/run_raft_align.sh in Docker gives an error: deepspeed.runtime.zero.utils.ZeRORuntimeException: You are using ZeRO-Offload with a client provided optimizer () which in most cases will yield poor performance. Please either use...
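For context, this exception is raised when optimizer CPU offloading is combined with a client-provided optimizer. Below is a hedged sketch of a ZeRO-Offload config with the commonly suggested workaround of disabling `zero_force_ds_cpu_optimizer`; the values are illustrative, not LMFlow's shipped config.

```python
import json

# Illustrative DeepSpeed config: ZeRO stage 2 with optimizer offloaded to CPU.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    # Allow a client-provided optimizer together with offloading (one common workaround).
    "zero_force_ds_cpu_optimizer": False,
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
    "bf16": {"enabled": True},
}

# Hypothetical path for the config file passed to deepspeed.
with open("configs/ds_config_zero2_offload.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```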
After I finished full-parameter fine-tuning, I called the model through a pipeline instead of using your chatbot. Why does the answer repeat my input every...
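One hedged explanation: the Transformers text-generation pipeline returns the prompt plus the completion by default, which looks like the input being echoed. A minimal sketch with a placeholder model path:

```python
from transformers import pipeline

# Hypothetical path to the fully fine-tuned checkpoint.
pipe = pipeline("text-generation", model="output_models/finetuned_model")

# return_full_text=False drops the prompt and returns only the generated continuation.
out = pipe("Hello, how are you?", max_new_tokens=64, return_full_text=False)
print(out[0]["generated_text"])
```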
"We tested the speculative inference using the first 100 inputs from alpaca test dataset as prompts. When model=gpt2-xl, draft_model=gpt2". I want to test speedup for my own model and draft_model....
Hi, is there any example of Python code that can run inference instead of using the command line? If so, could you kindly share it?
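As a starting point, here is a minimal, generic inference sketch using plain Transformers rather than LMFlow's own pipeline classes; the checkpoint path is a placeholder for a model fine-tuned with LMFlow.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical path to a fine-tuned checkpoint saved in Hugging Face format.
model_path = "output_models/finetuned_model"

tok = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, device_map="auto")

inputs = tok("What is fine-tuning?", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(output[0], skip_special_tokens=True))
```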
Is there a way to use a model trained with LMFlow in Ollama to perform inference?
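One hedged path: merge any LoRA adapters into the base weights, save a standard Hugging Face directory, then convert that directory to GGUF (e.g. with llama.cpp's conversion script) and load it in Ollama via a Modelfile. The sketch below covers only the merge step; base model and adapter paths are placeholders.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder base model and LoRA adapter produced by LMFlow fine-tuning.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
merged = PeftModel.from_pretrained(base, "output_models/lora_adapter").merge_and_unload()

# Save merged weights and tokenizer as a plain Hugging Face checkpoint,
# ready for GGUF conversion and an Ollama Modelfile pointing at the result.
merged.save_pretrained("output_models/merged_model")
AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf").save_pretrained("output_models/merged_model")
```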
I tried to fine-tune a model with the following script: ``` bash scripts/run_finetune.sh ``` As far as I can see, the commands in this script were like: ``` #!/bin/bash # Please run this...
It looks like the Baichuan2 tokenizer was updated and may need to be adapted. I got the following error: Tokenizer class BaichuanTokenizer does not exist or is not currently imported. I am on the latest code, v0.05. I think adjusting the tokenizer handling should be enough to fix it.
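For what it's worth, this error usually appears when the checkpoint's custom tokenizer code is not imported; a common hedged fix is loading with `trust_remote_code=True`. The model id below is just an example.

```python
from transformers import AutoTokenizer

# trust_remote_code=True lets Transformers import the BaichuanTokenizer class
# shipped with the checkpoint instead of looking for it in the library itself.
tok = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan2-7B-Base", trust_remote_code=True)
```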
When fine-tuning Mistral with LoRA, do you think FlashAttention2 helps in speeding up the process? If yes, how significant is the acceleration? Where is the primary acceleration achieved?
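For completeness, a minimal sketch of enabling FlashAttention-2 for Mistral in Transformers; it assumes the flash-attn package is installed and a half-precision dtype on GPU, and any speedup mainly comes from the attention computation, so longer sequences benefit most.

```python
import torch
from transformers import AutoModelForCausalLM

# Load Mistral with the FlashAttention-2 kernel (requires flash-attn and fp16/bf16).
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)
```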