Yu-won Lee comments

Results: 230 comments of Yu-won Lee

💡 [REQUEST] - V 2.6 finetune multiple images example

It's a bit weird that the scripts in the repo are set to v2.6.

Much higher GPU memory usage compared to LLaMA-Factory

Thanks for letting me know. First, please make sure that `use_liger` is set to True when you train with this repository. Second, the current version of my repo doesn’t yet...

Much higher GPU memory usage compared to LLaMA-Factory

Also, `"overlap_comm": true` could use a bit more memory. It's in `zero3_offload.json`.
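
For anyone who wants to test this, here is a minimal sketch of switching the flag off. It assumes the file follows the usual DeepSpeed layout, with `overlap_comm` nested under `zero_optimization`. The helper name `disable_overlap_comm` is made up for illustration and is not part of the repo.

```python
import json
from pathlib import Path


def disable_overlap_comm(config_path):
    """Set zero_optimization.overlap_comm to False in a DeepSpeed JSON config.

    Hypothetical helper; assumes the standard DeepSpeed config layout.
    Returns the updated config as a dict.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    # setdefault keeps the sketch from failing if the section is missing
    zero = config.setdefault("zero_optimization", {})
    zero["overlap_comm"] = False
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `disable_overlap_comm("zero3_offload.json")` rewrites the file in place, so keep a copy if you want to compare runs. How much memory this saves, if any, will depend on the setup.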

Much higher GPU memory usage compared to LLaMA-Factory

@baicenxiao Maybe the use of reentrant gradient checkpointing is the difference between my code and the Hugging Face code you've used. As for llama-factory, I haven't really looked into it, so I'm...
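
To compare the two modes side by side, the reentrant setting can be pinned explicitly. This is a minimal sketch, assuming the training script forwards its arguments to Hugging Face `TrainingArguments`, which accepts `gradient_checkpointing_kwargs`. The helper name `checkpointing_args` is hypothetical, and which value each codebase actually uses is not confirmed here.

```python
def checkpointing_args(use_reentrant):
    """Build keyword arguments that pin the gradient checkpointing mode.

    Hypothetical helper; the result is meant to be unpacked into
    transformers.TrainingArguments(..., **kwargs).
    """
    return {
        "gradient_checkpointing": True,
        "gradient_checkpointing_kwargs": {"use_reentrant": use_reentrant},
    }


# One set of arguments per mode, so two runs differ only in this flag
reentrant_run = checkpointing_args(True)
non_reentrant_run = checkpointing_args(False)
```

Running the same job once with each dict and recording peak GPU memory should show whether this setting accounts for the gap.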

Much higher GPU memory usage compared to LLaMA-Factory

I'm not really sure what the difference is. I think llama-factory is based on the code from the official qwen-vl repo, but that is not very different from mine except for...

Much higher GPU memory usage compared to LLaMA-Factory

@baicenxiao I'm not sure why llama-factory uses less vram. I've looked into the code but it's not so different from the code I've made. I've checked the default optimizer and...

Much higher GPU memory usage compared to LLaMA-Factory

I've added an additional monkey patch to the forward function, and it will run with much less memory and much faster. The code will be updated when the test...

about truncate_sequence function

I plan to add dynamic truncation, but I’m not sure of the best way to implement it—if the limit is too short, it might cut off part of the user’s...

Fail to merge lora of Qwen2.5-VL

That's a bit odd. It should have `config.json` (the full model config) in the directory. Are you using a checkpoint for it?

Fail to merge lora of Qwen2.5-VL

Originally, it was to load the model with the same config you trained with (because you need to merge the weights). Another thing is to delete the quantization config...
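
The comment above is cut off, so the exact step used in the repo is not shown. As a rough sketch of what deleting the quantization config from a saved model could look like, assuming it is stored under the `quantization_config` key that Hugging Face writes to `config.json` for quantized models (the helper name `drop_quantization_config` is hypothetical):

```python
import json
from pathlib import Path


def drop_quantization_config(model_dir):
    """Remove the quantization_config entry from <model_dir>/config.json.

    Hypothetical helper; returns True if an entry was removed,
    False if the key was not present (the file is left untouched).
    """
    path = Path(model_dir) / "config.json"
    config = json.loads(path.read_text())
    removed = "quantization_config" in config
    if removed:
        del config["quantization_config"]
        path.write_text(json.dumps(config, indent=2) + "\n")
    return removed
```

The function only edits the JSON file. It does not touch the weights, so it is only a sketch of the config cleanup and not of the merge itself.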