Yu-won Lee
```
res = model.load_state_dict(non_lora_trainables, strict=False)
print(f"[non-lora] missing: {len(res.missing_keys)}, unexpected: {len(res.unexpected_keys)}")
print("example missing (merger):", [k for k in res.missing_keys if "visual.merger" in k][:10])
```
Check if the prefixes are correct.
```
...
```
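If the missing keys turn out to be a prefix problem, a common fix is to remap the saved keys before loading. This is only a sketch: the wrapper prefix shown (`base_model.model.`) is an assumption, so check the actual keys in your checkpoint first.

```python
# Hypothetical prefix fix: if the saved keys carry an extra wrapper prefix
# (e.g. "base_model.model."), strip it so they line up with the model's
# own parameter names before calling load_state_dict.
def strip_prefix(state_dict, prefix="base_model.model."):
    """Return a copy of state_dict with `prefix` removed from matching keys."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

# Usage: model.load_state_dict(strip_prefix(non_lora_trainables), strict=False)
```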
`model.generate` will also use other generation settings such as `top_k` or `top_p` from the model's default generation config. You should check those.
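One way to avoid surprises from the default generation config is to pass the sampling settings explicitly. A minimal sketch, assuming the standard Hugging Face `generate` keyword arguments; the values here are just examples, not the model's defaults:

```python
# Pin the sampling settings you actually want, instead of inheriting
# whatever the checkpoint's generation_config specifies.
gen_kwargs = dict(
    max_new_tokens=128,
    do_sample=True,   # set False for greedy decoding
    top_p=0.9,
    top_k=50,
    temperature=0.7,
)

# outputs = model.generate(**inputs, **gen_kwargs)  # model/inputs from your setup
```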
I haven't tried DPO with Qwen3 yet; I'll check this one.
Yes, it still supports liger_kernel. I've just left it to the Hugging Face Trainer to apply it.
```
#!/bin/bash

# MODEL_NAME="Qwen/Qwen2-VL-7B-Instruct"
# MODEL_NAME="Qwen/Qwen2-VL-2B-Instruct"
# MODEL_NAME="Qwen/Qwen2.5-VL-3B-Instruct"
# MODEL_NAME="Qwen/Qwen2.5-VL-7B-Instruct"
MODEL_NAME="Qwen/Qwen3-VL-8B-Instruct"

GLOBAL_BATCH_SIZE=8
BATCH_PER_DEVICE=2
NUM_DEVICES=4
GRAD_ACCUM_STEPS=$((GLOBAL_BATCH_SIZE / (BATCH_PER_DEVICE * NUM_DEVICES)))

export PYTHONPATH=src:$PYTHONPATH

# If you want to set the min pixels...
```
@pulkitkumar95 I've made an issue on the liger-kernel GitHub. They responded that they couldn't reproduce this issue. I'm still not sure why it happens. I'll run some experiments when I have some...
@rayyychen For now, could you set the liger-kernel option to false?
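As a workaround sketch: disabling the kernel usually means flipping a flag in the launch script. The flag name below is an assumption for illustration, so check your own finetune script for the exact option it exposes:

```shell
# Hypothetical: in your training launch script, turn the liger toggle off.
# The real flag name may differ -- grep your script's argument list.
# --use_liger False
```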
Sorry, I haven't tested GRPO with the updated library version. I'll make a quick update for it. Sorry again for the inconvenience.
The fix I can think of is to upgrade trl to `0.23.1` and fix my code a bit. The library versions should be a bit newer ones because I'm going...
I've fixed the code to work with GRPO. You need to upgrade the trl version to `0.23.1`. I'll update the Docker image as soon as possible.
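For reference, pinning the version mentioned above looks like this (the exact pin `0.23.1` comes from the thread; pinning avoids pulling a newer, untested release):

```shell
# Pin trl to the version the updated code expects, then confirm it.
pip install "trl==0.23.1"
python -c "import trl; print(trl.__version__)"
```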