Yu-won Lee

Results: 230 comments of Yu-won Lee

It seems the code is a bit old. Could you update it to the latest version and use the environment I wrote? Or you could use the docker image I've made.

Sorry for the late response. When using LoRA with the script you are using, you should first merge the adapter weights into the base model.

@jjbuck @vjagannath786 @leestott You could check the code I made. The model can also be saved by removing the `wte` weight in the Trainer class. You could inherit...
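A rough sketch of that Trainer subclass (the class name is hypothetical, and I'm assuming the tied `wte` embedding is the weight you want to keep out of the checkpoint):

```python
from transformers import Trainer

class SkipWteTrainer(Trainer):
    # Override the internal save hook so the tied `wte` embedding is filtered
    # out of the state dict before the checkpoint is written to disk
    def _save(self, output_dir=None, state_dict=None):
        if state_dict is None:
            state_dict = self.model.state_dict()
        state_dict = {k: v for k, v in state_dict.items() if "wte" not in k}
        super()._save(output_dir, state_dict=state_dict)
```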

@vjagannath786 You need to copy the `chat_template` from the tokenizer to the processor itself. The latest transformers release has changed the classes, so it raises an error when used as-is. You...
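The copy itself is a one-line attribute assignment; a minimal sketch, assuming a Qwen2-VL checkpoint (the model id here is just an example — use whatever you are fine-tuning):

```python
from transformers import AutoProcessor, AutoTokenizer

# Example checkpoint id (assumption) -- only the tokenizer/processor files
# are downloaded, not the model weights
model_id = "Qwen/Qwen2-VL-2B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

# Newer transformers versions resolve the template on the processor, so
# mirror it over from the tokenizer before calling apply_chat_template
processor.chat_template = tokenizer.chat_template
```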

Did you try with the original model? I'm a bit curious whether the original model also struggles with the task. 1. I think LoRA is not the problem. 2. CNN models or...

Qwen released the official code for fine-tuning. It includes code showing how to format grounding data into the Qwen format. https://github.com/QwenLM/Qwen2.5-VL/blob/main/qwen-vl-finetune/tools/process_bbox.ipynb

It might be a conflict between the torch version and the flash-attn version. The setup was using CUDA 12.4, so I think you need to install the torch version that supports your CUDA...

@diptanu I've monkey patched it because the model doesn't use `image_sizes`:

```
from outlines.models import TransformersVision

original_generate = TransformersVision.generate

def patched_generate(self, prompts, media, generation_parameters, logits_processor, sampling_parameters):
    inputs = self.processor(
        text=prompts, ...
```

I've solved the problem by using a logits processor:

```
model_id = "openbmb/MiniCPM-V-2_6"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True, torch_dtype=torch.bfloat16, attn_implementation='flash_attention_2')
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

class Event(BaseModel):
    event: TrafficEvent ...
```

@elloza I haven't looked at how `model.chat` works in the new model. If it works the same as the older one, then it should work.