Yu-won Lee comments

230 comments



Fine-tuning for OCR tasks

You can prepare an OCR dataset and start training; just follow the data format described in the README. This dataset could be helpful for preparing OCR data: https://huggingface.co/datasets/linxy/LaTeX_OCR
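
As a rough sketch of what one OCR training entry might look like, the snippet below builds a record with the `id` / `conversations` / `from` / `value` fields plus an `image` key. The helper name `make_ocr_entry`, the prompt wording, and the file paths are hypothetical, and any image placeholder token the README requires in the prompt is not shown here, so check the README before using it.

```python
import json


def make_ocr_entry(sample_id, image_path, transcription,
                   prompt="Transcribe the text in this image."):
    """Build one conversation-style record for an OCR sample.

    Hypothetical helper: the prompt text is an assumption, and the README's
    exact requirements (e.g. an image placeholder in the prompt) take priority.
    """
    return {
        "id": sample_id,
        "image": image_path,
        "conversations": [
            {"from": "human", "value": prompt},
            {"from": "gpt", "value": transcription},
        ],
    }


# Example: one image paired with its LaTeX transcription (made-up path).
entries = [make_ocr_entry("0001", "images/0001.png", r"\frac{a}{b}")]
dataset_json = json.dumps(entries, indent=2)
```

The resulting `dataset_json` string can be written to a file and passed to training as the data file.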

gpu OOM occurs with a specific image

I've trained the vision encoder alone and it does take a fair amount of memory. But it's strange that only the 500x800 size takes too much. Could you check ```...

gpu OOM occurs with a specific image

Well, that could take a fair amount of memory, but it should be okay. Limiting the image size might help. InternVL2.5 uses dynamic resolution but...

gpu OOM occurs with a specific image

Thanks for letting me know. I'll take another look at adjusting the number of tokens with `max_pixels`. I'll also add some args for width and height.

gpu OOM occurs with a specific image

I've updated the code to explicitly set resized_height and resized_width for both images and videos.
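
To illustrate why capping the image size helps with memory, here is a minimal sketch of fitting an image to a pixel budget and estimating the resulting number of visual tokens. It is not the repository's code: the function names are hypothetical, and the assumption that one visual token covers a 28x28 pixel block (the `factor` argument) should be checked against the processor you actually use.

```python
import math


def fit_to_pixel_budget(height, width, max_pixels, factor=28):
    """Shrink (height, width) so that height * width <= max_pixels.

    Keeps the aspect ratio approximately, then rounds each side down to a
    multiple of `factor` (assumed pixels per token side).
    """
    if height * width > max_pixels:
        scale = math.sqrt(max_pixels / (height * width))
        height = height * scale
        width = width * scale
    new_h = max(factor, math.floor(height / factor) * factor)
    new_w = max(factor, math.floor(width / factor) * factor)
    return new_h, new_w


def estimated_visual_tokens(height, width, factor=28):
    """Estimate the token count, assuming one token per factor x factor block."""
    return (height // factor) * (width // factor)


# An 800x500 (height x width) image under a budget of 256 blocks of 28x28 pixels.
resized_height, resized_width = fit_to_pixel_budget(800, 500, 256 * 28 * 28)
tokens = estimated_visual_tokens(resized_height, resized_width)
```

Setting the height and width explicitly fixes the token count per image, while a pixel budget only bounds it from above.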

DPO support

Yes, but I don't have the resources to test the DPO method yet, so it will take some time.

DPO support

I now have some time to work on it, so I'll start trying.

DPO support

@50Bytes-dev I've updated the DPO code. Thanks for waiting. Please let me know if you run into any problems. Before you use it, you should update `trl` to `trl==0.16.1`.
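
A quick way to confirm the pinned version before running DPO training is to compare the installed `trl` version against `0.16.1`. This is only a convenience sketch; the helper names `installed_version` and `check_trl` are hypothetical and not part of the repository.

```python
from importlib import metadata

REQUIRED_TRL = "0.16.1"  # version pinned in the comment above


def installed_version(package):
    """Return the installed version string of `package`, or None if missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_trl(required=REQUIRED_TRL):
    """Return "ok" if trl matches the pin, otherwise a hint on how to fix it."""
    found = installed_version("trl")
    if found != required:
        return f"expected trl=={required}, found {found}; run: pip install trl=={required}"
    return "ok"


status = check_trl()
```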

Questions about fine tune only text

It's not much different. ``` [ { "id": "000000033471", "conversations": [ { "from": "human", "value": "Identify the odd one out: Twitter, Instagram, Telegram" }, { "from": "gpt", "value": "Telegram" },...

Questions about fine tune only text

@GaoMengGladys If there is no "image" key in the file, the image reading step is skipped for it. Also, "\n" should be removed from the text.
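
The idea can be sketched like this: an entry without an "image" key is simply treated as text-only. This is an illustration under assumptions, not the repository's actual loader; the function name `split_entry` and the second sample entry are made up.

```python
def split_entry(entry):
    """Return (image_path_or_None, [(speaker, text), ...]) for one entry.

    Entries with no "image" key yield None, so the caller can skip
    image loading for them.
    """
    image = entry.get("image")  # None when the key is absent
    turns = [(turn["from"], turn["value"]) for turn in entry["conversations"]]
    return image, turns


text_only = {
    "id": "000000033471",
    "conversations": [
        {"from": "human", "value": "Identify the odd one out: Twitter, Instagram, Telegram"},
        {"from": "gpt", "value": "Telegram"},
    ],
}
# Hypothetical entry that does carry an image.
with_image = dict(text_only, image="images/example.jpg")

image_a, turns_a = split_entry(text_only)
image_b, turns_b = split_entry(with_image)
```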