14 comments by whyiug

Are you still working on this? I'd love to see any progress. @danielhanchen

@younesbelkada Thanks for your reply. My training method is LoRA: all linear layers in the base model are frozen, so for my input training set they are not trainable,...

Is this a misunderstanding of LoRA and backpropagation on my part? Or maybe people simply don't need it. @younesbelkada thanks for your advice.
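For context, the LoRA setup described above can be sketched in a few lines. This is a toy NumPy illustration (not the actual PEFT/transformers implementation; the dimensions, rank, and scale here are arbitrary): the base weight `W` stays frozen, and backpropagation only updates the low-rank adapters `A` and `B`.

```python
import numpy as np

# Toy sketch of LoRA: the base weight W is frozen; only the low-rank
# adapters A and B receive gradient updates.

rng = np.random.default_rng(0)
d_in, d_out, rank, scale = 16, 8, 4, 0.5

W = rng.normal(size=(d_out, d_in))        # frozen base linear layer
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))               # trainable up-projection (zero init)

def forward(x):
    # y = W x + scale * B (A x); with B = 0 this equals the base model's output
    return W @ x + scale * (B @ (A @ x))

x = rng.normal(size=d_in)
y = forward(x)

# One toy gradient step: W receives no update, only the adapters do.
W_before = W.copy()
grad_y = y - np.ones(d_out)                  # pretend upstream loss gradient
grad_B = scale * np.outer(grad_y, A @ x)     # dL/dB
grad_A = scale * np.outer(B.T @ grad_y, x)   # dL/dA
B -= 0.1 * grad_B
A -= 0.1 * grad_A

assert np.array_equal(W, W_before)  # base weights stay frozen
```

Nothing in the base model changes during training, which is why the frozen layers are reported as non-trainable.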

Another question: could you (the authors) share the quantization scripts? We need them after SFT-ing this model.

Hi @YuzaChongyi, can we fine-tune this model with a single A100 (40 GB)?

> > 31.2 GB per GPU was tested with two A100 GPUs; you can use ZeRO-3 + offload to minimize memory usage. And according to the DeepSpeed ZeRO strategy, the...

> If you only have one A100, change ds_config_zero3.json as follows to offload params and the optimizer to CPU and save memory: "zero_optimization": { "stage": 3, "offload_optimizer": { "device":...
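Filled out, the offload section that comment is quoting would look roughly like this (a sketch based on the standard DeepSpeed ZeRO-3 offload options; exact values depend on your setup):

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    },
    "offload_param": {
      "device": "cpu",
      "pin_memory": true
    },
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```

Offloading both the parameters and the optimizer states to CPU trades training speed for a much smaller GPU memory footprint.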

@ChunyuanLI Once again, attracting attention while delaying the release of the code and weights. Disappointing.

> The latest minicpmv-llama3 model already supports batch inference

![image: screenshot of the failing batch-inference code]

That code raises an error for me. I am using the latest model commit 3b6aeff3850ce9d5087751911e4771c78004b2b3.