14 comments by whyiug

Are you still working on this? I'd love to see any progress. @danielhanchen

@younesbelkada Thanks for your reply. My training method is LoRA: all linear layers in the base model are frozen, so for my input training set they are not trainable,...

Is this a misunderstanding of LoRA and backpropagation on my part? Or maybe people simply don't need it. @younesbelkada thanks for your advice.
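For context, the LoRA setup described above can be sketched in a few lines. This is a toy NumPy illustration (not the actual PEFT/transformers implementation; the dimensions, rank, and scale here are arbitrary): the base weight `W` stays frozen, and backpropagation only updates the low-rank adapters `A` and `B`.

```python
import numpy as np

# Toy sketch of LoRA: the base weight W is frozen; only the low-rank
# adapters A and B receive gradient updates.

rng = np.random.default_rng(0)
d_in, d_out, rank, scale = 16, 8, 4, 0.5

W = rng.normal(size=(d_out, d_in))        # frozen base linear layer
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))               # trainable up-projection (zero init)

def forward(x):
    # y = W x + scale * B (A x); with B = 0 this equals the base model's output
    return W @ x + scale * (B @ (A @ x))

x = rng.normal(size=d_in)
y = forward(x)

# One toy gradient step: W receives no update, only the adapters do.
W_before = W.copy()
grad_y = y - np.ones(d_out)                  # pretend upstream loss gradient
grad_B = scale * np.outer(grad_y, A @ x)     # dL/dB
grad_A = scale * np.outer(B.T @ grad_y, x)   # dL/dA
B -= 0.1 * grad_B
A -= 0.1 * grad_A

assert np.array_equal(W, W_before)  # base weights stay frozen
```

Nothing in the base model changes during training, which is why the frozen layers are reported as non-trainable.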

Another question: could you (the authors) share the quantization scripts? We need them after SFT-ing this model.

Hi @YuzaChongyi, can we fine-tune this model with a single A100 (40 GB)?

> > 31.2 GB per GPU was tested with two A100 GPUs; you can use ZeRO-3 + offload to minimize memory usage. And according to the DeepSpeed ZeRO strategy, the...

> If you only have one A100, change ds_config_zero3.json as follows to offload params and the optimizer to CPU and save memory: "zero_optimization": { "stage": 3, "offload_optimizer": { "device":...
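Filled out, the offload section that comment is quoting would look roughly like this (a sketch based on the standard DeepSpeed ZeRO-3 offload options; exact values depend on your setup):

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    },
    "offload_param": {
      "device": "cpu",
      "pin_memory": true
    },
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```

Offloading both the parameters and the optimizer states to CPU trades training speed for a much smaller GPU memory footprint.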

@ChunyuanLI Once again, attracting attention while delaying the release of the code and weights. Disappointing.

> The latest minicpmv-llama3 model already supports batch inference

![image: screenshot of the failing batch-inference code]

That code raises an error for me. I am using the latest model commit 3b6aeff3850ce9d5087751911e4771c78004b2b3.