CrazyBrick

Results: 13 comments by CrazyBrick

> Thanks @aneet-javis. This is the published dataset for finetuning: https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K/blob/main/llava_v1_5_mix665k.json It seems I have seen this discussed under other issues before: adding some plain-text Q&A from llava_v1_5_mix665k to...
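
The comment above is about pulling plain-text Q&A out of the mix665k file. A minimal sketch of how that separation could look, assuming the file is a JSON list of records in which text-only entries simply lack an `"image"` key (the records below are made-up stand-ins, not real dataset rows):

```python
import json

# Made-up miniature of the llava_v1_5_mix665k.json structure:
# a JSON list of records; text-only Q&A records have no "image" key.
records = [
    {"id": "vqa-0", "image": "coco/train2017/0001.jpg",
     "conversations": [{"from": "human", "value": "What is shown?"}]},
    {"id": "text-0",
     "conversations": [{"from": "human", "value": "What is 2 + 2?"}]},
]

# A dumps/loads round-trip stands in for reading the actual JSON file.
records = json.loads(json.dumps(records))

# Keep only the plain-text Q&A entries.
text_only = [r for r in records if "image" not in r]
print([r["id"] for r in text_only])
```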

> ## What does this PR do > > This PR adds support for saving and loading sharded GPTQ checkpoints. > > Currently implemented: > > * save sharded gptq...
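
A hedged sketch of the sharding scheme such a PR typically implements: split a state dict into shards plus an index file mapping each weight name to its shard, in the spirit of Hugging Face's `*.index.json` convention. File names and JSON payloads here are illustrative; real code would store tensors with safetensors or `torch.save`:

```python
import json
import os
import tempfile

def save_sharded(state_dict, save_dir, keys_per_shard=2):
    # Group weight names into fixed-size shards (illustrative policy;
    # real implementations usually shard by byte size, not key count).
    names = list(state_dict)
    groups = [names[i:i + keys_per_shard]
              for i in range(0, len(names), keys_per_shard)]
    weight_map = {}
    for i, keys in enumerate(groups, 1):
        fname = f"gptq_model-{i:05d}-of-{len(groups):05d}.json"
        with open(os.path.join(save_dir, fname), "w") as f:
            json.dump({k: state_dict[k] for k in keys}, f)
        weight_map.update({k: fname for k in keys})
    # The index file records which shard holds each weight.
    with open(os.path.join(save_dir, "gptq_model.index.json"), "w") as f:
        json.dump({"weight_map": weight_map}, f)

def load_sharded(save_dir):
    # Read the index, then merge every referenced shard back together.
    with open(os.path.join(save_dir, "gptq_model.index.json")) as f:
        weight_map = json.load(f)["weight_map"]
    merged = {}
    for fname in sorted(set(weight_map.values())):
        with open(os.path.join(save_dir, fname)) as f:
            merged.update(json.load(f))
    return merged

with tempfile.TemporaryDirectory() as d:
    save_sharded({"w1": [1.0], "w2": [2.0], "w3": [3.0]}, d)
    restored = load_sharded(d)
```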

> Don't use his installation method; uninstall first > > then `pip install auto-gptq optimum` I uninstalled and reinstalled, but ran into another problem: ```File "/home/wz/.cache/huggingface/modules/transformers_modules/Qwen-VL-Chat-Int4/modeling_qwen.py", line 657, in forward hidden_states[i][a + 1 : b] = images[idx] RuntimeError: a view of a leaf...
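
The truncated error is PyTorch's "a view of a leaf Variable that requires grad is being used in an in-place operation". A minimal reproduction with toy tensors (not the Qwen-VL code), plus one common workaround of doing the copy under `torch.no_grad()`:

```python
import torch

hs = torch.zeros(2, 4, requires_grad=True)   # a leaf tensor
img = torch.ones(4)

# hs[0] is a *view* of the leaf, so an in-place write into it fails:
try:
    hs[0][1:3] = img[:2]
    raised = False
except RuntimeError:
    raised = True   # "a view of a leaf Variable that requires grad ..."

# Workaround: perform the copy outside autograd's tracking.
with torch.no_grad():
    hs[0][1:3] = img[:2]
```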

> > > 2023-07-20 22:03:08,521 - mmrotate - INFO - Epoch [1][12750/12799] lr: 2.000e-04, eta: 10:36:31, time: 0.267, data_time: 0.002, memory: 8735, loss_rpn_cls: 0.0439, loss_rpn_bbox: 0.1248, loss_cls: 0.2421, acc: 91.7148,...

> We found the finetuned LLaVA model is underfitting when setting the number of epochs to 1-10; even the predictions on the training data are wrong! So **num_train_epochs 100** will make your loss...
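
For context, `num_train_epochs` is the standard Hugging Face `TrainingArguments` knob being raised here. An illustrative fragment of a LLaVA finetune launch (the script path and remaining flags are placeholders from the repo's usual setup, not copied from this thread):

```shell
deepspeed llava/train/train_mem.py \
    --num_train_epochs 100 \
    --output_dir ./checkpoints/llava-finetune  # plus the usual data/model flags
```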

@Mr-Teal Hi, have you solved this problem?

> > @Mr-Teal Hi, have you solved this problem? > > No. I just used their pretrained model instead. Thank you for your reply!

> > @Mr-Teal Hi, have you solved this problem? > > No. I just used their pretrained model instead. Hi, I guess it may be due to version differences (of torch...

> 1. Did you use an LLM that is pretrained on large-scale Chinese data (and ensure the tokenizer is suitable for Chinese tokenization)? > 2. Did...

> You can check the exact parameter names of these layers and set `p.requires_grad = True`. Make sure it is not set back to `False` afterwards. Or you can...
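
A hedged sketch of that advice on a toy model (not the model from the thread): list the exact parameter names via `named_parameters()`, unfreeze only the layers you want by name, and verify afterwards that nothing flipped the flags back:

```python
import torch.nn as nn

# Toy stand-in model; real code would inspect the actual model's names.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))

for name, p in model.named_parameters():
    # Unfreeze only the first layer; names here are "0.weight", "0.bias", ...
    p.requires_grad = name.startswith("0.")

# Verify after setup (and after any later freezing code) that the flags hold.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)
```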