Marc Sun
Hi @parasurama, thanks for reporting! I'll have a look ASAP.
Hi @parasurama, this happens because you changed the `max_position_embeddings` attribute. This changes the shapes of many weights, so the whole model would need to be retrained. For now, we don't support loading mismatched...
This happens because the default `vocab_size` of `LlamaConfig` is 32000, while llama v3 checkpoints have a `vocab_size` of 128256 and llama v2 checkpoints have a `vocab_size` of 32000. So by...
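The shape mismatch described above can be sketched with plain `torch.nn.Embedding` layers (a hedged illustration, not the actual `transformers` loading path; 32000 is `LlamaConfig`'s default `vocab_size` and 128256 is the llama v3 value):

```python
import torch.nn as nn

# A model built from the default config has a 32000-row embedding table,
# while a llama v3 checkpoint ships a 128256-row one.
emb_default = nn.Embedding(32000, 8)   # default LlamaConfig vocab_size
emb_v3 = nn.Embedding(128256, 8)       # llama v3 checkpoint vocab_size

try:
    emb_default.load_state_dict(emb_v3.state_dict())
    loaded = True
except RuntimeError:
    # size mismatch: [32000, 8] vs [128256, 8]
    loaded = False

print(loaded)  # False: the config's vocab_size must match the checkpoint
```

This is why passing the correct config (or using the model's own config from the hub) matters when loading checkpoints with a different vocabulary size.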
Hi @goelayu, this is expected, since `with torch.device('meta')` also puts the buffers on the `meta` device. However, non-persistent buffers are not saved in the `state_dict`. So, in the case...
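Both behaviors can be reproduced with a small sketch (a hypothetical module, not `transformers` code): under the `meta` device context, parameters and buffers alike are created on `meta`, and a buffer registered with `persistent=False` never appears in the `state_dict`:

```python
import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)
        # persistent buffer: included in state_dict
        self.register_buffer("scale", torch.ones(4))
        # non-persistent buffer: excluded from state_dict
        self.register_buffer("cache", torch.zeros(4), persistent=False)

with torch.device("meta"):
    m = Demo()

print(m.linear.weight.device)      # meta
print(m.cache.device)              # meta (buffers follow the device context too)
print("scale" in m.state_dict())   # True
print("cache" in m.state_dict())   # False
```

Since the non-persistent buffer is absent from the `state_dict`, it cannot be restored from a checkpoint and must be rematerialized by the module itself.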
Hi @jkterry1, thanks for this detailed report! For 3. and 4., let me know if you want to submit a PR to fix the logger message and split...
Closing this since the issue is solved!
I see that the `transformers-all-latest-gpu` docker image has not been updated for the last two days, since the [installation](https://github.com/huggingface/transformers/actions/runs/7924158967/job/21635225922) fails: the aqlm library requires Python 3.10 at least and...
> I was able to run aqlm on python 3.8 no problem otherwise. I can replace the statement with an if-else statement and lower the requirement if necessary.

Yes, that...
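The if-else guard mentioned above could look roughly like this (a hypothetical sketch; the actual minimum version and flag name in the aqlm/transformers integration may differ):

```python
import sys

# Assumed minimum Python version for the aqlm integration; gate on it
# instead of failing the whole install on older interpreters.
MIN_AQLM_PYTHON = (3, 10)

def aqlm_supported(version_info=sys.version_info):
    """Return True when the running Python is new enough for aqlm."""
    return tuple(version_info[:2]) >= MIN_AQLM_PYTHON

print(aqlm_supported((3, 8, 0)))   # False
print(aqlm_supported((3, 11, 0))) # True
```

Lowering `MIN_AQLM_PYTHON` (or checking a language feature instead of a version number) would relax the requirement without breaking newer environments.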
Perfect! I will wait for your PR to be merged and released, if that doesn't take too much time. Please keep me updated! Otherwise, I can...
I was able to build the image, but I don't have the permission to push it, cc @ydshieh: `#22 ERROR: failed to push huggingface/transformers-quantization-latest-gpu: push access denied, repository does...`