Arthur

Results: 795 comments by Arthur

Hey @Aisuko, could you provide a **minimal** reproducer? That would help us! Also note that the `generation parameters` issues can probably be safely ignored. The missing keys are, however...

@humanely do you have the exact same issue? If not, please open a separate issue. 1. The checkpoint you have did not save `['lm_head.weight', 'model.decoder.embed_tokens.weight']`. Now, if you use `tie_word_embeddings`...
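To illustrate why those two keys can be missing harmlessly: when embeddings are tied, the LM head and the input embedding share a single weight object, so a checkpoint only needs to store it once. A minimal sketch in plain Python (the class and variable names here are hypothetical, not the `transformers` internals):

```python
class Embedding:
    """Stand-in for an input-embedding module holding a weight matrix."""
    def __init__(self, weight):
        self.weight = weight

class LMHead:
    """Stand-in for the output projection (LM head)."""
    def __init__(self, weight):
        self.weight = weight

# One (vocab_size x hidden_dim) weight matrix, stored once.
shared = [[0.1, 0.2], [0.3, 0.4]]
embed_tokens = Embedding(shared)
lm_head = LMHead(shared)

# "Tied": both modules reference the identical object, so loading the
# embedding weight from a checkpoint also populates the LM head.
print(lm_head.weight is embed_tokens.weight)  # True

# Any update through one module is visible through the other.
embed_tokens.weight[0][0] = 9.9
print(lm_head.weight[0][0])  # 9.9
```

With tying enabled, a "missing key" warning for the head (or the embedding) at load time is usually benign, since the shared weight is restored through its counterpart.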

@vanguardapps it is usually very safe to use the Trainer, and bugs there are quite rare. I don't know which version of transformers you are using, but...

Awesome, that is already good isolation. cc @pacman100, @muellerzr and @SunMarc: when @vanguardapps shares the reproducer, please have a look! 🤗

Hello @LinWeizheDragon, could you update the README to include links to the pretrained checkpoints, the original codebase, etc.? I would recommend first starting with a [code on the...

This was already answered: basically, eager attention still attends to padding tokens (because the output of the softmax is never exactly zero), but with exact implementations / kernels, you have...
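A small numeric sketch of that point (my own illustration, not the library's code): in an eager implementation, padding positions are masked by adding a large negative value to their score rather than being skipped, and the softmax of a finite score is always strictly positive, so the padding token still receives a tiny attention weight.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Attention scores for three positions; the last one is padding,
# masked with a large (but finite) negative additive bias.
MASK_VALUE = -30.0  # hypothetical mask value for illustration
scores = [2.0, 1.0, MASK_VALUE]

weights = softmax(scores)

# The padding weight is tiny but strictly greater than zero, so the
# padding token's value vector still leaks into the output.
print(weights[2] > 0.0)   # True
print(weights[2] < 1e-10)  # True, e.g. on the order of exp(-32)
```

Fused/exact kernels (e.g. Flash Attention) can instead skip masked positions entirely, which is one source of small numerical differences between implementations.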