byi8220
If it's okay/not too complicated, could I give this a shot (as a new outside contributor)? Admittedly I'm very new to ML stuff, but at a very high level,...
Thanks, that's very helpful. I'll try to get a PR out soon modeled off that.
> So your checkpoint should already be compatible, no renaming and no reshaping

Hm, when I try to run a conversion, I get an error suggesting there needs to be...
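(For context, a minimal sketch of the kind of state-dict key rename a conversion might involve, assuming the error is only about key names rather than shapes; the prefixes and file names below are placeholders, not the actual checkpoint layout:)

```python
import torch

# Minimal sketch (placeholder names): rename mismatched state-dict keys
# before loading, assuming the mismatch is purely a naming difference.
state_dict = torch.load("original_checkpoint.bin", map_location="cpu")
renamed = {
    key.replace("old_prefix.", "new_prefix."): tensor
    for key, tensor in state_dict.items()
}
torch.save(renamed, "converted_checkpoint.bin")
```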
> Oh no 😅 I am not getting this one on the original checkpoints, so maybe it was updated at some point?

It might have, but I couldn't get it...
> maybe because the weights are tied it used the lm_head's weights and tied using them.

I'm confused by what you mean here. I thought that the problem was due...
Hm, just to make sure I understand:

1. The lack of warning about uninitialized weights is because when `tie_word_embeddings=True`, the input embedding layer's weight name is somehow ignored at some...
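(As a sanity check on that reading, here's a small snippet showing the tying behavior on a stock tied-embedding model; gpt2 is just a convenient example, and the exact attribute names differ per architecture:)

```python
from transformers import AutoModelForCausalLM

# Quick check of weight tying (gpt2 is just a convenient tied-embedding example).
model = AutoModelForCausalLM.from_pretrained("gpt2")
print(model.config.tie_word_embeddings)                      # True
# After tying, the lm_head and the input embedding share one tensor, which is
# why a missing key for one of them doesn't surface as an uninitialized-weight warning.
print(model.lm_head.weight is model.transformer.wte.weight)  # True
```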
1. Just wondering: if it's an error, should these be logged at warning level, or are these generally just warnings? https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py#L4011-L4030
2. "This avoids having a conversion script" - Wouldn't...
Sounds good, I removed the weight rename from my PR (although now my PR won't actually work until yours is checked in).
Just a quick plug: I think your PR fixes the checkpointing issue, but PR https://github.com/huggingface/transformers/pull/29705 is still open for config->config conversion.
It appears that way. I am new to this codebase (and looking for a good first contribution), so take my opinions with a grain of salt, but this...