byi8220
If it's okay/not too complicated, could I give this a shot (as a new outside contributor)? Admittedly I'm very new to ML stuff, but at a very high level,...
Thanks, that's very helpful. I'll try to get a PR out soon modeled off that.
> So your checkpoint should already be compatible, no renaming and no reshaping

Hm, when I try to run a conversion, I get an error suggesting there needs to be...
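(For context, a minimal sketch of the kind of state-dict key rename a conversion might involve, assuming the error is only about key names rather than shapes; the prefixes and file names below are placeholders, not the actual checkpoint layout:)

```python
import torch

# Minimal sketch (placeholder names): rename mismatched state-dict keys
# before loading, assuming the mismatch is purely a naming difference.
state_dict = torch.load("original_checkpoint.bin", map_location="cpu")
renamed = {
    key.replace("old_prefix.", "new_prefix."): tensor
    for key, tensor in state_dict.items()
}
torch.save(renamed, "converted_checkpoint.bin")
```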
> Oh no 😅 I am not getting this one on the original checkpoints, so maybe it was updated at some point?

It might have, but I couldn't get it...
> maybe because the weights are tied it used the lm_head's weights and tied using them.

I'm confused by what you mean here. I thought that the problem was due...
Hm, just to make sure I understand:

1. The lack of warning about uninitialized weights is because when `tie_word_embeddings=True`, the input embedding layer's weight name is somehow ignored at some...
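(As a sanity check on that reading, here's a small snippet showing the tying behavior on a stock tied-embedding model; gpt2 is just a convenient example, and the exact attribute names differ per architecture:)

```python
from transformers import AutoModelForCausalLM

# Quick check of weight tying (gpt2 is just a convenient tied-embedding example).
model = AutoModelForCausalLM.from_pretrained("gpt2")
print(model.config.tie_word_embeddings)                      # True
# After tying, the lm_head and the input embedding share one tensor, which is
# why a missing key for one of them doesn't surface as an uninitialized-weight warning.
print(model.lm_head.weight is model.transformer.wte.weight)  # True
```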
1. Just wondering: if it's an error, should these be logged at warning level, or are these generally just warnings? https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py#L4011-L4030
2. "This avoids having a conversion script" - Wouldn't...
Sounds good, I removed the weight rename from my PR (although now my PR won't actually work until yours is checked in).
Just a quick plug: I think your PR fixes the checkpointing issue, but PR https://github.com/huggingface/transformers/pull/29705 is still open for config->config conversion.
It appears that way. I am new to this codebase (and looking for a good first contribution), so take my opinions with a grain of salt, but this...