A9isha

Results: 12 comments of A9isha

> Change LGTMs but without a test I'm skeptical. I don't think this needs to be tested as exhaustively but how could we test it somewhat? @rwitten I have tested...

> @A9isha Hello, do you by any chance have a script that does the opposite, converting HF to Orbax? We have the script [llama_or_mistral_ckpt.py](https://github.com/google/maxtext/blob/main/MaxText/llama_or_mistral_ckpt.py) to convert the original PyTorch Llama2...

Yes, support for HF datasets in MaxText is on the way, @aireenmei.

Hi @peregilk , there isn't one yet, but we will add one very soon! Thanks for your patience.

Hi @peregilk , Sorry for the delay. We have a PR in the works: https://github.com/google/maxtext/pull/581 If you are up for a bit of an experiment, do you want to give...

@peregilk Right, I think this is caused by a recent breaking change in the way we generate MaxText's Orbax checkpoints: https://github.com/google/maxtext/pull/568 Could you please regenerate your MaxText checkpoint with...

`expansion_factor_real_data` was added in this PR: https://github.com/google/maxtext/pull/187 But it has a default value in `base.yml`: https://github.com/google/maxtext/blob/main/MaxText/configs/base.yml#L167 Could you check if your current (re)cloned repo has this PR's updates for...
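For illustration, this is roughly the shape such an entry takes in `base.yml` (the value and comment below are hypothetical; the actual default and its semantics are in the file linked above and in PR #187):

```yaml
# Hypothetical sketch of a configs/base.yml entry, not the actual file contents.
# If this key is missing from your checkout, your clone predates the PR that
# introduced it, and overriding it on the command line will fail.
expansion_factor_real_data: -1  # placeholder value; check base.yml for the real default
```

If the key is absent locally, re-pulling the repository (or rebasing onto `main`) should pick up the default and resolve the error.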

@peregilk Apologies for the delayed response, I was OOO for some time. > My main sanity check here is if I am able to do a warm restart of the Mistral-7b model...

> Hi @A9isha, does Maxtext support the other way round now? That's converting HF's Llama or Mistral weights to MaxText checkpoints. Thanks We have the script [llama_or_mistral_ckpt.py](https://github.com/google/maxtext/blob/main/MaxText/llama_or_mistral_ckpt.py) to convert the...

I see. Unfortunately, no, there isn't a conversion script at the moment. It should be a modification of [llama_or_mistral_ckpt](https://github.com/google/maxtext/blob/main/MaxText/llama_or_mistral_ckpt.py). If you are interested, please feel free to send across a...