Stas Bekman

664 comments by Stas Bekman

Please see: https://github.com/bigscience-workshop/Megatron-DeepSpeed/pull/308

The original Meg-DS checkpoint is here: https://huggingface.co/bigscience/bloom-optimizer-states

> is there an equivalent checkpoint for inference that is in the Meg-DS format

https://huggingface.co/bigscience/bloom-optimizer-states is the full Meg-DS checkpoint. edit: hmm, I think you're correct it's incomplete. I will...

@asafkar, so it looks like I created the new repo for nothing, the https://huggingface.co/bigscience/bloom-optimizer-states was already the full checkpoint. Why did you say it only had optim state files and...

- DS-Inference = TP
- DS-ZeRO = TP-like
- Accelerate = PP
- Megatron-Deepspeed = TP+PP

(plus DP in all)

Could you be a bit more specific, Iz? To run Meg-DS training? I have the more or less ready AWS image I created for the CI - but I'm definitely...

The only problem with this pre-made image is that our components are in flux - e.g. we get fixes in the deepspeed repo, Meg-DS gets changed too and so are...

@philschmid, could you please help me to make this image we made for Megatron-Deepspeed CI somehow available to the wider group? Basically anybody at BigScience. I'm not sure if we...

That would be fantastic! Thank you, Philipp! I think a few small tweaks will be needed to the last one I created. As the latter was done for CI and...

@ibeltagy, is this going to be used on EC2 on a user's personal account, on some HF account, or somewhere else?