Stas Bekman

664 comments by Stas Bekman

> Are we specifically looking for an implementation of consistency checks to be applied directly to Megatron-DeepSpeed?
Yes. I think this is mainly the code starting from: https://github.com/EleutherAI/gpt-neox/blob/49e60fe7ad14f6991a7fa678d3a0c330d09b9ff4/megatron/neox_arguments/arguments.py#L641 Megatron-LM and...

Yes, of course, as it doesn't look like @jtboing is working on it. I could be wrong, of course. As I mentioned earlier, it was Stella's recommendation, so I...

> I was thinking if we can add a function to validate args before this line:
>
> https://github.com/bigscience-workshop/Megatron-DeepSpeed/blob/04c461ed786ed0e257690e136d3481957b0ef582/megatron/arguments.py#L319
>
> 1. There’s a function by name of `_check_arg_is_not_none` which...

@thomasw21, at your convenience - no rush at all - we can merge this when it meets your needs.

Absolutely no rush, you're the only one who asked for it. So it's totally up to you Thomas.

Thank you for offering to work on this, @jtboing. We, the BS group, haven't added anything yet to this functionality, so it's totally up to you how you do it...

We have already started sorting it out here: https://github.com/bigscience-workshop/Megatron-DeepSpeed/pull/204 (as a side effect of another need).

Oh wow, you developed a whole package around this. Impressive work, @mayank31398. Let's just discuss the user-facing APIs before implementing, to minimize wasted time. Originally, these were just scripts to...

> 1. I found it easier to deploy using DeepSpeed-MII and leverage that for CLI. But I wasn't really sure of the overhead it causes, so still using the barebones...

branch, no, as branches are hard to keep in sync, and especially since we will move it out of Meg-DS anyway once you're done. I propose let's create...