Saaketh Narayan
@wizyoung That does sound like unintended behavior. If you have a repro, can you open a separate issue please?
@naston is this still an issue for you? And @ethantang-db @XiaohanZhangCMU, any luck repro-ing?
@schopra8 based on the warning you're getting, I suspect you may not be setting up distributed training correctly. Composer's -n argument (see [here](https://github.com/mosaicml/composer/blob/7fa03545cc2025f256d914abc111a068d239d632/composer/cli/launcher.py#L44)) sets the number of...
Hey @charliedream1, have you tried the parallel dataset conversion approach as detailed in our docs below? https://docs.mosaicml.com/projects/streaming/en/stable/preparing_datasets/parallel_dataset_conversion.html Please let us know if that works for you.
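For context, the pattern in those docs is roughly: each worker converts a disjoint slice of the dataset into its own output directory, then the per-directory indexes get merged into one. Here's a minimal stdlib sketch of that shape — JSONL shards and a toy manifest stand in for the real `streaming.MDSWriter` output and merged `index.json`, and all names here are illustrative, not the library's API:

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor


def convert_partition(out_root, worker_id, samples):
    """Convert one slice of the dataset into its own shard directory.

    Stands in for a worker that would open a writer (e.g. MDSWriter) on
    out_dir and stream its samples into shards there.
    """
    out_dir = os.path.join(out_root, str(worker_id))
    os.makedirs(out_dir, exist_ok=True)
    shard_path = os.path.join(out_dir, 'shard.jsonl')
    with open(shard_path, 'w') as f:
        for sample in samples:
            f.write(json.dumps(sample) + '\n')
    return shard_path


def parallel_convert(samples, out_root, num_workers=4):
    # Give each worker a disjoint slice so no two writers share a directory,
    # then collect the per-worker outputs into one top-level manifest (the
    # analogue of merging the per-directory index files at the end).
    chunks = [samples[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        shard_paths = list(pool.map(
            lambda args: convert_partition(*args),
            [(out_root, i, chunk) for i, chunk in enumerate(chunks)]))
    manifest_path = os.path.join(out_root, 'index.json')
    with open(manifest_path, 'w') as f:
        json.dump({'shards': shard_paths}, f)
    return manifest_path
```

(A real conversion job would use processes rather than threads for CPU-bound work; threads here just keep the sketch self-contained.)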
@huxuan @abhijithneilabraham This isn't currently on our roadmap, but if you have ideas for how we could improve MDSWriter to make this functionality possible, that would...
Hey @huxuan, could you please clarify which dependencies you need and which you don't? Right now we have the large cloud providers as required dependencies (AWS, GCP, Azure, OCI) and other...
Hey, we'd recommend that you use our llm-foundry repo, which uses composer extensively and also supports using HF models. Check it out [here](https://github.com/mosaicml/llm-foundry)!
Hey, so there are three cases you'll hit when using llm-foundry: First, using an MPT model. This has configurable attention and supports flash attention. Second, using a Llama model. There...
@mvpatel2000 I keep getting errors on 4-GPU tests because it's detecting the `FutureWarning` and counting that as an error. I've tried filtering out these warnings, but that doesn't seem...
@mvpatel2000 I already added this `@pytest.mark.filterwarnings('ignore:.*(TP) is experimental.*:FutureWarning')` to all relevant tests...
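One possible gotcha here: the message part of a warnings filter is a regex, so the unescaped parentheses in `(TP)` form a capture group matching just `TP`, and the pattern then fails against a warning message that contains a literal `(TP)`. A small stdlib sketch (the warning text below is a hypothetical stand-in for whatever the actual `FutureWarning` says):

```python
import re
import warnings

# Hypothetical warning text containing literal parentheses.
msg = 'Tensor Parallelism (TP) is experimental'

# Unescaped parens are a regex group, so this pattern requires the literal
# substring 'TP is experimental' -- which the ')' in the message breaks.
assert re.match(r'.*(TP) is experimental.*', msg) is None

# Escaping the parens matches the literal '(TP)'.
assert re.match(r'.*\(TP\) is experimental.*', msg) is not None

# The warnings machinery applies the same message-regex matching, so the
# escaped pattern actually suppresses the warning:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    warnings.filterwarnings('ignore', message=r'.*\(TP\) is experimental.*',
                            category=FutureWarning)
    warnings.warn(msg, FutureWarning)
assert caught == []
```

If the real warning message does contain literal parentheses, escaping them in the `filterwarnings` marker (`ignore:.*\(TP\) is experimental.*:FutureWarning`) may be worth a try.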