Anas Awadalla

Results 42 comments of Anas Awadalla

Some other todos I want to add to this:

- [X] Deepspeed updates from that branch
- [X] Cast logits to fp32 for pure bf16/fp16 (to avoid loss spikes)
- ...
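The fp32 logits cast can be sketched with NumPy as a toy stand-in for the actual bf16/fp16 training code (the `cross_entropy` helper and the values are illustrative): the forward pass produces half-precision logits, and we cast them to fp32 before computing the loss so the softmax/log are evaluated in full precision.

```python
import numpy as np

def cross_entropy(logits, target):
    # Numerically stable log-softmax, evaluated in the logits' own dtype.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]

# Hypothetical half-precision logits, as produced by a bf16/fp16 forward pass.
logits_fp16 = np.array([2.0, -1.0, 0.5], dtype=np.float16)

# The fix from the checklist above: cast to fp32 *before* the loss
# computation, so exp/log don't run at half precision.
loss = cross_entropy(logits_fp16.astype(np.float32), target=0)
```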

You said you integrated “deepspeed-related code into the current main branch of Openflamingo”. Have you tried using this branch as is? The integration is basically complete but we are doing...

Here are the loss plots for some of our training runs. We also find that the loss on MMC4 decreases more slowly than the loss on LAION. We anticipate that...

I think generally you can train on any image-text dataset, as long as it is in webdataset format, as produced by the [Image2Dataset](https://github.com/rom1504/img2dataset) codebase.
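For reference, the shard layout that img2dataset produces can be sketched with only the standard library (`write_shard`, the key names, and the fake image bytes are all illustrative; img2dataset writes these shards for you): each sample is a set of files inside a tar that share a basename and differ only by extension.

```python
import io
import tarfile

def write_shard(path, samples):
    """Write a webdataset-style tar shard.

    samples: iterable of (key, image_bytes, caption) tuples; each sample
    becomes a <key>.jpg / <key>.txt pair sharing the same basename.
    """
    with tarfile.open(path, "w") as tar:
        for key, image_bytes, caption in samples:
            for ext, payload in ((".jpg", image_bytes),
                                 (".txt", caption.encode("utf-8"))):
                info = tarfile.TarInfo(name=key + ext)
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))

write_shard("/tmp/shard-000000.tar",
            [("000000", b"\xff\xd8fake-jpeg-bytes", "a photo of a cat")])
```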

No, you're right that this isn't a perfect way of doing it. Ideally you should create multiple samples from a document if it is too long. Do you plan on adding...
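The splitting described above could look like this hypothetical sketch (not the actual MMC4 preprocessing; `split_document` and `max_len` are illustrative names): instead of truncating a long document, chunk it into several training samples.

```python
def split_document(sentences, max_len):
    """Split an interleaved document into multiple training samples
    of at most max_len sentences each, instead of truncating it."""
    return [sentences[i:i + max_len]
            for i in range(0, len(sentences), max_len)]

# A 5-sentence document becomes three samples rather than one truncated one.
chunks = split_document(["s1", "s2", "s3", "s4", "s5"], max_len=2)
# → [["s1", "s2"], ["s3", "s4"], ["s5"]]
```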

Hello! What currently happens is that we are "read[ing] images such that there are up to max_num_images valid_images". I think a better way to go about this...
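The current behavior described here — reading until up to `max_num_images` valid images are collected — can be sketched as follows (`select_images` and `is_valid` are illustrative names, not the repo's code); invalid images are skipped and never count toward the cap.

```python
def select_images(images, max_num_images, is_valid):
    """Scan images in order, keeping valid ones until we have
    max_num_images of them; invalid entries don't count toward the cap."""
    selected = []
    for img in images:
        if is_valid(img):
            selected.append(img)
        if len(selected) == max_num_images:
            break
    return selected

# Toy stand-in: negative numbers play the role of invalid images.
picked = select_images([0, -1, 2, 3, -4, 5], max_num_images=3,
                       is_valid=lambda x: x >= 0)
```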

Thanks for bringing this up! I will take a closer look later today. I do want to point out that we haven't gotten good performance with pure fp16 training. It...

Got it. There is this [version](https://huggingface.co/anas-awadalla/mpt-1b-redpajama-200b-hf-style) of MPT that I use for testing, if you want to give FSDP a shot before the new refactor is merged.

As implied by the title, we should allow users to train on only LAION-2B. This should be very straightforward: it involves making the mmc4 arguments optional in train.py and refactoring...
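An illustrative argparse sketch of making the mmc4 arguments optional (the flag names are assumptions, not the actual train.py interface): the mmc4 shard path defaults to `None`, and the dataset list is built only from what was supplied.

```python
import argparse

# Hypothetical subset of train.py's CLI: LAION shards stay required,
# while the mmc4 shards become optional.
parser = argparse.ArgumentParser()
parser.add_argument("--laion_shards", type=str, required=True)
parser.add_argument("--mmc4_shards", type=str, default=None)  # now optional

# Simulate a LAION-only invocation.
args = parser.parse_args(["--laion_shards", "laion-{000..999}.tar"])

datasets = ["laion"]
if args.mmc4_shards is not None:
    datasets.append("mmc4")
```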