Anas Awadalla

Results 42 comments of Anas Awadalla

Some other todos I want to add to this:

- [X] Deepspeed updates from that branch
- [X] Cast logits to fp32 for pure bf16/fp16 (to avoid loss spikes)
- ...
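The fp32 logits cast can be sketched with NumPy as a toy stand-in for the actual bf16/fp16 training code (the `cross_entropy` helper and the values are illustrative): the forward pass produces half-precision logits, and we cast them to fp32 before computing the loss so the softmax/log are evaluated in full precision.

```python
import numpy as np

def cross_entropy(logits, target):
    # Numerically stable log-softmax, evaluated in the logits' own dtype.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]

# Hypothetical half-precision logits, as produced by a bf16/fp16 forward pass.
logits_fp16 = np.array([2.0, -1.0, 0.5], dtype=np.float16)

# The fix from the checklist above: cast to fp32 *before* the loss
# computation, so exp/log don't run at half precision.
loss = cross_entropy(logits_fp16.astype(np.float32), target=0)
```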

You said you integrated “deepspeed-related code into the current main branch of Openflamingo”. Have you tried using this branch as is? The integration is basically complete but we are doing...

Here are the loss plots for some of our training runs. We also find that the loss on MMC4 decreases more slowly than the loss on LAION. We anticipate that...

I think generally you can train on any image-text dataset, as long as it is in webdataset format, as produced by the [Image2Dataset](https://github.com/rom1504/img2dataset) codebase.
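For reference, the shard layout that img2dataset produces can be sketched with only the standard library (`write_shard`, the key names, and the fake image bytes are all illustrative; img2dataset writes these shards for you): each sample is a set of files inside a tar that share a basename and differ only by extension.

```python
import io
import tarfile

def write_shard(path, samples):
    """Write a webdataset-style tar shard.

    samples: iterable of (key, image_bytes, caption) tuples; each sample
    becomes a <key>.jpg / <key>.txt pair sharing the same basename.
    """
    with tarfile.open(path, "w") as tar:
        for key, image_bytes, caption in samples:
            for ext, payload in ((".jpg", image_bytes),
                                 (".txt", caption.encode("utf-8"))):
                info = tarfile.TarInfo(name=key + ext)
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))

write_shard("/tmp/shard-000000.tar",
            [("000000", b"\xff\xd8fake-jpeg-bytes", "a photo of a cat")])
```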

No, you're right that this isn't a perfect way of doing it. Ideally you should create multiple samples from a document if it is too long. Do you plan on adding...
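The splitting described above could look like this hypothetical sketch (not the actual MMC4 preprocessing; `split_document` and `max_len` are illustrative names): instead of truncating a long document, chunk it into several training samples.

```python
def split_document(sentences, max_len):
    """Split an interleaved document into multiple training samples
    of at most max_len sentences each, instead of truncating it."""
    return [sentences[i:i + max_len]
            for i in range(0, len(sentences), max_len)]

# A 5-sentence document becomes three samples rather than one truncated one.
chunks = split_document(["s1", "s2", "s3", "s4", "s5"], max_len=2)
# → [["s1", "s2"], ["s3", "s4"], ["s5"]]
```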

Hello! What currently happens is that we are "read[ing] images such that there are up to max_num_images valid_images". I think a better way to go about this...
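The current behavior described here — reading until up to `max_num_images` valid images are collected — can be sketched as follows (`select_images` and `is_valid` are illustrative names, not the repo's code); invalid images are skipped and never count toward the cap.

```python
def select_images(images, max_num_images, is_valid):
    """Scan images in order, keeping valid ones until we have
    max_num_images of them; invalid entries don't count toward the cap."""
    selected = []
    for img in images:
        if is_valid(img):
            selected.append(img)
        if len(selected) == max_num_images:
            break
    return selected

# Toy stand-in: negative numbers play the role of invalid images.
picked = select_images([0, -1, 2, 3, -4, 5], max_num_images=3,
                       is_valid=lambda x: x >= 0)
```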

Thanks for bringing this up! I will take a closer look later today. I do want to point out that we haven't gotten good performance with pure fp16 training. It...

Got it. There is this [version](https://huggingface.co/anas-awadalla/mpt-1b-redpajama-200b-hf-style) of MPT that I use for testing, if you want to give FSDP a shot before the new refactor is merged.

As implied by the title, we should allow users to train on only LAION-2B. This should be very straightforward: it involves making the mmc4 arguments optional in train.py and refactoring...
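An illustrative argparse sketch of making the mmc4 arguments optional (the flag names are assumptions, not the actual train.py interface): the mmc4 shard path defaults to `None`, and the dataset list is built only from what was supplied.

```python
import argparse

# Hypothetical subset of train.py's CLI: LAION shards stay required,
# while the mmc4 shards become optional.
parser = argparse.ArgumentParser()
parser.add_argument("--laion_shards", type=str, required=True)
parser.add_argument("--mmc4_shards", type=str, default=None)  # now optional

# Simulate a LAION-only invocation.
args = parser.parse_args(["--laion_shards", "laion-{000..999}.tar"])

datasets = ["laion"]
if args.mmc4_shards is not None:
    datasets.append("mmc4")
```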