Siddharth Choudhary

12 comments of Siddharth Choudhary

Hi @BIT-TYJ, All the vertices corresponding to the first robot will be prefixed with 'a' using gtsam.Symbol, e.g. gtsam.Symbol('a', 1), gtsam.Symbol('a', 2), etc. Similarly, the second robot will be prefixed using...
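
For reference, a minimal sketch of how the per-robot prefixing might look with the gtsam Python bindings (the indices below are illustrative):

```python
import gtsam

# Keys for robot 'a' and robot 'b' differ only in the symbol's character
# prefix; the pose index restarts at 1 for each robot.
a1 = gtsam.Symbol('a', 1)   # first pose of robot a
a2 = gtsam.Symbol('a', 2)   # second pose of robot a
b1 = gtsam.Symbol('b', 1)   # first pose of robot b

# Symbol.key() gives the integer key that factors and Values expect.
print(a1.key(), a2.key(), b1.key())
```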

@BIT-TYJ, you will have to create a GTSAM graph and write it to a g2o file. You can check out https://gtsam.org/tutorials/intro.html on how to create a GTSAM graph. Look at this example on...
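
A hedged sketch of that workflow, loosely following the Pose2 examples in the GTSAM docs (the factor types, noise values, and output filename here are illustrative):

```python
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()
initial = gtsam.Values()

# Per-robot keys, using the prefixing scheme above.
a1 = gtsam.Symbol('a', 1).key()
a2 = gtsam.Symbol('a', 2).key()

prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.05]))
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.2, 0.2, 0.1]))

# Anchor the first pose and add one odometry constraint.
graph.add(gtsam.PriorFactorPose2(a1, gtsam.Pose2(0.0, 0.0, 0.0), prior_noise))
graph.add(gtsam.BetweenFactorPose2(a1, a2, gtsam.Pose2(1.0, 0.0, 0.0), odom_noise))

initial.insert(a1, gtsam.Pose2(0.0, 0.0, 0.0))
initial.insert(a2, gtsam.Pose2(1.0, 0.0, 0.0))

# Serialize the graph and initial estimate to a g2o file.
gtsam.writeG2o(graph, initial, "robot_a.g2o")
```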

Hi @anas-awadalla, Thanks for the great repo. I'm trying to reproduce OpenFlamingo results using mpt-1b-redpajama-200b with a single 40GB A100 node. Even though the results on VQA tasks are similar...

@anas-awadalla I trained for approximately 10M samples. Zero-shot COCO CIDEr is 36.55 for me vs 75.9 using the released model. I think one of the issues is that the loss...

Thanks @anas-awadalla. This is super helpful. I'll train the models longer and check the performance after 10M mmc4 + 20M laion.

@anas-awadalla I get similar values as above after going through 150M samples. Thanks for the help! Next I'm trying to train a larger model with MPT-7B (anas-awadalla/mpt-7b). Wondering how much...

@anas-awadalla Using the FSDP args mentioned above with MPT-7B, I get this error:
```
File "/root/.cache/huggingface/modules/transformers_modules/anas-awadalla/mpt-7b/b772e556c8e8a17d087db6935e7cd019e5eefb0f/modeling_mpt.py", line 184, in forward
    (attn_bias, attention_mask) = self._attn_bias(device=x.device, dtype=x.dtype, attention_mask=attention_mask, prefix_mask=prefix_mask, sequence_id=sequence_id)
File "/usr/local/lib/python3.8/dist-packages/torch/utils/_contextlib.py", line...
```

Thanks @anas-awadalla. Similar to the LAION forward pass, I added these lines, which made it work:
```
input_ids = input_ids.to(device_id, dtype=cast_dtype, non_blocking=True)
attention_mask = attention_mask.to(device_id, dtype=cast_dtype, non_blocking=True)
```
However, this issue...

> Hmm no we don't run into these. Just to confirm you are using torch 2.0.1?

Yes, my torch version is 2.0.1+cu117. Do you have a docker container as well...

@anas-awadalla I went ahead and started training MPT-7B on 80GB nodes. However, I see VQA numbers going down as the number of samples seen increases. Did you see something similar? Here...