Siddharth Choudhary

12 comments of Siddharth Choudhary

Hi @BIT-TYJ, All the vertices corresponding to the first robot will be prefixed with 'a' using gtsam.Symbol, e.g. gtsam.Symbol('a', 1), gtsam.Symbol('a', 2), etc. Similarly, the second robot will be prefixed using...
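
For reference, a minimal sketch of how the per-robot prefixing might look with the gtsam Python bindings (the indices below are illustrative):

```python
import gtsam

# Keys for robot 'a' and robot 'b' differ only in the symbol's character
# prefix; the pose index restarts at 1 for each robot.
a1 = gtsam.Symbol('a', 1)   # first pose of robot a
a2 = gtsam.Symbol('a', 2)   # second pose of robot a
b1 = gtsam.Symbol('b', 1)   # first pose of robot b

# Symbol.key() gives the integer key that factors and Values expect.
print(a1.key(), a2.key(), b1.key())
```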

@BIT-TYJ, you will have to create a GTSAM graph and write it to a g2o file. You can check out https://gtsam.org/tutorials/intro.html on how to create a GTSAM graph. Look at this example on...
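
A hedged sketch of that workflow, loosely following the Pose2 examples in the GTSAM docs (the factor types, noise values, and output filename here are illustrative):

```python
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()
initial = gtsam.Values()

# Per-robot keys, using the prefixing scheme above.
a1 = gtsam.Symbol('a', 1).key()
a2 = gtsam.Symbol('a', 2).key()

prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.05]))
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.2, 0.2, 0.1]))

# Anchor the first pose and add one odometry constraint.
graph.add(gtsam.PriorFactorPose2(a1, gtsam.Pose2(0.0, 0.0, 0.0), prior_noise))
graph.add(gtsam.BetweenFactorPose2(a1, a2, gtsam.Pose2(1.0, 0.0, 0.0), odom_noise))

initial.insert(a1, gtsam.Pose2(0.0, 0.0, 0.0))
initial.insert(a2, gtsam.Pose2(1.0, 0.0, 0.0))

# Serialize the graph and initial estimate to a g2o file.
gtsam.writeG2o(graph, initial, "robot_a.g2o")
```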

Hi @anas-awadalla, Thanks for the great repo. I'm trying to reproduce OpenFlamingo results using mpt-1b-redpajama-200b with a single 40GB A100 node. Even though the results on VQA tasks are similar...

@anas-awadalla I trained for approximately 10M samples. Zero-shot COCO CIDEr is 36.55 for me vs 75.9 using the released model. I think one of the issues is that the loss...

Thanks @anas-awadalla. This is super helpful. I'll train the models longer and check the performance after 10M mmc4 + 20M laion.

@anas-awadalla I get similar values as above after going through 150M samples. Thanks for the help! Next I'm trying to train a larger model with MPT-7B (anas-awadalla/mpt-7b). Wondering how much...

@anas-awadalla Using the FSDP args mentioned above with MPT-7B, I get this error:
```
File "/root/.cache/huggingface/modules/transformers_modules/anas-awadalla/mpt-7b/b772e556c8e8a17d087db6935e7cd019e5eefb0f/modeling_mpt.py", line 184, in forward
    (attn_bias, attention_mask) = self._attn_bias(device=x.device, dtype=x.dtype, attention_mask=attention_mask, prefix_mask=prefix_mask, sequence_id=sequence_id)
File "/usr/local/lib/python3.8/dist-packages/torch/utils/_contextlib.py", line...
```

Thanks @anas-awadalla. Similar to the LAION forward pass, I added these lines, which made it work:
```
input_ids = input_ids.to(device_id, dtype=cast_dtype, non_blocking=True)
attention_mask = attention_mask.to(device_id, dtype=cast_dtype, non_blocking=True)
```
However, this issue...

> Hmm no we don't run into these. Just to confirm you are using torch 2.0.1?

Yes, my torch version is 2.0.1+cu117. Do you have a docker container as well...

@anas-awadalla I went ahead and started training MPT-7B on 80GB nodes. However, I see VQA numbers going down as the number of samples seen increases. Did you see something similar? Here...