Sanjay Subramanian issues

Results: 10 issues by Sanjay Subramanian

COCO split for pre-training

Hi @dandelin, thanks for this great repo and work! Could you please say which COCO split was used for pre-training (2014, 2017, Karpathy, or something else)? Thanks!

[Usage] Different results on demo site and local

Issue: I'm getting different results for the example below on the demo site (https://llava.hliu.cc/) and when I run locally. The input user text is "Person 0's...

Pre-trained model

Could you please release a pre-trained model for the final DuoRAT system (which gets 69.9% ± 0.8 accuracy on the dev set of Spider)?

Using float 16 in training?

I've noticed that in training some tensors are of the float 16 datatype, whereas in validation I only see float 32. Is that in line with what you see? Is...

Is it possible to run BERT Large with Multiple-GPUs?

Is it possible to run the BERT Large version with multiple GPUs? For example, rather than have a single 32 GB GPU, I would like to use two 16 GB...

Loss / metric curves

Could the authors or someone else who has successfully reproduced the training of MotionGPT share your train loss and R_TOP_3 (or R_TOP_1/R_TOP_2) curves so I can see if my training...

HSDP question

Thanks for releasing this great work! I was able to get the training to run with sequence length 1024 on the Llama 8B model on 24GB GPUs. I would like...

Segment lengths in E2E dataset

The description of the E2E challenge (https://waymo.com/open/challenges/2025/e2e-driving/) says that each segment is 20 seconds. In the training data files, several videos are longer than 150 frames. Do each of these...

E2E baseline

Would you be able to provide any information on the E2E baseline?

Taking a long time / doesn't answer

What version of Codex is running? 0.58.0
What subscription do you have? API
Which model were you using? gpt-5-codex medium and gpt-5.1-codex medium
What platform is...

Labels: bug, agent