Sanjay Subramanian
Hi @dandelin, thanks for this great repo and work! Could you please say which COCO split was used for pre-training (2014, 2017, Karpathy, or something else)? Thanks!
### Describe the issue Issue: I'm getting different results for the example below on the demo site (https://llava.hliu.cc/) and when I run locally. The input user text is "Person 0's...
Could you please release a pre-trained model for the final DuoRAT system (which gets 69.9% ± 0.8 accuracy on the dev set of Spider)?
I've noticed that in training some tensors have the float16 datatype, whereas in validation I only see float32. Is that in line with what you see? Is...
Is it possible to run the BERT large version with multiple GPUs? For example, rather than using a single 32 GB GPU, I would like to use two 16 GB...
Could the authors or someone else who has successfully reproduced the training of MotionGPT share their training loss and R_TOP_3 (or R_TOP_1/R_TOP_2) curves so I can see if my training...
Thanks for releasing this great work! I was able to get the training to run with sequence length 1024 on the Llama 8B model on 24GB GPUs. I would like...
The description of the E2E challenge (https://waymo.com/open/challenges/2025/e2e-driving/) says that each segment is 20 seconds. In the training data files, several videos are longer than 150 frames. Does each of these...
Would you be able to provide any information on the E2E baseline?
### What version of Codex is running? 0.58.0 ### What subscription do you have? API ### Which model were you using? gpt-5-codex medium and gpt-5.1-codex medium ### What platform is...