EnYu

6 comments by EnYu

Thanks a lot. I also have another problem. I get a CUDA out-of-memory error when I run inference on ytvos, even though I can train the model on ytvos normally. Is there...

Hi, Merlin refers to the pretrained weights, and Merlin-Chat to the weights after SFT. The entire training process was conducted on 64 NVIDIA A800 GPUs, with approximately 12 hours required for...

We apologize: due to time constraints, we have not yet organized the code to support multi-round, multi-frame video demos. However, at this stage, we support single-round dialogues, and you can...

Hi, thank you for your attention to our work. We will open-source the Merlin-Chat SFT data after ECCV. Stay tuned for further updates.

Thanks for your attention. Yeah, you need to download all of the vicuna-v15 files. The CUDA version is 11.8.

Thanks for your attention. There is no original clip-vit-large-patch14-448 on Hugging Face. We employed positional embedding interpolation to adapt the original 224x clip-vit to support an input resolution...
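
For readers unfamiliar with the idea, here is a minimal sketch of positional embedding interpolation for a ViT. It is not the Merlin code: the function name `interpolate_pos_embed`, the choice of bilinear interpolation with aligned corners, and the plain-list data layout are all assumptions made for illustration. With a patch size of 14, a 224 input gives a 16x16 patch grid (257 positions including the class token), and a 448 input gives a 32x32 grid (1025 positions).

```python
def interpolate_pos_embed(pos_embed, old_grid, new_grid):
    """Resize ViT position embeddings from an old_grid x old_grid patch
    layout to new_grid x new_grid.

    pos_embed is a list of (1 + old_grid**2) vectors; row 0 is the
    class-token embedding, which is copied unchanged.
    Assumption: bilinear interpolation with aligned corners. The
    original post does not say which interpolation mode was used.
    """
    cls_embed, patch_embed = pos_embed[0], pos_embed[1:]
    assert len(patch_embed) == old_grid * old_grid
    dim = len(cls_embed)
    # Map a new grid index to a fractional coordinate in the old grid.
    scale = (old_grid - 1) / (new_grid - 1) if new_grid > 1 else 0.0

    out = [list(cls_embed)]
    for i in range(new_grid):
        y = i * scale
        y0 = min(int(y), old_grid - 1)
        y1 = min(y0 + 1, old_grid - 1)
        wy = y - y0
        for j in range(new_grid):
            x = j * scale
            x0 = min(int(x), old_grid - 1)
            x1 = min(x0 + 1, old_grid - 1)
            wx = x - x0
            # Four neighbouring embeddings in the old grid.
            tl = patch_embed[y0 * old_grid + x0]
            tr = patch_embed[y0 * old_grid + x1]
            bl = patch_embed[y1 * old_grid + x0]
            br = patch_embed[y1 * old_grid + x1]
            out.append([
                (1 - wy) * ((1 - wx) * tl[d] + wx * tr[d])
                + wy * ((1 - wx) * bl[d] + wx * br[d])
                for d in range(dim)
            ])
    return out


# Toy usage: 16x16 grid (224 / 14) resized to 32x32 (448 / 14).
# A tiny embedding width of 4 keeps the example fast; real weights are wider.
toy = [[float(k)] * 4 for k in range(1 + 16 * 16)]
resized = interpolate_pos_embed(toy, old_grid=16, new_grid=32)
print(len(toy), len(resized))
```

In practice this is usually done on tensors with a library resize routine rather than nested loops; the loop form is used here only to make each step explicit.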