KeyaoZhao
> I think I have the same question. I would like to add objects to a preexisting photo scene. It seems onerous to have to define some mask, I'd want...
> Hello, I also have the same problem. I have tried making the shape of the mask [h,w], [h,w,3], and [1,h,w,3], but all of these cases failed. Did...
> > Hello @darkpromise98! Thank you for the reminder. We are about to update the repo with more code and a more detailed README. @billpsomas Hello! I'm also...
> OK, thank you very much! Looking forward to the more detailed README~
> Hello, I was a bit confused: the paper and supplementary material describe image and text features as being extracted in 768 dimensions using CLIP, however looking...
> You could use https://huggingface.co/OpenGVLab/InternVideo2-CLIP-1B-224p-f8, it supports Chinese text search! Thanks for your reply. I use 'InternVideo2-stage2_1b-224p-f4.pt'+'1B_clip.pth' as the vision encoder, 'chinese_alpaca_lora_7b' as tokenizer, 'internvl_c_13b_224px.pth' as the text encoder, but...
> How do you set the ckpt_path? If you set the ckpt of InternVideo2_Stage2 as `vision_ckpt_path`, it shouldn't hit a size mismatch on `text_proj.weight`. Thanks, I already solved the...
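For reference, the checkpoint assignment discussed above might look roughly like the following config fragment. This is a hypothetical sketch: only `vision_ckpt_path` and the checkpoint filenames appear in the thread; the surrounding key nesting is an assumption and will differ per the repo's actual config files.

```yaml
model:
  vision_encoder:
    # The Stage2 checkpoint belongs under vision_ckpt_path (hypothetical nesting);
    # pointing the text-encoder path at it is what triggers the
    # text_proj.weight size mismatch mentioned above.
    vision_ckpt_path: /path/to/InternVideo2-stage2_1b-224p-f4.pt
  text_encoder:
    text_ckpt_path: /path/to/internvl_c_13b_224px.pth
```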
> > The URLs in the json file cannot be opened or downloaded. Can you update them? Thanks a lot. > > > The dataset is still available online. You...
> OK, I may have misunderstood. I currently don’t have a way to retrieve these video links, as it requires some internal permissions and code. I’ll try to find a...
> Hi @KeyaoZhao, You can try downloading the dataset from Hugging Face: [link](https://huggingface.co/datasets/acherstyx/AutoTransition/tree/main). Yeah, I downloaded it successfully. Thanks and wish you all the best~
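The Hugging Face link above can also be fetched programmatically. A minimal sketch using `huggingface_hub.snapshot_download` (the `fetch_dataset` helper and the `local_dir` name are illustrative, not from the thread):

```python
# Requires: pip install huggingface_hub
REPO_ID = "acherstyx/AutoTransition"  # dataset repo from the link above

def fetch_dataset(local_dir: str = "AutoTransition") -> str:
    """Download a full snapshot of the dataset repo and return its local path.

    local_dir is an arbitrary choice here; omit it to use the default HF cache.
    """
    from huggingface_hub import snapshot_download
    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # without this, the Hub looks for a model repo
        local_dir=local_dir,
    )

if __name__ == "__main__":
    print(fetch_dataset())
```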