soho
[CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
What about the VD (visual dictionary)?
Where is the link to the pre-trained SOHO model based on ResNet-101?
Hi, I conducted the pretraining with ResNet18 + a 3-layer transformer using in-domain data (without the MVM loss). I can get a similar result on the VQA downstream task, around 66.5 accuracy. But...

Hi, what is the MVM accuracy of your pretrained model? I only got about 30% during pretraining and wanted to know if that is normal.
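For reference, here is a minimal sketch of how MVM (Masked Visual Modeling) accuracy is typically measured: the fraction of masked grid positions whose predicted visual-dictionary index matches the ground-truth index. The function name and tensor shapes below are assumptions for illustration, not SOHO's actual code.

```python
import torch

def mvm_accuracy(logits: torch.Tensor,
                 target_indices: torch.Tensor,
                 mask: torch.Tensor) -> torch.Tensor:
    """Accuracy of Masked Visual Modeling over the masked positions only.

    logits:         (N, K) predicted scores over the K dictionary entries.
    target_indices: (N,)   ground-truth dictionary index per grid position.
    mask:           (N,)   bool, True where the region was masked out.
    """
    preds = logits.argmax(dim=1)                    # predicted dictionary index
    correct = (preds == target_indices) & mask      # count only masked positions
    return correct.sum().float() / mask.sum().clamp(min=1)
```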
Hi, would you release a tool for visualizing the visual dictionary?
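Until an official tool is released, one rough way to visualize the dictionary is to map every spatial feature of an image to its nearest dictionary entry and render the resulting index map. The sketch below assumes access to the CNN grid features and the learned dictionary embeddings; names and shapes are hypothetical.

```python
import torch
import matplotlib.pyplot as plt

def visualize_vd_assignments(feature_map: torch.Tensor,
                             dictionary: torch.Tensor) -> None:
    """Show which visual-dictionary index each spatial location maps to.

    feature_map: (C, H, W) grid features from the CNN backbone.
    dictionary:  (K, C)    learned dictionary embeddings.
    """
    C, H, W = feature_map.shape
    flat = feature_map.permute(1, 2, 0).reshape(-1, C)       # (H*W, C)
    indices = torch.cdist(flat, dictionary).argmin(dim=1)    # nearest entry per cell
    index_map = indices.reshape(H, W).cpu().numpy()

    plt.imshow(index_map, cmap="tab20")                      # one color per VD entry
    plt.colorbar(label="dictionary index")
    plt.title("Visual dictionary assignments")
    plt.show()
```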
Hi, many thanks for sharing SOHO. In README.md, I can only find how to pretrain and how to train a VQA model. However, there is no instruction to train or evaluate...
Thanks for your great code. This is impressive work that may inspire many others to follow it. Do you plan to release the training configurations and scripts of the...
In the paper, there is a Visual Dictionary (VD) that re-represents the query image features, but the SOHO_direct_VD class (SOHO/models/necks/utils.py) only operates on the image with torch.argmax in the code, which is...
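For context, an argmax-based visual-dictionary lookup usually amounts to nearest-embedding quantization with a straight-through gradient, as sketched below. The function name, shapes, and the omitted moving-average dictionary update are assumptions for illustration, not the repository's actual implementation.

```python
import torch

def vd_lookup(features: torch.Tensor,
              dictionary: torch.Tensor):
    """Quantize flattened grid features against a visual dictionary.

    features:   (N, C) flattened image-grid features.
    dictionary: (K, C) dictionary embeddings.
    Returns the quantized features and the chosen dictionary indices.
    """
    # Nearest entry by L2 distance; equivalently an argmax over negative
    # distances, which is where the torch.argmax call would come from.
    distances = torch.cdist(features, dictionary)            # (N, K)
    indices = torch.argmin(distances, dim=1)                 # (N,)
    quantized = dictionary[indices]                          # (N, C)

    # Straight-through estimator: forward pass uses the quantized vectors,
    # while gradients flow back to the original features.
    quantized = features + (quantized - features).detach()
    return quantized, indices
```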