Hangjie Yuan comments

Results: 22 comments of Hangjie Yuan

Two questions about "bbox_tools.py"

For question 2, according to the original paper, I think the author set the anchor point at the center. From my perspective, (0, 0) is also okay because it...

Trained DETR parameters on V-COCO

But if we want to do the pretraining ourselves, what should we do? How can we find the V-COCO test set images that are contained in the COCO train2017 set?...

Are the checkpoints of all models available?

@yulin011 You can try running it on CPU. The datasets are relatively small.

The instantiation of Multi-head PA and the design choice of MAM adapter.

> Thanks for your interest! For your questions:
>
> 1. MH PA and PA use the same number of parameters when their `r` are the same -- in transformers...

All videos/GIFs seem to have the "shutterstock" watermark at the same location.

Is it possible to remove this?

When will the InstructVideo code be released?

@ruffiann Hi, many thanks for your interest in my work. I am currently preparing the code release, and it will be made available as soon as possible. With respect to...

OI Eval and Training Logs

@liuhengyue Hi, can you be more specific about "You are computing the mean per GT triplet, instead of per predicate class." and "You also skip computing for "unseen" triplet."? It...

OI Eval and Training Logs

@liuhengyue For the first query, to my knowledge, it is usually computed per triplet. Usually, performance per triplet will be lower than performance per predicate since we...
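
To make the distinction between the two averaging schemes concrete, here is a minimal sketch on toy data. It is not the RLIPv2 evaluation code: the `mean_recall` helper, the `records` list, and the use of plain recall as the metric are all assumptions made for illustration only.

```python
from collections import defaultdict


def mean_recall(records, key):
    """Average the per-group recall, where `key` maps a GT triplet to its group."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for triplet, matched in records:
        group = key(triplet)
        totals[group] += 1
        hits[group] += int(matched)
    # Unweighted mean over groups: every group counts equally.
    return sum(hits[g] / totals[g] for g in totals) / len(totals)


# Hypothetical ground-truth relations: ((subject, predicate, object), matched?)
records = [
    (("person", "ride", "horse"), True),
    (("person", "ride", "horse"), True),
    (("person", "ride", "bike"), False),
    (("person", "hold", "cup"), True),
]

# Group by the full triplet category: 3 groups.
per_triplet = mean_recall(records, key=lambda t: t)
# Group by the predicate class only: 2 groups ("ride", "hold").
per_predicate = mean_recall(records, key=lambda t: t[1])

print(f"mean over triplet categories: {per_triplet:.3f}")
print(f"mean over predicate classes:  {per_predicate:.3f}")
```

In this toy case the rare triplet ("person", "ride", "bike") forms its own group under per-triplet averaging and pulls the mean down, while under per-predicate averaging it is pooled with the other "ride" instances. Whether the same ordering holds on a real dataset depends on its triplet distribution.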

@liuhengyue These are the training logs that you requested. (For Swin-T, there are repetitive results due to multiple attempts. Generally, they are consistent with the results reported in the paper.) [RLIP_PDA_v2_OISGGtrain_SwinL_VGCOCOO365_RQL_LSE_RPL_20e_L1_20e.txt](https://github.com/JacobYuan7/RLIPv2/files/13439277/RLIP_PDA_v2_OISGGtrain_SwinL_VGCOCOO365_RQL_LSE_RPL_20e_L1_20e.txt)...

Is the model capable of detecting open-vocabulary objects, like Grounding DINO?

@aixiaodewugege Hi, many thanks for your interest in my work. Yes, you've grasped the concept accurately. Due to the annotation style of Visual Genome and the pseudo-labelled Objects365, it has...