Xin Wen
Hi Oliver, I recently reproduced BYOL and DetCon_B with the provided weights (ImageNet 1k epoch pre-training). The results are as follows: Method | COCO det | COCO seg | VOC...
Hi, Training without the background dataset may cause a significant drop in performance. As stated on the website of [VCDB](https://fvl.fudan.edu.cn/dataset/vcdb/list.htm), please drop a mail to the authors for the background...
Hi @alexcbb, thanks for your attention to our work! We actually didn't thoroughly experiment with ViTs due to computation constraints. Regarding the object-centric attention maps of DINO, we believe that...
Feel free to leave a message if there is trouble working on that.
It should take up almost all the memory of 8x 2080 Ti GPUs, roughly 80 GB in total. I don't remember exactly how long training took, maybe roughly 2 to 3 days?
Fig. 3 is simply produced using [viz_slots.py](https://github.com/CVMI-Lab/SlotCon/blob/main/viz_slots.py) with the default configs. The model is the default model on COCO, with 800 epochs of training. Please check if there are any...
Hi, we scale the learning rate linearly with the batch size, as done by many previous works. This part is already implemented in the code, and basically no more modification...
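For readers unfamiliar with the linear scaling rule, here is a minimal sketch of the idea. The function name `scale_learning_rate` and the reference batch size of 256 are illustrative assumptions (256 is a common convention), not values taken from the repository, so please check the actual training script for the exact form.

```python
def scale_learning_rate(base_lr: float, batch_size: int,
                        reference_batch_size: int = 256) -> float:
    """Linear scaling rule: the learning rate grows in proportion to the batch size.

    `reference_batch_size` is an assumed convention, not a value from the repo.
    """
    if batch_size <= 0 or reference_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * batch_size / reference_batch_size


# Doubling the total batch size doubles the effective learning rate.
lr_at_256 = scale_learning_rate(base_lr=1.0, batch_size=256)
lr_at_512 = scale_learning_rate(base_lr=1.0, batch_size=512)
```

With a rule like this, changing the number of GPUs or the per-GPU batch size only requires passing the new total batch size; the base learning rate itself stays untouched.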
Your guess is correct: this is to make sure they form a positive pair, such that both the query and key slots exist across views. From my memory, we didn't...
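As a rough illustration of the "exists across views" check, the sketch below keeps only the slots that own at least one pixel in both views. The helper `shared_slots` and the flat per-pixel assignment lists are hypothetical simplifications for illustration, not the repository's actual code.

```python
from typing import List, Sequence


def shared_slots(assign_q: Sequence[int], assign_k: Sequence[int],
                 num_slots: int) -> List[bool]:
    """Return one boolean per slot: True if the slot is occupied in both views.

    assign_q / assign_k hold, for each pixel, the index of the slot it is
    assigned to in the query view and the key view respectively (hypothetical
    input format).
    """
    present_q = set(assign_q)
    present_k = set(assign_k)
    return [(s in present_q) and (s in present_k) for s in range(num_slots)]


# Slots 0 and 2 appear in both views, slot 1 only in the key view,
# and slot 3 in neither, so only slots 0 and 2 can form positive pairs.
mask = shared_slots([0, 0, 2, 2], [0, 1, 1, 2], num_slots=4)
```

Only slots where the mask is true would contribute a query/key positive pair; the others are skipped.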
Well, actually I can't recall the details well... You may consider dropping that pair in this case.
See if this paper helps you understand the dead slots: https://openreview.net/forum?id=Z2dVrgLpsF