Xiaosong Zhang
I think the problem may be in the INPUT setting. In object detection it's best to use a scale that keeps the aspect ratio, so I recommend using (MIN_SIZE=800, MAX_SIZE=1333), or...
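To make the (MIN_SIZE, MAX_SIZE) setting concrete, here is a minimal sketch of the aspect-ratio-preserving resize rule used by maskrcnn_benchmark-style pipelines; the function name and signature are illustrative, not the repo's actual API:

```python
def resize_keep_ratio(w, h, min_size=800, max_size=1333):
    """Scale so the short side reaches min_size, capping the long side at max_size."""
    scale = min_size / min(w, h)
    if max(w, h) * scale > max_size:
        scale = max_size / max(w, h)  # fall back to the long-side cap
    return int(round(w * scale)), int(round(h * scale))

# A 640x480 image is scaled by its short side; a very wide 1920x1080 image
# is capped by MAX_SIZE instead. The aspect ratio is preserved in both cases.
```

Either way, objects are never stretched, which is why this scheme is preferred over resizing to a fixed square.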
RetinaNet and FreeAnchor initialize the classifier bias so that it predicts low scores at the start of training, so the loss from negatives is very small.
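Concretely, the trick is to set the bias of the final classification layer so that the initial sigmoid output equals a small prior probability (0.01 in the RetinaNet paper). A minimal sketch, with the PyTorch call shown only as a comment:

```python
import math

prior_prob = 0.01  # expected foreground probability at initialization (RetinaNet's pi)
bias_init = -math.log((1 - prior_prob) / prior_prob)  # approx -4.595

# sigmoid(bias_init) == prior_prob, so every anchor starts with score ~0.01
# and the huge number of easy negatives contributes almost no loss.
# In a PyTorch detection head this would be applied as, e.g.:
#   torch.nn.init.constant_(cls_logits.bias, bias_init)
```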
Please check your PyTorch version; we need PyTorch >= 1.1, as described in [INSTALL.md](https://github.com/zhangxiaosong18/FreeAnchor/blob/master/INSTALL.md).
This may be due to an incorrect `torchvision` version, please use `torchvision 0.2.1`.
Yes, it requires hand-crafted anchors.
Did you follow the INSTALL.md installation steps? `python setup.py build develop` compiles a dynamic link library (*.so) for the Python code. If you can't find this file in the `maskrcnn_benchmark` directory,...
It took about 10 hours to train free_anchor-R-50-FPN_1x on 8 RTX 2080 Ti GPUs.
Emu2-Chat is a generalist model, so we use the Generalist Performance from Table 3 of the [CogVLM arXiv paper](https://arxiv.org/pdf/2311.03079.pdf) instead of the single-task performance.
They are VQAv2 annotation files processed by [LAVIS](https://github.com/salesforce/LAVIS) and can be downloaded from https://storage.googleapis.com/sfr-vision-language-research/LAVIS/datasets/vqav2/vqa_val_eval.json and https://storage.googleapis.com/sfr-vision-language-research/LAVIS/datasets/vqav2/vqa_test.json
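For reference, a small sketch of fetching these two files with the Python standard library; the output directory and helper name are my own choices, not part of LAVIS:

```python
import urllib.request
from pathlib import Path

BASE = "https://storage.googleapis.com/sfr-vision-language-research/LAVIS/datasets/vqav2"
ANNOTATION_URLS = [f"{BASE}/vqa_val_eval.json", f"{BASE}/vqa_test.json"]

def download(url: str, out_dir: str = "annotations") -> Path:
    """Download one annotation file into out_dir and return its local path."""
    dest = Path(out_dir) / url.rsplit("/", 1)[-1]
    dest.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, dest)
    return dest

# Usage (hits the network, so not run here):
# for url in ANNOTATION_URLS:
#     download(url)
```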