zgj-gutou
Hi, friend, I have solved it. When you initialize `swin_transformer.SwinTransformer()`, you need to set `num_classes` to 21841 instead of the default 1000, because it is "22k"...
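A minimal sketch of how one might read the right `num_classes` off a checkpoint instead of hard-coding it. The helper `infer_num_classes` is hypothetical, and the key name `head.weight` is an assumption about how the classification head is stored; check the keys of your own checkpoint. The stand-in object below only mimics a tensor's `.shape`, and the feature width 1024 is a placeholder.

```python
from types import SimpleNamespace


def infer_num_classes(state_dict, head_key="head.weight"):
    """Return the class count implied by the head weight's first dimension.

    Assumes the head is a linear layer whose weight has shape
    (num_classes, num_features) and is stored under `head_key`.
    """
    weight = state_dict[head_key]
    return weight.shape[0]


# Stand-in for a loaded checkpoint: only the .shape attribute matters here.
fake_state_dict = {"head.weight": SimpleNamespace(shape=(21841, 1024))}

num_classes = infer_num_classes(fake_state_dict)
print(num_classes)
```

With a real checkpoint, the returned value could then be passed as `swin_transformer.SwinTransformer(num_classes=num_classes)`, so the head matches the weights being loaded.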
> BriVL uses the Bottom-Up Attention model as its object detection tool, this model can be obtained from [BriVL-BUA-applications](https://github.com/chuhaojin/BriVL-BUA-applications)

Hi, I used [BriVL-BUA-applications](https://github.com/chuhaojin/BriVL-BUA-applications) to get the bboxes. I modified the extract-bua-caffe-r101.yaml...
> > > BriVL uses the Bottom-Up Attention model as its object detection tool, this model can be obtained from [BriVL-BUA-applications](https://github.com/chuhaojin/BriVL-BUA-applications)
> > >
> > > hi, I used [BriVL-BUA-applications](https://github.com/chuhaojin/BriVL-BUA-applications)...