TokenPacker
Visual projector for other ViT backbones
Since TokenPacker is presented as a general visual projector, I'd like to ask whether you have run any experiments with other visual backbones. I extracted features from SigLIP layers [17, 18, 26, 27] and found the results unsatisfactory.
Have you tried any other experiments, and if so, how did they turn out?