RegionProxy

Region embeddings

Open Pavinags opened this issue 1 year ago • 1 comments

Where exactly do we pass the region embeddings as tokens to the transformer encoder?

All I can see is that the tokens and the affinity are both defined in the decoder head.

Pavinags, May 24 '23 12:05

It seems that the author uses in_index and out_index in the config files to select one of the intermediate output features of the ViT backbone as the affinity head's input, as in lines 68-71 of proxy_head.py:

def forward(self, inputs):
    x_mid, x = self._transform_inputs(inputs)  # (B, C, H, W)
    B, _, H, W = x.shape

    affinity = self.forward_affinity(x_mid)
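
To make the index selection concrete, here is a minimal sketch of what a helper like _transform_inputs might do. It is an assumption, not the repository's code: the function name transform_inputs and the parameters mid_index and final_index are hypothetical stand-ins for the config's in_index and out_index, and which config key maps to which feature is not confirmed here.

```python
# Hypothetical sketch: the real _transform_inputs in proxy_head.py may differ.
def transform_inputs(inputs, mid_index, final_index):
    """Pick two feature maps from the list of backbone outputs.

    inputs:      list of feature maps, one per tapped backbone layer
    mid_index:   position of the intermediate feature (goes to the affinity head)
    final_index: position of the feature used by the rest of the head
    """
    x_mid = inputs[mid_index]
    x = inputs[final_index]
    return x_mid, x


# Strings stand in for (B, C, H, W) tensors to keep the example dependency-free.
backbone_outputs = ["feat_layer_3", "feat_layer_6", "feat_layer_9", "feat_layer_12"]
x_mid, x = transform_inputs(backbone_outputs, mid_index=1, final_index=-1)
print(x_mid, x)
```

Under this reading, the affinity is computed from an earlier layer of the backbone, while the remainder of forward works on the last selected feature.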

hhd52859, Jul 19 '23 06:07