
Incompatibility between tensor shapes [8, 197, 768] and [8, 196, 768] for 2D segmentation

Open alqurri opened this issue 10 months ago • 0 comments

Hi,

The output of the transformer before "forward_head" has shape [8, 197, 768]. For segmentation, however, other models such as TransUnet work with shape [8, 196, 768]. This matters because 196 is a perfect square (14 x 14), so the token sequence can be reshaped into a height and width grid for 2D. I noticed the tensor gets the 197 shape after calling "x = self._pos_embed(x)". How can we convert the tensor from shape [8, 197, 768] to [8, 196, 768]? Could we simply take the first 196 vectors, or the last 196?
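To make the question concrete, here is a minimal sketch of the "last 196" option. It assumes the extra token is a single class token prepended at index 0, as in the standard ViT layout, so the remaining 196 tokens are the patch tokens in row-major order. The helper name `tokens_to_feature_map` and its `num_prefix_tokens` parameter are hypothetical and not part of any library. If the model adds other prefix tokens, or places the extra token elsewhere, the slice would need to change.

```python
import math

import torch


def tokens_to_feature_map(tokens: torch.Tensor, num_prefix_tokens: int = 1) -> torch.Tensor:
    """Drop prefix tokens and reshape [B, N, C] patch tokens into [B, C, H, W].

    Assumption: the first `num_prefix_tokens` positions hold non-patch tokens
    (for example one class token) and the rest are patch tokens in row-major order.
    """
    # Keep only the patch tokens: [B, 197, C] -> [B, 196, C]
    patch_tokens = tokens[:, num_prefix_tokens:, :]
    batch, num_patches, channels = patch_tokens.shape

    # The patch count must be a perfect square to form a square grid.
    side = math.isqrt(num_patches)
    if side * side != num_patches:
        raise ValueError(f"{num_patches} patch tokens do not form a square grid")

    # [B, N, C] -> [B, C, N] -> [B, C, H, W]
    return patch_tokens.transpose(1, 2).reshape(batch, channels, side, side)


# Dummy tensor with the shape from this issue.
x = torch.randn(8, 197, 768)
feature_map = tokens_to_feature_map(x)
print(feature_map.shape)  # torch.Size([8, 768, 14, 14])
```

Under that assumption, taking the first 196 vectors would keep the class token and drop the last patch, which would shift every patch by one position in the grid.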

alqurri, Feb 11 '25 22:02