kat
Incompatibility between tensor shapes [8, 197, 768] and [8, 196, 768] for 2D segmentation
Hi,
The output of the transformer before "forward_head" has shape [8, 197, 768]. However, to use it for segmentation, other models such as TransUNet work with shape [8, 196, 768]. This matters because 196 = 14 × 14 is a perfect square, so the tokens can be reshaped into a height and width for a 2D feature map. I noticed the tensor has the [8, 197, 768] shape after calling "x = self._pos_embed(x)". How can we convert the tensor from [8, 197, 768] to [8, 196, 768]? Could we simply take the first 196 vectors, or the last 196?
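To make the shape arithmetic concrete, here is a minimal sketch of one possible conversion. It assumes the model follows the usual ViT layout in which "_pos_embed" prepends a single class token at index 0 (so the last 196 tokens are the patch tokens) and that patches are flattened row by row; both are assumptions to verify against the actual model code. The helper name `tokens_to_feature_map` and the `num_prefix_tokens` parameter are hypothetical, introduced only for illustration.

```python
import math

import torch


def tokens_to_feature_map(tokens: torch.Tensor, num_prefix_tokens: int = 1) -> torch.Tensor:
    """Drop leading prefix tokens and reshape [B, N, C] patch tokens to [B, C, H, W].

    Assumes the prefix (e.g. class) tokens sit at the start of the sequence
    and that the remaining patch tokens form a square grid in row-major order.
    """
    # Keep only the patch tokens: [B, 197, C] -> [B, 196, C] when there is one prefix token.
    patch_tokens = tokens[:, num_prefix_tokens:, :]
    batch, num_patches, channels = patch_tokens.shape

    # The grid side must be an integer, e.g. 196 -> 14.
    side = math.isqrt(num_patches)
    if side * side != num_patches:
        raise ValueError(f"{num_patches} patch tokens do not form a square grid")

    # [B, N, C] -> [B, C, N] -> [B, C, H, W]
    return patch_tokens.transpose(1, 2).reshape(batch, channels, side, side)


# Dummy tensor with the shape from the question.
x = torch.randn(8, 197, 768)
feature_map = tokens_to_feature_map(x)
print(feature_map.shape)
```

If the model uses additional prefix tokens (for example register or distillation tokens), `num_prefix_tokens` would need to match that count; taking the first 196 tokens instead would keep the class token and drop the last patch.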