段欣然

4 comments by 段欣然

> segment_anything/modeling/mask_decoder.py -- line126-127
>
> ```python
> src = torch.repeat_interleave(image_embeddings, tokens.shape[0], dim=0)
> src = src + dense_prompt_embeddings
> ```
>
> All image_embeddings will be copied 4 times....

Maybe my English was not clear. The decoder needs to bring the image encoding and the prompt encoding to a common shape and then add them, and that common shape is determined by the size of the tokens. If you still have questions, please describe them more specifically; I will reply as soon as I see them.
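To make the shape argument concrete, here is a minimal sketch of the two quoted lines. It uses NumPy's `np.repeat` as a stand-in for `torch.repeat_interleave` (the two behave the same way along a single axis), and every shape below is a small hypothetical value chosen for illustration, not the real SAM dimensions.

```python
import numpy as np

# Hypothetical sizes for illustration only (not SAM's real dimensions).
num_prompts, channels, height, width = 4, 2, 3, 3

# A single image embedding: shape (1, C, H, W).
image_embeddings = np.arange(
    channels * height * width, dtype=np.float32
).reshape(1, channels, height, width)

# One dense prompt embedding per prompt: shape (B, C, H, W).
dense_prompt_embeddings = np.ones(
    (num_prompts, channels, height, width), dtype=np.float32
)

# Tokens: shape (B, N, C). Only the leading dimension B matters here.
tokens = np.zeros((num_prompts, 5, channels), dtype=np.float32)

# Stand-in for torch.repeat_interleave(image_embeddings, tokens.shape[0], dim=0):
# each entry along axis 0 is repeated B times, giving shape (B, C, H, W).
src = np.repeat(image_embeddings, tokens.shape[0], axis=0)

# Now both operands have the same leading dimension and can be added.
src = src + dense_prompt_embeddings

print(src.shape)
```

With one image and four prompts, the image embedding is copied four times so that each prompt gets its own copy. Note that if `image_embeddings` already had a leading dimension larger than 1, the same call would repeat every entry `tokens.shape[0]` times, which appears to be the situation the quoted comments are discussing.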

> same issue.
>
> I noticed that in [MedSAM](https://github.com/bowang-lab/MedSAM/blob/7e86549203bda3233afc50b8c8cc41c521d88c9b/segment_anything/modeling/mask_decoder.py#L126), they do the same modification as @bach05.
>
> compared to repeat all the embeddings, this seems to be more...

> > > segment_anything/modeling/mask_decoder.py -- line126-127
> > >
> > > ```python
> > > src = torch.repeat_interleave(image_embeddings, tokens.shape[0], dim=0)
> > > src = src + dense_prompt_embeddings
> > > ```
> > >
> > > ...