
Can multi-box prompts be used to fine-tune SAM?

Open wu2233 opened this issue 5 months ago • 0 comments

Hello, I would like to use multi-box prompts to fine-tune SAM (not for prediction). Suppose my training batch size is 2, i.e. two images, and each image has 3 prompt boxes. I created the prompt tensor with input_boxes = torch.randn(2,3,4).to('cuda'), but I hit this error in prompt_encoder.py:

sparse_embeddings = torch.cat([sparse_embeddings, box_embeddings], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 2 but got size 6 for tensor number 1 in the list.

I am not sure whether the shape of the input boxes tensor should be (3,4) or (2,3,4). The former runs fine, but the latter throws the error above. Any help would be appreciated, thank you.
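For anyone reading along, here is a minimal sketch of where the sizes 2 and 6 probably come from. It is not the actual prompt_encoder.py code. It assumes the encoder takes the batch size from boxes.shape[0] and flattens the boxes into corner pairs with reshape(-1, 2, 2) before the torch.cat shown in the traceback. The function names, EMBED_DIM and the zero-filled "embedding" are hypothetical stand-ins.

```python
import torch

EMBED_DIM = 256  # assumed embedding width, used only for illustration


def embed_boxes_sketch(boxes: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for the box embedding step."""
    # Every box becomes two corner points; leading dims are flattened together.
    coords = boxes.reshape(-1, 2, 2)
    # Placeholder for the positional encoding: one vector per corner.
    return torch.zeros(coords.shape[0], 2, EMBED_DIM)


def encode_boxes_sketch(boxes: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for the concatenation that fails in the issue."""
    bs = boxes.shape[0]  # assumed: batch size is read from the first dimension
    sparse_embeddings = torch.empty((bs, 0, EMBED_DIM))
    box_embeddings = embed_boxes_sketch(boxes)
    return torch.cat([sparse_embeddings, box_embeddings], dim=1)


# Shape (3, 4): three boxes for ONE image. The sizes line up.
per_image = encode_boxes_sketch(torch.rand(3, 4))
print(per_image.shape)

# Shape (2, 3, 4): bs is 2, but the flattened corners have 2 * 3 = 6 rows.
batched_boxes = torch.rand(2, 3, 4)
try:
    encode_boxes_sketch(batched_boxes)
    raised = False
except RuntimeError as err:
    raised = True
    print("RuntimeError:", err)

# One possible workaround: encode each image's (3, 4) boxes separately.
per_image_outputs = [encode_boxes_sketch(b) for b in batched_boxes]
print([tuple(o.shape) for o in per_image_outputs])
```

If that reading is right, the box prompt is expected to be (num_boxes, 4) for a single image, and a batch of images is handled by looping over them. As far as I know, Sam.forward follows that pattern: it takes a list with one dict per image, each with its own 'boxes' tensor of shape Bx4. For fine-tuning, a per-image loop over the prompt encoder and mask decoder inside the training step may be the simplest route.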

wu2233, Sep 18 '24 08:09