Multiple coords + Multiple boxes + Multiple labels solution - Need suggestions

Open pnthai88 opened this issue 1 year ago • 5 comments

I implemented this in my code:

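import cv2
import numpy as np
import torch

# SamPredictor expects an RGB image; OpenCV loads images as BGR.
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# boxes: non-empty list of [x1, y1, x2, y2] in original-image pixels (B x 4).
# torch.tensor(None) raises, so an empty list has to be handled before this point.
input_boxes = torch.tensor(np.array(boxes), device=predictor.device)
transformed_boxes = predictor.transform.apply_boxes_torch(input_boxes, image.shape[:2])

masks, scores, logits = predictor.predict_torch(
    point_coords=None,
    point_labels=None,
    boxes=transformed_boxes,
    multimask_output=False,
)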

How can I use multiple point coords and labels together with these boxes? What format should they be in? Thanks for the help.

pnthai88 avatar Apr 03 '24 16:04 pnthai88

The points should have a shape of B x N x 2, the labels a shape of B x N, and the boxes a shape of B x 4, where B is the batch dimension (i.e. how many independent masks you want to generate) and N is the number of points and labels. The 2 and 4 correspond to the xy and xyxy coordinates for points and boxes, respectively.

As-is, you can't have more than 1 box for a single prompt, unless you change the code. There's more info in issue #646
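
The shape convention above can be checked before calling the model. The following is a minimal sketch in plain Python that works on shape tuples only; `validate_prompt_shapes` is a hypothetical helper, not part of segment-anything, and the rule that points and labels must be passed together is an assumption of this sketch.

```python
from typing import Optional, Tuple

Shape = Tuple[int, ...]


def validate_prompt_shapes(
    points: Optional[Shape] = None,
    labels: Optional[Shape] = None,
    boxes: Optional[Shape] = None,
) -> Tuple[Optional[int], Optional[int]]:
    """Check shape tuples against the B x N x 2 / B x N / B x 4 convention.

    Returns (B, N); an entry is None when no prompt defines it.
    Hypothetical helper for illustration, not a segment-anything API.
    """
    # Assumption of this sketch: points and labels always come as a pair.
    if (points is None) != (labels is None):
        raise ValueError("points and labels must be given together")

    batch, num_points = None, None
    if points is not None:
        if len(points) != 3 or points[2] != 2:
            raise ValueError(f"points must be B x N x 2, got {tuple(points)}")
        if tuple(labels) != tuple(points[:2]):
            raise ValueError(
                f"labels must be B x N = {tuple(points[:2])}, got {tuple(labels)}"
            )
        batch, num_points = points[0], points[1]

    if boxes is not None:
        if len(boxes) != 2 or boxes[1] != 4:
            raise ValueError(f"boxes must be B x 4, got {tuple(boxes)}")
        if batch is not None and boxes[0] != batch:
            raise ValueError(
                f"batch mismatch: points have B={batch}, boxes have B={boxes[0]}"
            )
        batch = boxes[0]

    return batch, num_points


# Two independent masks, each prompted by one box and three points.
print(validate_prompt_shapes(points=(2, 3, 2), labels=(2, 3), boxes=(2, 4)))  # (2, 3)
```

With real tensors, the same check could be fed `tensor.shape` for each prompt, since the helper only inspects the shape tuples.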

heyoeyo avatar Apr 03 '24 21:04 heyoeyo

I think that with the latest update I'm able to send 2+ boxes, as in the code I posted. I just cannot add point coords together with 2+ boxes.

pnthai88 avatar Apr 04 '24 00:04 pnthai88

It should be possible to have 2 (or more) boxes, but they'll generate independent masks, since the '2' in this case will correspond to the batch part of the shape (i.e. the Bx4 shape becomes 2x4 for 2 boxes). The limitation is on generating a single mask with more than one box.

In order to run the prediction, all of the 'B' parts of the shape have to match (for points, labels & boxes) and the 'N' parts of the shape have to match for points & labels. So it's possible to share points/labels across independent boxes using something like:

num_boxes = 2
input_points = input_points.repeat((num_boxes,1,1))
input_labels = input_labels.repeat((num_boxes,1))

This is assuming the original input points/labels have shapes of Nx2 & N, respectively. It just duplicates them on the batch (B) dimension to match the boxes.
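
As a fuller sketch of the snippet above, the duplication can be wrapped in a small function and checked on dummy tensors. The helper name `share_points_across_boxes` and all coordinate values are made up for illustration. The commented-out predictor calls assume a `SamPredictor` with an image already set, as in the original post, and are not run here.

```python
import torch


def share_points_across_boxes(points, labels, boxes):
    """Duplicate one set of points/labels for every box.

    points: (N, 2), labels: (N,), boxes: (B, 4)
    Returns points of shape (B, N, 2) and labels of shape (B, N).
    Hypothetical helper for illustration.
    """
    num_boxes = boxes.shape[0]
    # repeat() prepends a batch dimension and copies the data num_boxes times.
    batched_points = points.repeat((num_boxes, 1, 1))
    batched_labels = labels.repeat((num_boxes, 1))
    return batched_points, batched_labels


# Illustrative values only (pixel coordinates in the original image).
input_boxes = torch.tensor([[10.0, 20.0, 200.0, 220.0],
                            [50.0, 60.0, 300.0, 320.0]])   # B x 4, xyxy
input_points = torch.tensor([[100.0, 120.0],
                             [150.0, 160.0],
                             [30.0, 40.0]])                # N x 2, xy
input_labels = torch.tensor([1, 1, 0])                     # 1 = foreground, 0 = background

batched_points, batched_labels = share_points_across_boxes(
    input_points, input_labels, input_boxes
)
print(batched_points.shape)  # torch.Size([2, 3, 2])
print(batched_labels.shape)  # torch.Size([2, 3])

# With a predictor that already has an image set, the call from the
# original post would then look roughly like this (tensors should live
# on predictor.device):
#   transformed_points = predictor.transform.apply_coords_torch(batched_points, image.shape[:2])
#   transformed_boxes = predictor.transform.apply_boxes_torch(input_boxes, image.shape[:2])
#   masks, scores, logits = predictor.predict_torch(
#       point_coords=transformed_points,
#       point_labels=batched_labels,
#       boxes=transformed_boxes,
#       multimask_output=False,
#   )
```

Each box still produces its own mask; the shared points only act as extra hints for every one of those masks.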

heyoeyo avatar Apr 04 '24 13:04 heyoeyo

Great, it works

pnthai88 avatar Apr 08 '24 11:04 pnthai88

@pnthai88 Hi friend, thank you very much for your guidance and advice. However, I have a few questions I'd like to ask:

  1. In non-auto mode, do I need to manually define the points and then feed them into the model? How should I write this part of the code? I don't have a clear idea—could you please give me some guidance?
  2. If they are manually defined, should I iteratively feed multiple points for the same target into the model? I would greatly appreciate it if you could provide relevant code examples.

Gi-gigi avatar Oct 18 '24 13:10 Gi-gigi