multimodal-maestro
issue with masks_to_marks mapping
Search before asking
- [X] I have searched the Multimodal Maestro issues and found no similar bug report.
Bug
Hi,
First and foremost, thanks for your nice work so far.
I was testing your code with your Google Colab tutorial, and the mark creation (SAM), visualization, and refinement all go smoothly. The prompt call with marks to GPT-4 also works without any issue, and I get a response back.
In the step that extracts and visualizes the relevant marks, the call to masks_to_marks throws the error shown below.
With the example I used, I expect a large output (20-30 marks), if this helps.
Environment
0.1.0rc1, Google Colab (T4 VM)
Minimal Reproducible Example
masks = maestro.extract_relevant_masks(text=response, detections=refined_marks)
masks = np.array([mask for mask in masks.values()])
detections = maestro.masks_to_marks(masks)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
[<ipython-input-61-a9e5dd9e84f7>](https://localhost:8080/#) in <cell line: 3>()
1 masks = maestro.extract_relevant_masks(text=response, detections=marked_image)
2 masks = np.array([mask for mask in masks.values()])
----> 3 detections = maestro.masks_to_marks(masks)
3 frames
[/usr/local/lib/python3.10/dist-packages/supervision/detection/core.py](https://localhost:8080/#) in _validate_mask(mask, n)
27 )
28 if not is_valid:
---> 29 raise ValueError("mask must be 3d np.ndarray with (n, H, W) shape")
30
31
ValueError: mask must be 3d np.ndarray with (n, H, W) shape
Additional
No response
Are you willing to submit a PR?
- [ ] Yes I'd like to help by submitting a PR!
Hi @dokooh! 👋🏻 So you used our official Colab with no code changes? You only uploaded your own image?
Correct.
Okay, I just ran the code to confirm it works on our example. It works. Now, let's try to debug yours.
- Could you share your version of Google Colab? The one with your example image?
- Alternatively, add print statement here:
masks = maestro.extract_relevant_masks(text=response, detections=refined_marks)
masks = np.array([mask for mask in masks.values()])
>>> print(masks)
detections = maestro.masks_to_marks(masks)
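Printing the full array can produce a lot of output, so a shorter check is to print only its shape and dtype. The sketch below is an illustration, not part of maestro: `describe_masks` is a hypothetical helper name, and it assumes `masks` is the NumPy array built in the snippet above.

```python
import numpy as np


def describe_masks(masks: np.ndarray) -> str:
    """Summarise an array without dumping its contents (hypothetical helper)."""
    return (
        f"type={type(masks).__name__} ndim={masks.ndim} "
        f"shape={masks.shape} dtype={masks.dtype}"
    )


# An array built from an empty list is 1-D, not (n, H, W).
print(describe_masks(np.array([])))

# A stack of three 4x5 boolean masks has the expected 3-D layout.
print(describe_masks(np.zeros((3, 4, 5), dtype=bool)))
```

If `ndim` is anything other than 3, the array does not have the (n, H, W) layout named in the error message.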
Detections(xyxy=array([[448, 374, 504, 430],
[339, 267, 364, 292],
[195, 5, 224, 34],
[424, 88, 453, 117],
[496, 145, 521, 170],
[249, 159, 273, 184],
[ 31, 194, 60, 224],
[392, 57, 421, 87],
[245, 268, 274, 297],
[539, 84, 564, 109],
[506, 405, 591, 430],
[507, 90, 531, 114],
[453, 17, 499, 58],
[379, 0, 404, 31],
[ 0, 141, 122, 319],
[189, 84, 214, 110],
[354, 173, 378, 199],
[274, 117, 320, 157],
[355, 69, 380, 114],
[274, 212, 320, 252],
[189, 179, 214, 221],
[450, 111, 498, 130],
[189, 179, 214, 204],
[466, 75, 481, 90],
[355, 69, 380, 94],
[280, 45, 309, 74],
[353, 173, 379, 209],
[457, 56, 490, 67],
[ 0, 0, 194, 63],
[530, 0, 555, 11],
[287, 88, 302, 104],
[274, 212, 320, 281],
[416, 33, 426, 42],
[581, 33, 591, 118],
[ 4, 196, 11, 223],
[ 0, 323, 50, 394],
[584, 363, 591, 376],
[ 42, 371, 50, 394],
[ 0, 322, 72, 394],
[562, 0, 571, 24],
[280, 45, 309, 116],
[189, 84, 214, 138],
[331, 229, 367, 240],
[562, 0, 582, 24],
[228, 109, 259, 117],
[ 64, 103, 93, 118],
[317, 275, 326, 293],
[ 0, 194, 60, 224],
[409, 323, 502, 332],
[238, 227, 249, 237],
[244, 29, 368, 38],
[287, 275, 302, 290],
[516, 58, 564, 66],
[500, 0, 559, 28],
[332, 134, 368, 145],
[278, 251, 311, 262],
[558, 114, 567, 128],
[ 0, 91, 12, 115],
[406, 134, 497, 142],
[397, 221, 489, 229],
[ 0, 6, 91, 14],
[229, 216, 255, 223],
[452, 0, 500, 8],
[504, 32, 540, 43],
[340, 254, 433, 263],
[320, 161, 329, 180],
[517, 47, 561, 55],
[ 0, 139, 323, 319],
[121, 70, 123, 119],
[229, 121, 255, 128],
[409, 308, 502, 319],
[278, 156, 311, 167],
[ 44, 348, 58, 357],
[504, 33, 511, 42],
[408, 307, 503, 332],
[228, 206, 258, 213],
[ 0, 50, 66, 60],
[340, 149, 433, 157],
[259, 225, 268, 239],
[ 0, 90, 16, 135],
[244, 388, 252, 403],
[228, 109, 260, 129],
[513, 33, 540, 43],
[544, 173, 557, 182],
[238, 132, 249, 142],
[574, 0, 582, 24],
[ 75, 343, 89, 357],
[176, 245, 219, 253],
[404, 254, 432, 262],
[329, 105, 355, 113],
[228, 205, 258, 223],
[370, 0, 438, 42],
[ 51, 124, 144, 134],
[515, 46, 564, 66],
[259, 130, 268, 145],
[317, 267, 364, 293],
[176, 245, 267, 253],
[ 82, 323, 83, 329],
[436, 47, 459, 85],
[ 81, 323, 83, 329],
[519, 59, 531, 66],
[400, 101, 414, 110],
[340, 254, 400, 262],
[404, 13, 434, 30],
[340, 149, 398, 157],
[462, 94, 489, 108],
[448, 374, 591, 430],
[ 0, 6, 61, 14],
[193, 230, 198, 235],
[244, 29, 319, 37],
[192, 204, 197, 225],
[244, 30, 254, 37],
[239, 130, 268, 145],
[192, 110, 197, 130],
[450, 75, 498, 129]]), mask=array([[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]],
[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]],
[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]],
...,
[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]],
[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]],
[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]]]), confidence=None, class_id=None, tracker_id=None)
@dokooh, I can't open it.
I added your email. Let me know if you still can't open it, and I will download the .ipynb file and share it.
I need your image as well. Could you upload it here?
@dokooh, it looks like you lost all marks after refinement. I will add a fix for that edge case. As a workaround, try skipping the maestro.refine_marks step.
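To illustrate the edge case: when no masks are selected, `np.array([mask for mask in masks.values()])` yields an array of shape `(0,)` rather than `(n, H, W)`, which is the condition the error message describes. The sketch below is one possible caller-side guard, not the fix that went into maestro. `stack_masks` and its `height`/`width` parameters are hypothetical names, and it assumes `masks` is a dict of equally sized 2-D boolean arrays.

```python
import numpy as np


def stack_masks(masks: dict, height: int, width: int) -> np.ndarray:
    """Stack {mark_id: (H, W) mask} into an (n, H, W) boolean array.

    Hypothetical helper: returns an empty (0, H, W) array when the dict
    is empty, so the result is 3-D in every case.
    """
    if not masks:
        return np.zeros((0, height, width), dtype=bool)
    return np.stack(list(masks.values())).astype(bool)


# The pattern from the report collapses to 1-D when nothing is selected.
naive = np.array([mask for mask in {}.values()])
print(naive.shape)

# The guarded version keeps three dimensions.
stacked = stack_masks({}, height=4, width=5)
print(stacked.shape)
```

A caller could then test `len(stacked) == 0` and skip the `masks_to_marks` call and the visualization when there is nothing to draw.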
The maestro project has pivoted in the direction of a VLM fine-tuning toolkit. As a result, I am closing legacy issues.