heyoeyo
If you mean having the output `mask_embeddings` match the size of the input `mask_tensor`, then one way would be to scale up by a factor of 4 with something like:...
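A minimal sketch of that upscaling, assuming `mask_embeddings` is a (1, 256, 64, 64) float tensor as in the question, might look like:

```python
import torch.nn.functional as F

# Upscale the embedding by 4x in each spatial dimension,
# e.g. (1, 256, 64, 64) -> (1, 256, 256, 256), to match the input mask size
upscaled_embeddings = F.interpolate(
    mask_embeddings,
    scale_factor=4,
    mode="bilinear",
    align_corners=False,
)
```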
> What is the reason behind the different shapes and how to get the shape --> torch.Size([1, 256, 64, 64])

The 64x64 shape is the size of the output of...
If the input mask isn't 256x256, then the embedding result won't be 64x64, since it only does a 4x downscale (i.e. it isn't scaling to 64x64 directly). You could either...
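One of those options would presumably be to resize the input mask to 256x256 before encoding, so that the 4x downscale lands on 64x64. A rough sketch, assuming `mask_tensor` is a (1, 1, H, W) float tensor:

```python
import torch.nn.functional as F

# Resize the mask to 256x256 so the encoder's 4x downscale produces 64x64
resized_mask = F.interpolate(
    mask_tensor,
    size=(256, 256),
    mode="bilinear",
    align_corners=False,
)
```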
You can achieve something like this using 'morphological filtering', specifically 'dilation', which is available [using opencv](https://docs.opencv.org/3.4/db/df6/tutorial_erosion_dilatation.html). The code would be something like:

```python
import cv2
import numpy as np

# Structuring element; larger kernels expand the mask further
kernel = np.ones((5, 5), np.uint8)

# Dilation grows the white (masked) regions outward
# (assumes 'mask' is a uint8 image with 0/255 values)
dilated_mask = cv2.dilate(mask, kernel)
```
There are a couple options that might help.

1. Some simple post-processing might be good enough if you just want cleaner looking masks. In particular, morphological filtering (specifically 'closing') can...
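For that first option, a minimal sketch of 'closing' with opencv (assuming a 0/255 uint8 `mask` image, as in the dilation example above) could be:

```python
import cv2
import numpy as np

# 'Closing' = dilation followed by erosion; fills small holes and gaps
kernel = np.ones((5, 5), np.uint8)
closed_mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
```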
The model requires a 1024x1024 RGB image as an input, so a (very!) large 8-10 GB image would be automatically downscaled by the model before the encoding/segmentation steps. Assuming you...
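If memory is a concern before the model even sees the image, one option is to downscale it yourself first, since the model would only work with a 1024x1024 version anyway. A rough sketch with opencv (the file name and target size here are just placeholders):

```python
import cv2

# Pre-downscale a very large image before handing it to the model;
# INTER_AREA tends to give cleaner results when shrinking
image = cv2.imread("huge_image.png")
downscaled = cv2.resize(image, (1024, 1024), interpolation=cv2.INTER_AREA)
```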
As far as I can tell, they are at least loading the image as tiles. For example, you can see the tiles loading in when they zoom out at the...
> What if we want to perform the same but with point prompting and also lets say with images lesser in size

I may be misunderstanding your question, but I...
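For reference, point prompting with the SAM predictor generally looks something like the following (a sketch; the checkpoint path, image file, and point coordinates are all placeholders):

```python
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load the model and set an RGB image
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# One foreground point prompt (label 1 = foreground, 0 = background)
masks, scores, logits = predictor.predict(
    point_coords=np.array([[250, 300]]),
    point_labels=np.array([1]),
)
```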
The SAM repo doesn't include support for exporting the image encoder to onnx; however, there is a discussion of this in issue #16, and one of the users there has...
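For what it's worth, a bare-bones attempt at exporting just the image encoder with `torch.onnx.export` might look like this (an untested sketch; the opset, file names, and input/output names are assumptions, and the export may still fail on unsupported ops, which is presumably why the repo doesn't include it):

```python
import torch
from segment_anything import sam_model_registry

# Build the model and trace the image encoder with a dummy 1024x1024 input
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
dummy_image = torch.randn(1, 3, 1024, 1024)

torch.onnx.export(
    sam.image_encoder,
    dummy_image,
    "sam_image_encoder.onnx",
    input_names=["image"],
    output_names=["image_embeddings"],
    opset_version=17,
)
```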
I haven't used onnx with batches myself, but at least from that link it does look like the onnx model expects a batch dimension. For example, the [onnx inference notebook](https://github.com/AndreyGermanov/sam_onnx_full_export/blob/main/sam_onnx_inference.ipynb)...
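In case it helps, adding the batch dimension is usually just a matter of expanding the input array. A sketch with onnxruntime (the model path and input name are assumptions that would depend on how the model was exported):

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("sam_onnx_example.onnx")

# Give a single (3, 1024, 1024) image a leading batch dimension
single_image = np.random.rand(3, 1024, 1024).astype(np.float32)
batched = np.expand_dims(single_image, axis=0)  # -> (1, 3, 1024, 1024)

outputs = session.run(None, {"image": batched})
```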