heyoeyo comments

Results: 186 comments of heyoeyo

How to read the code of SAM

Usually it's best to start by reading the paper associated with these models. For SAM, you can find the link to the paper on the github page, or get it...

Can SAM be used to segment grayscale image?

> Can SAM be used to segment grayscale image with only one channel?

As-is, the model requires input images with 3 channels (RGB or BGR). So you'd need to convert...
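As a rough sketch of the kind of conversion mentioned above, a single-channel image can be repeated into 3 identical channels. This assumes the image is already loaded as a NumPy array; the helper name `gray_to_3ch` is made up for illustration and is not part of SAM.

```python
import numpy as np


def gray_to_3ch(gray: np.ndarray) -> np.ndarray:
    """Repeat a single-channel (H, W) or (H, W, 1) image into shape (H, W, 3)."""
    # Drop a trailing channel axis of size 1, if present
    if gray.ndim == 3 and gray.shape[2] == 1:
        gray = gray[:, :, 0]
    if gray.ndim != 2:
        raise ValueError(f"expected a single-channel image, got shape {gray.shape}")
    # Copy the same values into each of the 3 channels
    return np.stack([gray, gray, gray], axis=-1)


# Small synthetic grayscale image (stand-in for a real one loaded from disk)
gray_image = np.arange(12, dtype=np.uint8).reshape(3, 4)
rgb_like = gray_to_3ch(gray_image)
```

Since all 3 channels are identical, the RGB vs. BGR ordering doesn't matter for the result. If OpenCV is already in use, `cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB)` should do the same thing.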

Use multiple boxes and multiple points for SAM prompts

The error you're getting is likely due to not having the right shape for the input points & labels. The shape of the inputs when using the `predict_torch` function should...

nonzero MAX_INT

If you don't mind modifying the code, there is a related issue (#554) and corresponding code fix/pull request (#569) that may fix the problem. If you don't want to modify...

Can't load pre-train model from local disk

The error "Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same" is just saying that the input (the image) is on the GPU (cuda) whereas the model weights...
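To illustrate the device mismatch described above, here is a minimal sketch that moves both the model and the input to the same device before running it. The small `nn.Conv2d` is only a stand-in (an assumption for the example, not the actual model from the issue), and the code falls back to the CPU when no GPU is available.

```python
import torch
from torch import nn

# Use the GPU if there is one, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the real model (assumption: any nn.Module behaves the same way here)
model = nn.Conv2d(3, 8, kernel_size=3)
image = torch.rand(1, 3, 32, 32)

# Move both the weights and the input to the same device before the forward pass
model = model.to(device)
image = image.to(device)

with torch.no_grad():
    output = model(image)
```

The key point is that `.to(device)` is applied to both the model and the image, so the input type and the weight type always match.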

Why use the features of last 4 layers?

I've played around with the 4 image encoder outputs and found that the results are not especially sensitive to throwing away some of the outputs. For example, for vit-large, if...

Why use the features of last 4 layers?

I've done a few more experiments with this and found that the Depth-Anything vit-large model can consistently generate these hi-detail outputs by scaling the [fusion steps](https://github.com/LiheYoung/Depth-Anything/blob/6e780749e7772e911754a4eb00965727987f92f7/depth_anything/dpt.py#L127C9-L130C61). For example, for vit-l,...

Why use the features of last 4 layers?

> Btw, when only using the final fusion block, did you use it as...

I can't remember exactly, but I think I did something equivalent to:

```python
layer_1_rn = self.scratch.layer1_rn(layer_1)
...
```

Unable to load pre-trained model, unexpected keyword argument 'antagonists'(has been resolved)

It looks like the pytorch interpolate function got updated between versions [1.10](https://pytorch.org/docs/1.10/generated/torch.nn.functional.interpolate.html) and [1.11](https://pytorch.org/docs/1.11/generated/torch.nn.functional.interpolate.html) to include the `antialias` keyword. So most likely, the error you're getting is from using an...
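If the code needs to run on both sides of that version boundary, one option is to only pass the `antialias` keyword when the installed version supports it. The sketch below is just one way to do that: the helper names `supports_antialias` and `interpolate_kwargs` are hypothetical, and whether `antialias` should be `True` depends on the model being loaded.

```python
import re


def supports_antialias(torch_version: str) -> bool:
    """Return True if the given PyTorch version string is 1.11 or newer."""
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 11)


def interpolate_kwargs(torch_version: str) -> dict:
    """Build keyword arguments for interpolate, skipping antialias on old versions."""
    kwargs = {"mode": "bilinear", "align_corners": False}
    if supports_antialias(torch_version):
        # Assumption: the model expects antialiasing to be enabled
        kwargs["antialias"] = True
    return kwargs


old_kwargs = interpolate_kwargs("1.10.2")
new_kwargs = interpolate_kwargs("2.1.0+cu118")
```

In practice this would be used as something like `F.interpolate(x, size=new_size, **interpolate_kwargs(torch.__version__))`, where `x` and `new_size` are placeholders.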

Are the depth scaling and translation parameters stable in a video?

Based on the video samples on their [project page](https://depth-anything.github.io/), the depth result is still visibly flickery on videos (and therefore the scaling/shift parameters would be changing), though it seems more...
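To make the idea of per-frame scaling/shift parameters more concrete, here is a small sketch that fits a scale and shift by least squares for each frame. It assumes a reference depth map is available to fit against (which the comment above doesn't say), and the data is entirely synthetic. The function name `fit_scale_shift` is made up for the example.

```python
import numpy as np


def fit_scale_shift(pred: np.ndarray, ref: np.ndarray) -> tuple[float, float]:
    """Least-squares fit of (scale, shift) so that scale * pred + shift ~= ref."""
    # Each row of A is [pred_value, 1], so A @ [scale, shift] approximates ref
    A = np.stack([pred.ravel(), np.ones(pred.size)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(A, ref.ravel(), rcond=None)
    return float(scale), float(shift)


# Synthetic reference depth (stand-in for real measurements)
rng = np.random.default_rng(0)
true_depth = rng.uniform(1.0, 5.0, size=(4, 4))

# Two synthetic "predicted" frames built with different, made-up scale/shift values
frames = [(true_depth - 0.5) / 2.0, (true_depth - 0.2) / 3.0]

# Fitting each frame separately recovers a different (scale, shift) pair per frame
params = [fit_scale_shift(frame, true_depth) for frame in frames]
```

If the fitted parameters differ noticeably from one frame to the next, applying a single fixed scale and shift to the whole video would show up as flicker in the resulting depth.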