heyoeyo

Results: 186 comments by heyoeyo

Usually it's best to start by reading the paper associated with these models. For SAM, you can find the link to the paper on the github page, or get it...

> Can SAM be used to segment grayscale image with only one channel?

As-is, the model requires input images with 3 channels (RGB or BGR). So you'd need to convert...
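As a rough illustration of that conversion, here is a minimal sketch that copies a single-channel image into three identical channels using NumPy. The helper name `gray_to_3ch` and the stand-in image are made up for this example; with OpenCV, `cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB)` should give the same result.

```python
import numpy as np

def gray_to_3ch(gray: np.ndarray) -> np.ndarray:
    """Repeat a single-channel image (HxW or HxWx1) into an HxWx3 image.

    Hypothetical helper for illustration; not part of the SAM codebase.
    """
    if gray.ndim == 3 and gray.shape[2] == 1:
        gray = gray[:, :, 0]  # drop the trailing channel axis
    if gray.ndim != 2:
        raise ValueError(f"expected HxW or HxWx1, got shape {gray.shape}")
    # All three channels hold the same intensities, so RGB vs. BGR order does not matter
    return np.stack([gray, gray, gray], axis=-1)

# Stand-in for a real grayscale image loaded from disk
gray_image = np.arange(12, dtype=np.uint8).reshape(3, 4)
rgb_like = gray_to_3ch(gray_image)
```

The result keeps the original dtype and can be passed along wherever a normal 3-channel image would be used.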

The error you're getting is likely due to not having the right shape for the input points & labels. The shape of the inputs when using the `predict_torch` function should...
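For reference, a small sketch of adding the batch dimension to point prompts. It assumes the shapes given in the `predict_torch` docstring as I understand it (BxNx2 for point coordinates and BxN for labels); the helper name `as_batched_prompts` and the example coordinates are made up for illustration.

```python
import torch

def as_batched_prompts(points, labels):
    """Return (points, labels) shaped as BxNx2 and BxN tensors.

    Hypothetical helper; the expected shapes are an assumption based on
    the predict_torch docstring, not something this function verifies
    against the model itself.
    """
    points = torch.as_tensor(points, dtype=torch.float32)
    labels = torch.as_tensor(labels, dtype=torch.int)
    if points.ndim == 2:   # Nx2 -> 1xNx2
        points = points.unsqueeze(0)
    if labels.ndim == 1:   # N -> 1xN
        labels = labels.unsqueeze(0)
    if points.ndim != 3 or points.shape[-1] != 2:
        raise ValueError(f"points must be BxNx2, got {tuple(points.shape)}")
    if labels.shape != points.shape[:2]:
        raise ValueError(f"labels must be BxN, got {tuple(labels.shape)}")
    return points, labels

# Two prompts for one image: a foreground point (1) and a background point (0)
point_coords, point_labels = as_batched_prompts([[100, 200], [150, 220]], [1, 0])
```

Note that `predict_torch` also expects the coordinates to already be mapped into the model's resized input frame, which this sketch does not do.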

If you don't mind modifying the code, there is a related issue (#554) and corresponding code fix/pull request (#569) that may fix the problem. If you don't want to modify...

The error "Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same" is just saying that the input (the image) is on the GPU (cuda) whereas the model weights...
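The usual way to avoid this kind of mismatch is to move both the model and the input to the same device before running it. Here is a minimal sketch using a small stand-in convolution layer rather than the real model (the layer and the random image are placeholders):

```python
import torch
import torch.nn as nn

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the real model and image (placeholders for illustration)
model = nn.Conv2d(3, 8, kernel_size=3, padding=1)
image = torch.rand(1, 3, 32, 32)

# Put the weights and the input on the same device before the forward pass
model = model.to(device)
image = image.to(device)

with torch.no_grad():
    output = model(image)
```

If only one of the two `.to(device)` calls is made on a machine with a GPU, the forward pass fails with an input/weight type mismatch like the one quoted above.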

I've played around with the 4 image encoder outputs and found that the results are not especially sensitive to throwing away some of the outputs. For example, for vit-large, if...

I've done a few more experiments with this and found that the Depth-Anything vit-large model can consistently generate these hi-detail outputs by scaling the [fusion steps](https://github.com/LiheYoung/Depth-Anything/blob/6e780749e7772e911754a4eb00965727987f92f7/depth_anything/dpt.py#L127C9-L130C61). For example, for vit-l,...

> Btw, when only using the final fusion block, did you use it as...

I can't remember exactly, but I think I did something equivalent to:

```python
layer_1_rn = self.scratch.layer1_rn(layer_1)
...
```

It looks like the pytorch interpolate function got updated between versions [1.10](https://pytorch.org/docs/1.10/generated/torch.nn.functional.interpolate.html) and [1.11](https://pytorch.org/docs/1.11/generated/torch.nn.functional.interpolate.html) to include the `antialias` keyword. So most likely, the error you're getting is from using an...

Based on the video samples on their [project page](https://depth-anything.github.io/), the depth result is still visibly flickery on videos (and therefore the scaling/shift parameters would be changing), though it seems more...
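To make the scaling/shift idea concrete, here is a sketch of one common way to estimate those two parameters for a single frame: a least-squares fit of `scale * prediction + shift` against some reference depth. This is not taken from the Depth-Anything code; the function name `fit_scale_shift` and the synthetic data are made up for illustration.

```python
import numpy as np

def fit_scale_shift(pred: np.ndarray, ref: np.ndarray):
    """Least-squares fit of ref ~= scale * pred + shift over all pixels.

    Hypothetical helper for illustration only.
    """
    # Each row is [prediction, 1], so the two unknowns are (scale, shift)
    A = np.stack([pred.ravel(), np.ones(pred.size)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(A, ref.ravel(), rcond=None)
    return float(scale), float(shift)

# Synthetic frame: the "reference" is an exact affine transform of the prediction
rng = np.random.default_rng(0)
frame_pred = rng.random((4, 5))
frame_ref = 2.5 * frame_pred - 0.3

scale, shift = fit_scale_shift(frame_pred, frame_ref)
```

Running a fit like this on every frame of a video and watching how `scale` and `shift` move over time would be one way to quantify the flicker.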