How to get the Feature Confidence?

Open choijm007 opened this issue 7 months ago • 5 comments

We are trying to get the feature confidence from the feature_extractor (DPTHead).

So we tried setting feature_only to False and got the error below:

RuntimeError: Error(s) in loading state_dict for VGGT:
    size mismatch for track_head.feature_extractor.scratch.output_conv1.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 128, 3, 3]).
    size mismatch for track_head.feature_extractor.scratch.output_conv1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).

Is there any way to get the feature confidence?
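
For context, this kind of size mismatch can be reproduced in miniature. The sketch below is not VGGT code: `make_head` is a hypothetical stand-in whose conv width depends on a `feature_only`-style flag, with channel counts chosen only to mirror the 128 vs. 64 shapes in the error. It illustrates that the flag changes the layer shapes, so weights saved under one setting cannot be loaded under the other.

```python
import torch.nn as nn


def make_head(feature_only: bool) -> nn.Module:
    # Hypothetical stand-in for the head: the conv's output width depends on
    # the flag (128 vs. 64, mirroring the shapes in the error message).
    out_channels = 128 if feature_only else 64
    return nn.Conv2d(128, out_channels, kernel_size=3, padding=1)


# Weights saved from a head that was built with feature_only=True ...
checkpoint = make_head(feature_only=True).state_dict()
# ... loaded into a head that was built with feature_only=False.
model = make_head(feature_only=False)

try:
    model.load_state_dict(checkpoint)  # strict loading is the default
    load_error = None
except RuntimeError as exc:
    load_error = str(exc)

print(load_error)
```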

choijm007 avatar May 19 '25 09:05 choijm007

You can just save the feature before the final layers where the confidence is returned. Then you can return both the feature and its corresponding confidence.
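
One way to "save the feature" without editing the head's code is a PyTorch forward hook. The sketch below is a minimal illustration on a toy module, not VGGT's actual DPTHead: `ToyHead`, its layer sizes, and the `1 + exp` confidence activation are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class ToyHead(nn.Module):
    """Hypothetical stand-in for a prediction head (not VGGT's DPTHead)."""

    def __init__(self):
        super().__init__()
        self.output_conv1 = nn.Conv2d(8, 8, kernel_size=3, padding=1)
        # 3 value channels + 1 confidence channel
        self.output_conv2 = nn.Conv2d(8, 4, kernel_size=1)

    def forward(self, x):
        out = self.output_conv2(self.output_conv1(x))
        preds = out[:, :3]
        conf = 1 + out[:, 3:].exp()  # assumed confidence activation
        return preds, conf


head = ToyHead().eval()
captured = {}


def save_feature(module, inputs, output):
    # Store the intermediate feature produced by output_conv1.
    captured["feat"] = output.detach()


handle = head.output_conv1.register_forward_hook(save_feature)
with torch.no_grad():
    preds, conf = head(torch.randn(2, 8, 16, 16))
handle.remove()  # detach the hook once the feature has been captured

feat = captured["feat"]
print(feat.shape, preds.shape, conf.shape)
```

Because the hook does not change any layer, the module's parameters and state_dict keys stay exactly as they were.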

min-hieu-netropy avatar May 20 '25 03:05 min-hieu-netropy

@min-hieu-netropy

Can you explain this sentence: "You can just save the feature before the final layers where confidence is returned."?

In '_forward_impl' in dpt_head.py, if feature_only is True, '_forward_impl' returns a 128-dimensional feature map. But if feature_only is False, it returns a 3-dimensional map and a 1-dimensional confidence for that map. The 3-dimensional map and its confidence depend on the activation, as you can see in point_head and depth_head.

But we can't follow your suggestion, because there is no feature map confidence among the return values of 'forward' in dpt_head.py.

Thank you for your comment. In addition, I like your Mast3r-SLAM! Praise the sun!

PR5GRAMM2R avatar May 20 '25 05:05 PR5GRAMM2R

You can add another flag that saves out before the activation, like this:

        feat = out.clone()  # save the pre-activation feature so it can be returned
        out = self.scratch.output_conv2(out)
        preds, conf = activate_head(out, activation=self.activation, conf_activation=self.conf_activation)

        feat = feat.view(B, S, *feat.shape[1:])  # reshape like preds and conf
        preds = preds.view(B, S, *preds.shape[1:])
        conf = conf.view(B, S, *conf.shape[1:])
        return feat, preds, conf  # return feat together with conf
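
A self-contained toy version of the same idea is sketched below. Everything here is an assumption for illustration: `ToyDPTHead`, the `return_feat` flag, the layer sizes, and the `1 + exp` confidence activation are hypothetical and not part of VGGT. The point is that a flag which only changes what `forward` returns leaves the layer shapes untouched, so existing weights still load.

```python
import torch
import torch.nn as nn


class ToyDPTHead(nn.Module):
    """Hypothetical toy head; only the return signature depends on the flag."""

    def __init__(self, dim=8, return_feat=False):
        super().__init__()
        self.return_feat = return_feat  # hypothetical flag, not in VGGT
        self.output_conv1 = nn.Conv2d(dim, dim, kernel_size=3, padding=1)
        self.output_conv2 = nn.Conv2d(dim, 4, kernel_size=1)  # 3 values + 1 conf

    def forward(self, x):
        B, S, C, H, W = x.shape
        out = self.output_conv1(x.reshape(B * S, C, H, W))
        feat = out  # feature just before the final layer
        out = self.output_conv2(out)
        preds = out[:, :3]
        conf = 1 + out[:, 3:].exp()  # assumed confidence activation

        feat = feat.reshape(B, S, *feat.shape[1:])
        preds = preds.reshape(B, S, *preds.shape[1:])
        conf = conf.reshape(B, S, *conf.shape[1:])
        if self.return_feat:
            return feat, preds, conf
        return preds, conf


plain = ToyDPTHead(return_feat=False)
with_feat = ToyDPTHead(return_feat=True)
# The flag does not alter any layer, so the weights transfer cleanly.
result = with_feat.load_state_dict(plain.state_dict())

x = torch.randn(1, 2, 8, 16, 16)
with torch.no_grad():
    feat, preds, conf = with_feat(x)
print(feat.shape, preds.shape, conf.shape)
```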

min-hieu-netropy avatar May 20 '25 05:05 min-hieu-netropy

@min-hieu-netropy

From this code, it looks like you treat the feature map's confidence and the 3D point map's confidence as the same thing. Is it right to think of the two as equal?

Thanks for your reply.

PR5GRAMM2R avatar May 20 '25 07:05 PR5GRAMM2R

I'm not sure what the feature map's confidence would be. The confidence is learned by training on the point map, but there are no "ground truth" feature maps, so a feature map confidence is not well defined here, as far as I understand. In any case, the feature map and the point map should be spatially correlated, since there is only a conv2d layer between them.
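
If a rough proxy is enough, the spatial alignment can be used directly: when the feature and the confidence come from the same head, they share one pixel grid. The sketch below uses random tensors with made-up shapes and a made-up median threshold, and it treats the point-map confidence as a stand-in mask for the features. That is a heuristic under these assumptions, not a properly defined feature confidence.

```python
import torch

B, S, C, H, W = 1, 2, 8, 16, 16
feat = torch.randn(B, S, C, H, W)   # stand-in for the saved feature map
conf = 1 + torch.rand(B, S, H, W)   # stand-in for the point-map confidence

# conf[b, s, y, x] refers to the same pixel as feat[b, s, :, y, x],
# so a confidence mask can be broadcast over the channel dimension.
mask = conf.unsqueeze(2) > conf.median()   # shape (B, S, 1, H, W)
masked_feat = feat * mask                  # low-confidence pixels become zero

print(masked_feat.shape, mask.float().mean().item())
```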

min-hieu-netropy avatar May 20 '25 07:05 min-hieu-netropy