LightGlue
Batch support
I suppose batch support is not yet implemented? The inference speed is nice, but inputs need to be batched in order to use 100% of the GPU.
Hey. Batch support is there for normal inference, but not for adaptive depth/width. However, at least for adaptive width it should be easy to add batch support. I can have a look into it in the next few days.
Using the match_pair function with input shapes [1, 3, 480, 640], where 1 is the batch size, crashes the code. Maybe match_pair is not the correct function to use?
preds = match_pair(self.extractor, self.matcher, prev_frames_t, frames_t)
Exception has occurred: RuntimeError
Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [1, 1, 3, 480, 640]
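(My guess from the error: match_pair seems to add the batch dimension itself, so an already-batched (1, 3, 480, 640) tensor gets double-batched to (1, 1, 3, 480, 640). Dropping the batch dim avoids the crash, e.g.:)

preds = match_pair(self.extractor, self.matcher, prev_frames_t[0], frames_t[0])  # (3, 480, 640) each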
Any updates on this?
Hi @marisancans,
match_pair does not support batches, but the forward passes of DISK, SuperPoint and LightGlue support batched inputs. However, there are some problems that need to be solved:
- If the input images have different shapes, square padding or cropping is required for DISK/SuperPoint. Unfortunately, both DISK and SuperPoint tend to yield plenty of border artifacts, which could be solved by forwarding a padding mask.
- To batch keypoints/descriptors, you need to ensure that the exact same number of keypoints is detected in each image. You can try reducing the detection_threshold s.t. the extractors always find max_num_keypoints features, or pad the keypoints/descriptors with random numbers (see the sketch below).
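A minimal sketch of the padding option, assuming per-image (N, 2)/(N, D)/(N,) tensors from the extractor (the helper name is illustrative, not part of the repo's API):

import torch

def pad_features(kpts, desc, scores, max_num_keypoints):
    # kpts: (N, 2), desc: (N, D), scores: (N,). Pads up to max_num_keypoints
    # with random keypoints/descriptors and a score of 0, so padded entries
    # can be filtered out again via the score later.
    pad = max_num_keypoints - kpts.shape[0]
    if pad <= 0:
        return kpts, desc, scores
    kpts = torch.cat([kpts, torch.rand(pad, 2, device=kpts.device)], dim=0)
    desc = torch.cat([desc, torch.rand(pad, desc.shape[1], device=desc.device)], dim=0)
    scores = torch.cat([scores, torch.zeros(pad, device=scores.device)], dim=0)
    return kpts, desc, scores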
Assuming you have two sets of batched images, a simple pipeline could look like this (after PR #22 is merged):

from lightglue import LightGlue, SuperPoint

extractor = SuperPoint(max_num_keypoints=1024, detection_threshold=0.0).eval().cuda()
matcher = LightGlue().eval().cuda()
feats0 = extractor({'image': im0})
feats1 = extractor({'image': im1})
matches01 = matcher({'image0': {**feats0, 'image': im0}, 'image1': {**feats1, 'image': im1}})
If detection_threshold is zero, this would then give the same number of keypoints each time, no matter what image pairs are given, right? And then the filtering could be done later on with matches01? This kind of solution would definitely work fine for me. Looking forward to the merge. Thank you for responding, authors of new publications usually ignore repo issues :D
Yes, setting detection_threshold=0.0 should yield the same number of keypoints every time, but there might be some corner cases where fewer than max_num_keypoints keypoints have a detection score > 0. With the filter_threshold parameter you can then set a confidence threshold on the correspondences. This is also the setup we used for training LightGlue.
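For example (a sketch; the exact constructor signature may differ across versions):

# Keep only correspondences whose matching score exceeds 0.2.
matcher = LightGlue(filter_threshold=0.2).eval().cuda()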
Thanks for reporting this issue, very glad to see people using LightGlue :)
Can't quite get it to work, I'm getting index errors. My inputs to the matcher are:
data = {'image0': {**feats0, 'image': image0}, 'image1': {**feats1, 'image': image1}}
matches01 = matcher(data)
(1024 is the feature count, 4 is the batch size.)
Where data looks like:
{
"image0": {
"keypoints": (4, 1024, 2),
"keypoint_scores": (4, 1024),
"descriptors": (4, 1024, 256),
"image": (4, 3, 512, 512)
},
"image1": {
"keypoints": (4, 1024, 2),
"keypoint_scores": (4, 1024),
"descriptors": (4, 1024, 256),
"image": (4, 3, 512, 512)
}
}
Exception has occurred: IndexError
The shape of the mask [4, 1024] at index 0 does not match the shape of the indexed tensor [1, 1024] at index 0
File "/home/ma/src/nerfstudio_preprocess/LightGlue/lightglue/lightglue.py", line 400, in _forward
ind0, ind1 = ind0[mask0][None], ind1[mask1][None]
File "/home/ma/src/nerfstudio_preprocess/LightGlue/lightglue/lightglue.py", line 343, in forward
return self._forward(data)
File "/home/ma/src/nerfstudio_preprocess/lightning_system.py", line 27, in match_pair
matches01 = self.matcher(data)
File "/home/ma/src/nerfstudio_preprocess/lightning_system.py", line 70, in predict_step
r = self.match_pair(frames_t, still_batch_t, feats0, feats1[0])
File "/home/ma/src/nerfstudio_preprocess/gui/callback.py", line 42, in on_predict_batch_end
for res in output_iterator:
File "/home/ma/src/nerfstudio_preprocess/walker.py", line 220, in main
inferencer.predict(ls, dataloader)
File "/home/ma/src/nerfstudio_preprocess/walker.py", line 239, in <module>
main()
IndexError: The shape of the mask [4, 1024] at index 0 does not match the shape of the indexed tensor [1, 1024] at index 0
Here are the shapes at line 400 in lightglue.py:
ind0.shape
torch.Size([1, 1024])
ind1.shape
torch.Size([1, 1024])
mask0.shape
torch.Size([4, 1024])
mask1.shape
torch.Size([4, 1024])
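The mismatch is easy to reproduce in isolation: ind0/ind1 are built for batch size 1 while the masks are batched (an illustration independent of LightGlue):

import torch

ind = torch.arange(1024)[None]     # (1, 1024): built for batch size 1
mask = torch.rand(4, 1024) > 0.5   # (4, 1024): batched boolean mask
ind[mask]  # IndexError: mask [4, 1024] does not match indexed tensor [1, 1024]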
Passing filter_threshold=0.0 to LightGlue doesn't fix this either.
You can reproduce this by trying:
data = {
"image0": {
"keypoints": torch.rand(4, 1024, 2).cuda(),
"descriptors": torch.rand(4, 1024, 256).cuda(),
"image": torch.rand(4, 3, 512, 512).cuda(),
},
"image1": {
"keypoints": torch.rand(4, 1024, 2).cuda(),
"descriptors": torch.rand(4, 1024, 256).cuda(),
"image": torch.rand(4, 3, 512, 512).cuda(),
}
}
matches01 = matcher(data)
Or this:
from lightglue import LightGlue, SuperPoint, DISK
import torch
extractor = SuperPoint(max_num_keypoints=1024, detection_threshold=0.0).eval().cuda()
matcher = LightGlue().eval().cuda()
im0 = torch.rand(4, 3, 512, 512).cuda()
im1 = torch.rand(4, 3, 512, 512).cuda()
feats0 = extractor({'image': im0})
feats1 = extractor({'image': im1})
matches01 = matcher({'image0': {**feats0, 'image': im0}, 'image1': {**feats1, 'image': im1}})
I can add batch support to ind0 and ind1 like this:
ind0 = torch.stack([torch.arange(0, m, device=kpts0.device) for _ in range(b)])
ind1 = torch.stack([torch.arange(0, n, device=kpts0.device) for _ in range(b)])
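(An equivalent loop-free construction, just a sketch using the same b, m, n and kpts0 as in _forward:)

# expand returns a broadcasted view of one arange instead of b copies
ind0 = torch.arange(m, device=kpts0.device)[None].expand(b, -1)
ind1 = torch.arange(n, device=kpts0.device)[None].expand(b, -1)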
Then I get the next error:
Exception has occurred: IndexError
The shape of the mask [4, 1024] at index 0 does not match the shape of the indexed tensor [2, 4, 1, 1024, 64] at index 2
File "/home/ma/src/nerfstudio_preprocess/LightGlue/lightglue/lightglue.py", line 404, in _forward
encoding0 = encoding0[:, :, mask0][:, None]
File "/home/ma/src/nerfstudio_preprocess/LightGlue/lightglue/lightglue.py", line 343, in forward
return self._forward(data)
File "/home/ma/src/nerfstudio_preprocess/lightning_system.py", line 41, in match_pair
matches01 = self.matcher(data)
File "/home/ma/src/nerfstudio_preprocess/lightning_system.py", line 84, in predict_step
r = self.match_pair(frames_t, still_batch_t, feats0, feats1[0])
File "/home/ma/src/nerfstudio_preprocess/gui/callback.py", line 42, in on_predict_batch_end
for res in output_iterator:
File "/home/ma/src/nerfstudio_preprocess/walker.py", line 220, in main
inferencer.predict(ls, dataloader)
File "/home/ma/src/nerfstudio_preprocess/walker.py", line 239, in <module>
main()
IndexError: The shape of the mask [4, 1024] at index 0 does not match the shape of the indexed tensor [2, 4, 1, 1024, 64] at index 2
Hi @marisancans,
with the latest merge we updated the defaults to perform early stopping/pruning by default. You have to manually switch off adaptive depth and width, which lack batch support. Then your code should work again :)
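For example, a minimal sketch (per the README, -1 disables each heuristic; keyword names may differ across versions):

# Turn off adaptive depth (early stopping) and adaptive width (point pruning),
# which currently lack batch support.
matcher = LightGlue(depth_confidence=-1, width_confidence=-1).eval().cuda()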
I'm not sure what I did, but I managed to do it without disabling anything. You can check if it is correct, because I have no idea what I did there, to be honest. Here are the changes I made:
- Added batch support to ind0 and ind1
- Also here
- And here
Now what's left is postprocessing.
Your code might run, but I am quite certain it will not yield the expected outcome. While early stopping (depth_confidence) might work (it would compute the confidence over all image pairs in your batch, and stop if the overall confidence is reached), adaptive width cannot work this way, since we resize descriptors/encodings. I suggest validating your results by visualizing them, just like in the demo notebook.
You are right, the results are weird and not a lot of points are found. For now it looks like I got it working in batch mode, and the performance increase is huge. Thanks for the support!
For future reference in case someone stumbles upon the same problem:
This is how I am doing batching. Get the features:

# Where im0 and im1 are image pairs (B, C, H, W)
extractor = SuperPoint(max_num_keypoints=1024, detection_threshold=0.0).eval().cuda()
feats0 = extractor({ "image": im0 })
feats1 = extractor({ "image": im1 })
Use the matcher:

custom_match_pair(im0, im1, feats0, feats1)
Do the filtering of points. Note that I haven't yet figured out how to do scaling (but I think it's possible; currently both images in a pair are the same size). See also the scaling sketch after the function:
def custom_match_pair(self, image0, image1, feats0, feats1):
    data = {'image0': {**feats0, 'image': image0}, 'image1': {**feats1, 'image': image1}}
    matches01 = self.matcher(data)
    matches0, mscores0 = matches01['matches0'], matches01['matching_scores0']
    pred = {**{k + '0': v for k, v in feats0.items()},
            **{k + '1': v for k, v in feats1.items()},
            **matches01}
    # Maybe needed if you get funky nvidia errors
    # pred = {k: v.detach() if isinstance(v, torch.Tensor) else v for k, v in matches01.items()}
    # if scales0 is not None:
    #     pred['keypoints0'] = (pred['keypoints0'] + 0.5) / scales0[None] - 0.5
    # if scales1 is not None:
    #     pred['keypoints1'] = (pred['keypoints1'] + 0.5) / scales1[None] - 0.5
    # del feats0, feats1
    # torch.cuda.empty_cache()
    # create match indices
    # matches0, mscores0 = pred['matches0'], pred['matching_scores0']  # matching_scores
    m_kpts0_all = []
    m_kpts1_all = []
    matching_scores_all = []
    valid = matches0 > -1
    for v, m, kp0, kp1, scores in zip(valid, matches0, pred['keypoints0'], pred['keypoints1'], mscores0):
        matches = torch.stack([torch.where(v)[0], m[v]], -1)
        m_kpts0, m_kpts1 = kp0[matches[..., 0]], kp1[matches[..., 1]]
        matching_scores = scores[v]
        m_kpts0_all.append(m_kpts0)
        m_kpts1_all.append(m_kpts1)
        matching_scores_all.append(matching_scores)
    return m_kpts0_all, m_kpts1_all, matching_scores_all
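A hedged sketch of the scaling step mentioned above, generalizing the commented-out scales logic in the function to batches (scales0/scales1 of shape (B, 2) are an assumption, not something the extractor returns in this form):

import torch

def unscale_keypoints(kpts, scales):
    # kpts: (B, N, 2) in resized-image coordinates; scales: (B, 2) holding
    # (scale_x, scale_y) per image. Mirrors the commented formula above,
    # broadcast over the keypoint dimension.
    return (kpts + 0.5) / scales[:, None, :] - 0.5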
Hi, I do pad the keypoints with 0, but I get a low accuracy, around 5%. Is there a better way to solve it? Because my dataset's images are quite small, I set max_num_keypoints = 128 to avoid the problem.