Error when using Roma
Hi, and thanks for this code! I have installed in conda and am running the command:
python main.py --dir assets/test --pipeline roma
Features are extracted, but the matcher fails with:
2024-10-04 10:33:28 | [INFO ] Features extracted!
2024-10-04 10:33:28 | [INFO ] Matching features with roma...
2024-10-04 10:33:28 | [INFO ] roma configuration:
{'name': 'roma', 'pretrained': 'outdoor'}
2024-10-04 10:33:28 | [INFO ] Matching features...
2024-10-04 10:33:28 | [INFO ]
0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\B\deep-image-matching\main.py", line 49, in <module>
    match_path = img_matching.match_pairs(feature_path)
  File "C:\Users\B\deep-image-matching\src\deep_image_matching\image_matching.py", line 427, in match_pairs
    self._matcher.match(
  File "C:\Users\B\deep-image-matching\src\deep_image_matching\matchers\roma.py", line 89, in match
    matches = self._match_pairs(self._feature_path, img0, img1)
  File "F:\Conda\envs\deep-image-matching\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\B\deep-image-matching\src\deep_image_matching\matchers\roma.py", line 165, in _match_pairs
    warp, certainty = self.matcher.match(
  File "F:\Conda\envs\deep-image-matching\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\B\deep-image-matching\src\deep_image_matching\thirdparty\RoMa\roma\models\matcher.py", line 709, in match
    im_A, im_B = test_transform((im_A, im_B))
  File "C:\Users\B\deep-image-matching\src\deep_image_matching\thirdparty\RoMa\roma\utils\utils.py", line 292, in __call__
    im_tuple = t(im_tuple)
  File "C:\Users\B\deep-image-matching\src\deep_image_matching\thirdparty\RoMa\roma\utils\utils.py", line 209, in __call__
    return [self.to_tensor(im) for im in im_tuple]
  File "C:\Users\B\deep-image-matching\src\deep_image_matching\thirdparty\RoMa\roma\utils\utils.py", line 209, in <listcomp>
    return [self.to_tensor(im) for im in im_tuple]
  File "C:\Users\B\deep-image-matching\src\deep_image_matching\thirdparty\RoMa\roma\utils\utils.py", line 194, in __call__
    im = np.array(im, dtype=np.float32).transpose((2, 0, 1))
ValueError: axes don't match array
I am attaching the config here as well. What can I do to resolve this? Thanks!
Hi, thanks for reporting. Are you on the dev branch? Could you try running this basic example to see if you get the same issue:
>python ./main.py -d .\assets\example_cyprus -p roma --skip_reconstruction --force
I installed via pip, inside conda. Will check with that command and report back. Thanks!
Hi, I think I have found the issue. With the Cyprus dataset things look okay, but with a custom dataset where the images are of different sizes, I get the error.
Could the different sizes be the problem here? How can I get around it?
Hi, what is the size of the images?
It is a 3-image dataset:
2 images at 1280 x 800 and 1 image at 1920 x 1080.
More testing: even if I remove the 1920x1080 image, I see the same error. With LoFTR it runs perfectly, so this is specific to the roma pipeline.
Are your images grayscale or RGB?
greyscale
Could you try with any RGB images that you have? Maybe grayscale input is the issue. In that case we could convert grayscale to RGB, which should solve the problem.
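The traceback ends in `np.array(im, dtype=np.float32).transpose((2, 0, 1))`, which assumes a height x width x channels array. A grayscale image loads as a 2-D array, so there is no third axis to move. Below is a minimal sketch of the conversion using NumPy only; `to_three_channels` is a hypothetical helper name, not a function from deep-image-matching or RoMa.

```python
import numpy as np


def to_three_channels(image: np.ndarray) -> np.ndarray:
    """Return an H x W x 3 array, replicating the gray channel if needed."""
    if image.ndim == 2:
        # Grayscale stored as H x W: copy the single channel three times.
        return np.stack([image] * 3, axis=-1)
    if image.ndim == 3 and image.shape[2] == 1:
        # Grayscale stored as H x W x 1.
        return np.repeat(image, 3, axis=2)
    # Already multi-channel: return unchanged.
    return image


# Dummy grayscale image with the same size as the 1280 x 800 images.
gray = np.zeros((800, 1280), dtype=np.float32)

try:
    gray.transpose((2, 0, 1))  # what the failing line does to a 2-D array
except ValueError as err:
    print("grayscale fails:", err)

rgb = to_three_channels(gray)
chw = rgb.transpose((2, 0, 1))  # channels-first layout now works
print(chw.shape)
```

When the images are converted on disk instead, Pillow's `Image.open(path).convert("RGB")` does the same job.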
That resolved it. Thank you!
I have a workflow question if you have a minute...
My dataset is 3 images, two are from a stereo camera, and one is from a completely different camera.
As I know the extrinsics for the stereo camera, I am doing the following steps:
1. Match the stereo pair with Roma.
2. Create an initial model with COLMAP.
3. Run rig_bundle_adjuster to constrain the model from the stereo intrinsics and give world scale.
Now I need to add the third image to the dataset and bundle adjust it. What is the best way to do this?
The reason I am going to all this trouble is that when throwing all images in at once, the resulting camera poses are incorrect (I am more interested in camera poses than points for this case).
Thanks!
Hi, have you tried with SuperPoint instead of Roma? You know the relative pose between the two stereo cameras, but do you have reliable intrinsics and distortion parameters for all three cameras?
Hi, SuperPoint seems to initialise more easily with the dataset and looks good when running just the stereo pair. Running all three images at once still gives incorrect results. I have intrinsics and extrinsics for the stereo pair. The goal here is to find the relative transform to the third camera.
When you say incorrect, do you mean completely incorrect, or incorrect with a certain but not very large error? In general, for this kind of procedure a good solution is to move the triplet of cameras around a scene with good texture, so that the final estimate is reliable.
Thank you! To link images with their camera, is the best way to create a yaml config? Or is there an each-camera-has-an-image-directory option?
There are different ways to go; the easiest is to put the images from each camera in a separate subfolder. Then, when you run rig_bundle_adjuster, you have to pass a config file like the following. For instance, cam0 is the name of the first subfolder. The two lines cam_from_rig_rotation and cam_from_rig_translation are optional. If you omit them, COLMAP will try to estimate the relative poses. In your case this is the scenario for the third camera.
[
  {
    "ref_camera_id": 1,
    "cameras":
    [
      {
        "camera_id": 1,
        "image_prefix": "cam0",
        "cam_from_rig_rotation": [1, 0, 0, 0],
        "cam_from_rig_translation": [0, 0, 0]
      },
      {
        "camera_id": 2,
        "image_prefix": "cam1",
        "cam_from_rig_rotation": [1, 0, 0, 0],
        "cam_from_rig_translation": [0.120, 0, 0]
      }
    ]
  }
]
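For the three-camera case discussed in this thread, the same file could be generated from a script. This is only a sketch: the third subfolder name `cam2`, its camera id 3, and the output file name `rig_config.json` are assumptions for illustration, while the stereo values are copied from the example. The third camera leaves out the two optional pose entries so that COLMAP estimates its pose relative to the rig.

```python
import json
from pathlib import Path

# Stereo pair with a known relative pose (values copied from the example).
# Rotations are quaternions; [1, 0, 0, 0] is the identity rotation.
stereo_cameras = [
    {
        "camera_id": 1,
        "image_prefix": "cam0",
        "cam_from_rig_rotation": [1, 0, 0, 0],
        "cam_from_rig_translation": [0, 0, 0],
    },
    {
        "camera_id": 2,
        "image_prefix": "cam1",
        "cam_from_rig_rotation": [1, 0, 0, 0],
        "cam_from_rig_translation": [0.120, 0, 0],
    },
]

# Third camera: hypothetical id and subfolder name, no pose entries,
# so its relative pose is left for COLMAP to estimate.
third_camera = {"camera_id": 3, "image_prefix": "cam2"}

rig_config = [{"ref_camera_id": 1, "cameras": stereo_cameras + [third_camera]}]

# Hypothetical output location for the config file.
config_path = Path("rig_config.json")
config_path.write_text(json.dumps(rig_config, indent=2))
print(config_path.read_text())
```

In COLMAP versions that ship rig_bundle_adjuster, the file is usually passed with `--rig_config_path` alongside `--input_path` and `--output_path`; it is worth checking `colmap rig_bundle_adjuster -h` for the installed version before relying on these flags.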
Hi, I am closing the issue. Feel free to reopen it if you find any other problem, and feel free to contribute to the project!