
Instance / Semantic Segmentation

Open abrahamezzeddine opened this issue 10 months ago • 5 comments

I’m currently working with COLMAP, and I love it!

My plan is to perform object segmentation on the same set of images.

My goal is to integrate these two processes. First, I will use COLMAP to create a sparse/dense 3D model from the original, unsegmented images. The segmentation model, on the other hand, will process the same set of images to extract the pixel coordinates of specific objects.

The next step is a comparative analysis: I will match the pixel coordinates from the segmented images against the keypoints detected by COLMAP and cross-reference them with the point cloud. This will allow me to annotate the COLMAP-generated 3D point cloud with semantic labels based on the segmentation data.

I’m particularly interested in understanding how to efficiently extract the detected points and their corresponding pixel coordinates from COLMAP for this purpose.

I hope I have explained the task clearly. Thank you, and have a good day.

abrahamezzeddine avatar Apr 27 '24 14:04 abrahamezzeddine

You can check this document:

https://colmap.github.io/format.html

What you want is saved in points3D.txt and images.txt.
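
For reference, a minimal sketch of parsing images.txt to map each 3D point ID to its pixel observations (assuming the default text export described in the format doc, and image filenames without spaces; the function name is just for illustration):

from collections import defaultdict

def read_observations(images_txt_path):
    """Map point3D_id -> list of (image_name, x, y) pixel observations."""
    p3d_to_obs = defaultdict(list)
    with open(images_txt_path) as f:
        lines = [l.strip() for l in f if not l.startswith("#")]
    # images.txt has two lines per image: a header
    # (IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME) followed by
    # all its 2D points as (X, Y, POINT3D_ID) triples.
    for header, points in zip(lines[0::2], lines[1::2]):
        name = header.split()[9]
        elems = points.split()
        for x, y, p3d_id in zip(elems[0::3], elems[1::3], elems[2::3]):
            if p3d_id != "-1":  # -1 marks keypoints without a triangulated 3D point
                p3d_to_obs[int(p3d_id)].append((name, float(x), float(y)))
    return p3d_to_obs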

jytime avatar Apr 27 '24 14:04 jytime

Or just use pycolmap to inspect the resulting COLMAP model and fuse the semantic information of the different observations for each 3D point: https://github.com/colmap/colmap/tree/main/pycolmap#reconstruction-object
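
For example, a minimal sketch of walking the reconstruction with pycolmap (the folder path is a placeholder):

import pycolmap

rec = pycolmap.Reconstruction("path/to/sfm/folder")
print(rec.summary())

# Each 3D point stores its track: which images observe it and at which keypoint.
for p3d_id, p3d in rec.points3D.items():
    for el in p3d.track.elements:
        image = rec.images[el.image_id]
        x, y = image.points2D[el.point2D_idx].xy  # pixel coordinates in that image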

sarlinpe avatar Apr 28 '24 16:04 sarlinpe

@sarlinpe any examples on how to do it?

I would love to run GroundedSAM over the input images and re-project the results onto the 3D point cloud.

lucascassiano avatar May 02 '24 17:05 lucascassiano

For regular semantic segmentation:

import pycolmap
import numpy as np
from collections import defaultdict

rec = pycolmap.Reconstruction("path/to/sfm/folder")
p3d_to_observations = defaultdict(list)
for image in rec.images.values():
  segmentation = get_segmentation(image.name)  # class probabilities, (H, W, C)
  for p2d_id in image.get_valid_point2D_ids():
    p2d = image.points2D[p2d_id]
    x, y = p2d.xy
    # the segmentation array is indexed as (row, col), i.e. (y, x)
    semantics = segmentation[int(y), int(x)]
    p3d_to_observations[p2d.point3D_id].append(semantics)
p3d_to_fused = {k: np.mean(v, 0) for k, v in p3d_to_observations.items()}  # fuse the semantic probs
p3d_to_class = {k: np.argmax(v) for k, v in p3d_to_fused.items()}  # pick top prob
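
To attach the fused labels to the 3D points afterwards, e.g. for export or visualization, a sketch continuing from the variables above (the output filename is arbitrary):

import numpy as np

# Continuing from the snippet above: collect XYZ and fused class per 3D point.
ids = sorted(p3d_to_class)
xyz = np.array([rec.points3D[i].xyz for i in ids])  # (N, 3) point positions
labels = np.array([p3d_to_class[i] for i in ids])   # (N,) winning class indices
np.savez("labeled_points.npz", point3D_ids=ids, xyz=xyz, labels=labels)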

sarlinpe avatar May 02 '24 18:05 sarlinpe

> @sarlinpe any examples on how to do it?
>
> I would love to run GroundedSAM over the input images and re-project the results onto the 3D point cloud.

Any success with this?

abrahamezzeddine avatar Jun 19 '24 17:06 abrahamezzeddine