CLEVRER Mapping from 3D object coordinates to 2D pixel coordinates

Open Natithan opened this issue 2 years ago • 1 comments

Hi,

I want to find the function that maps a 3D object coordinate from the annotations to a (pix_height, pix_width) position in the 320x480 pixel output image. For example, CLEVRER/train/annotation_train/annotation_00000-01000/annotation_00000.json['motion_trajectory'][0]['objects'][0]['location'] holds a value such as (coord_x, coord_y, coord_z) = (1.3234, 2.7147, 0.2).

I can approximate it roughly linearly with

       pix_width = int(coord_y * 80 + 240)
       pix_height = int(coord_x * 34 + 160)

but it seems the relation isn't exactly linear.

Hence, I was wondering whether there is a ground-truth mapping I overlooked somewhere that would give me the exact correspondence :)

Thanks!

Jun 15 '22 15:06 Natithan
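
One possible reason the linear fit drifts: if the videos were rendered with a perspective camera, the mapping involves a division by depth, so no purely linear formula can be exact. If, in addition, the object centers all lie on one plane (the example location above has coord_z = 0.2, but whether that holds for every object is an assumption), the plane-to-image mapping is a homography, which can be estimated from four known (coord_x, coord_y) to (pix_width, pix_height) pairs. The sketch below is only an illustration under those assumptions; the function names are made up for this example, and it is not a ground-truth mapping shipped with the dataset.

```python
def solve_linear(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences (collinear points?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(world_xy, pixel_uv):
    """Fit a 3x3 homography H (with H[2][2] fixed to 1) from exactly four
    (coord_x, coord_y) -> (pix_width, pix_height) correspondences."""
    if len(world_xy) != 4 or len(pixel_uv) != 4:
        raise ValueError("this sketch expects exactly four correspondences")
    a, b = [], []
    for (x, y), (u, v) in zip(world_xy, pixel_uv):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def project(H, x, y):
    """Map a point on the plane to (pix_width, pix_height) as floats."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v
```

The four correspondences would have to come from somewhere else, for example object centers read off a few frames by hand. With more than four pairs, a least-squares fit would be more robust than this exact four-point solve.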

As a follow-up question: the README links to a download of visual masks. I was thinking I could use the mask annotations to get object-center pixel locations. When I extract the tar.gz at that link, I get one JSON file per video. I couldn't figure out where the mask information is stored in these files; I'm thinking it's maybe in the 'counts' field, whose value is a long, random-looking string, but I'm not sure how to decode that string. Could you help with this? :)

Jun 16 '22 15:06 Natithan
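
A field named 'counts' holding an opaque string, usually next to a 'size' field, matches the compressed run-length encoding (RLE) used by the COCO mask API. If that is what these files contain (an assumption, since the thread does not confirm the format), `pycocotools.mask.decode` can turn each such entry into a binary array. The dependency-free sketch below follows the same decoding scheme to show what the string holds; the function names are made up for this example, and `height`/`width` are assumed to come from the accompanying 'size' field.

```python
def decode_counts(s):
    """Decode a COCO-style compressed RLE string into a list of run lengths."""
    counts, p = [], 0
    while p < len(s):
        x, k, more = 0, 0, True
        while more:
            c = ord(s[p]) - 48            # each character carries 6 bits
            x |= (c & 0x1F) << (5 * k)    # low 5 bits are payload
            more = bool(c & 0x20)         # bit 6 flags a continuation
            p += 1
            k += 1
            if not more and (c & 0x10):   # sign-extend negative values
                x |= -1 << (5 * k)
        if len(counts) > 2:               # runs after the third are stored
            x += counts[-2]               # as a delta to the run two back
        counts.append(x)
    return counts


def rle_to_mask(counts, height, width):
    """Expand run lengths into a height x width list of 0/1 rows.
    Runs alternate 0s and 1s, starting with 0s, in column-major order."""
    flat, value = [], 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not match the mask size")
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]


def mask_centroid(mask):
    """Return (pix_height, pix_width) of the mask's center of mass, or None."""
    pts = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pts:
        return None
    n = len(pts)
    return sum(r for r, _ in pts) / n, sum(c for _, c in pts) / n
```

For a 320x480 frame, `mask_centroid(rle_to_mask(decode_counts(counts_string), 320, 480))` would give a center pixel for one object. Note that a mask centroid covers only the visible part of an object, so it may not coincide with the projected 3D center under occlusion.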