FoundationPose: Output Discrepancy in run_ycb.py Testing

Description: When running run_ycb.py on the BOP test data, a small proportion of the predicted poses differ in sign from the ground-truth rotation matrix or translation vector.

Problem Statement: Approximately 0.03% of the outputs from run_ycb.py show a sign discrepancy in the rotation matrix or translation vector when compared with the ground truth.

Request for Clarification:

Is the observed discrepancy within the expected margin of error for the algorithm, or have I made a mistake in my testing?

Apr 23 '24 18:04 Sar-thak-3

This does not look correct. If the translation is negated, the pose can be way off. In which cases did this happen?

Apr 24 '24 00:04 wenbowen123
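To illustrate why a negated translation is a serious error: in a camera frame where +z points forward (the OpenCV-style convention used by BOP ground truth, assumed here for the predictions as well), negating the translation places the object behind the camera. The helper names below are illustrative and not part of FoundationPose, and the pose values are synthetic.

```python
import numpy as np


def behind_camera(pose):
    """True if the object origin has non-positive depth (z) in the camera frame."""
    pose = np.asarray(pose, dtype=float).reshape(4, 4)
    return bool(pose[2, 3] <= 0.0)


def negate_translation(pose):
    """Return a copy of the 4x4 pose with its translation column negated."""
    flipped = np.array(pose, dtype=float).reshape(4, 4)
    flipped[:3, 3] *= -1.0
    return flipped


if __name__ == "__main__":
    # Synthetic pose: identity rotation, object 0.8 units in front of the camera
    pose = np.eye(4)
    pose[:3, 3] = [0.1, -0.05, 0.8]
    print(behind_camera(pose))
    print(behind_camera(negate_translation(pose)))
```

A predicted pose with non-positive depth cannot correspond to a visible object, so such cases are worth inspecting individually.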

Hi @wenbowen123, thank you for your support. I am attaching the YAML output file generated by running run_ycb.py. The very first predicted pose already shows a discrepancy between the ground truth and the prediction. https://drive.google.com/file/d/1AOPjhIhh4AcyOldk8AnumJc8_PxlVBtv/view?usp=sharing

I used the following Python script to build a map showing where the discrepancies occur across all predictions.

import yaml
import json
import numpy as np

def check_opposite_signs(array1, array2):
    # True only if EVERY pair of corresponding elements differs in sign.
    # np.sign returns 0 for exact zeros, so a zero vs. non-zero pair also counts as different.
    return np.all(np.sign(array1) != np.sign(array2))

def extract_4x4_arrays_from_yaml(yaml_file):
    with open(yaml_file, 'r') as file:
        yml_data = yaml.safe_load(file)
    
    
    opposite_sign_map = {}

    total_count = 0
    opp_sign_count = 0

    for key, value in yml_data.items():
        video_dir = f"0000{key}"
        with open(f"scene_gt_{video_dir}.json", 'r') as file:
            json_data = json.load(file)
        
        opposite_sign_map[video_dir] = {}

        for key1,value1 in value.items():
            img_id = key1.lstrip('0')

            opposite_sign_map[video_dir][img_id] = {}

            i = 0  # NOTE: never incremented below, so only the first ground-truth entry of each image is compared
            for key2,value2 in value1.items():

                total_count += 1

                wrong = False

                obj_id = int(key2)

                cam_r = np.array(json_data[img_id][i]["cam_R_m2c"]).reshape((3,3))
                cam_t = np.array(json_data[img_id][i]["cam_t_m2c"])
                object_id_from_json = json_data[img_id][i]["obj_id"]

                four_cross_array = np.array(value2)

                opposite_sign_map[video_dir][img_id][obj_id] = []

                if(obj_id==object_id_from_json):
                    if(check_opposite_signs(cam_r[:,0],four_cross_array[:3,0])):
                        opposite_sign_map[video_dir][img_id][obj_id].append("0R")
                        wrong = True
                    if(check_opposite_signs(cam_r[:,1],four_cross_array[:3,1])):
                        opposite_sign_map[video_dir][img_id][obj_id].append("1R")
                        wrong = True
                    if(check_opposite_signs(cam_r[:,2],four_cross_array[:3,2])):
                        opposite_sign_map[video_dir][img_id][obj_id].append("2R")
                        wrong = True
                    if(check_opposite_signs(cam_t,four_cross_array[:3,3])):
                        opposite_sign_map[video_dir][img_id][obj_id].append("T")
                        wrong = True

                opp_sign_count += wrong


    
    return opposite_sign_map,opp_sign_count/total_count

# Example usage:
yaml_file = "ycbv_res.yml"
opposite_sign_map, wrong_sign_prob = extract_4x4_arrays_from_yaml(yaml_file)
print(wrong_sign_prob)
# print(opposite_sign_map)
json_file_path = "opposite_map_all.json"

# Export the map to JSON file
with open(json_file_path, 'w') as json_file:
    json.dump(opposite_sign_map, json_file, indent=4)

The script generates the following output file: opposite_map_all.json

Format of the JSON file:

{
    "video_dir": {
        "image_id": {
            "object_id": [
                "0R means opposite signs in 1st column of rotation matrix",
                "1R means opposite signs in 2nd column of rotation matrix",
                "2R means opposite signs in 3rd column of rotation matrix",
                "T means opposite signs in translation vector"
            ]
        }
    }
}

Apr 24 '24 14:04 Sar-thak-3
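A sign comparison is sensitive to matrix entries near zero and does not say how far a pose is off. As a complementary check, the sketch below looks up the ground-truth entry by obj_id (instead of by list position) and reports the rotation angle and translation distance between prediction and ground truth. The helper names find_gt_entry and pose_errors and the pred_to_mm factor are illustrative assumptions, not part of FoundationPose. BOP stores cam_t_m2c in millimetres; the unit of the predictions in ycbv_res.yml should be verified before choosing the factor.

```python
import numpy as np


def find_gt_entry(gt_entries, obj_id):
    """Return the first ground-truth dict whose obj_id matches, else None."""
    for entry in gt_entries:
        if entry["obj_id"] == obj_id:
            return entry
    return None


def pose_errors(pred_pose, gt_entry, pred_to_mm=1000.0):
    """Return (rotation error in degrees, translation error in mm).

    pred_to_mm is an ASSUMPTION: it converts the predicted translation to
    millimetres (1000.0 if predictions are in metres, 1.0 if already in mm).
    """
    pred_pose = np.asarray(pred_pose, dtype=float).reshape(4, 4)
    R_gt = np.asarray(gt_entry["cam_R_m2c"], dtype=float).reshape(3, 3)
    t_gt = np.asarray(gt_entry["cam_t_m2c"], dtype=float).reshape(3)

    # Geodesic angle of the relative rotation R_pred @ R_gt^T
    R_rel = pred_pose[:3, :3] @ R_gt.T
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    rot_err_deg = float(np.degrees(np.arccos(cos_angle)))

    # Euclidean distance between the translations, both expressed in mm
    t_err_mm = float(np.linalg.norm(pred_pose[:3, 3] * pred_to_mm - t_gt))
    return rot_err_deg, t_err_mm


if __name__ == "__main__":
    # Synthetic example (not taken from the dataset): identity ground truth,
    # prediction rotated about z and shifted along x.
    gt_entries = [{"obj_id": 5,
                   "cam_R_m2c": np.eye(3).ravel().tolist(),
                   "cam_t_m2c": [0.0, 0.0, 500.0]}]
    pred = np.eye(4)
    pred[:3, :3] = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
    pred[:3, 3] = [0.01, 0.0, 0.5]
    print(pose_errors(pred, find_gt_entry(gt_entries, 5)))
```

For symmetric YCB-V objects, a large rotation angle may still correspond to a visually correct pose, so the rotation error should be read together with the translation error and the visualization.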

Thanks, I will check after finishing a deadline. For those abnormal scenes, can you select one, run ONLY on it, and check the visualization? You can raise the debug level to >= 3 for more verbose logging.

Apr 25 '24 18:04 wenbowen123
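To pick a single abnormal scene for such a focused run, the discrepancy map can be flattened into a list of flagged entries and counted per scene. The sketch below assumes the opposite_map_all.json layout described earlier in this thread; the function names are illustrative, and the example map is synthetic.

```python
import json


def load_map(path):
    """Load a discrepancy map such as opposite_map_all.json."""
    with open(path, "r") as f:
        return json.load(f)


def flagged_entries(opposite_sign_map):
    """Return (video_dir, img_id, obj_id, flags) for every non-empty flag list."""
    rows = []
    for video_dir, images in opposite_sign_map.items():
        for img_id, objects in images.items():
            for obj_id, flags in objects.items():
                if flags:
                    rows.append((video_dir, img_id, obj_id, flags))
    return rows


def flags_per_scene(rows):
    """Count flagged objects per video_dir."""
    counts = {}
    for video_dir, _img_id, _obj_id, _flags in rows:
        counts[video_dir] = counts.get(video_dir, 0) + 1
    return counts


if __name__ == "__main__":
    # Synthetic map for illustration; with real data call
    # load_map("opposite_map_all.json") instead.
    example = {
        "000048": {"1": {"2": ["0R", "T"], "5": []}},
        "000049": {"7": {"3": []}},
    }
    rows = flagged_entries(example)
    print(rows)
    print(flags_per_scene(rows))
```

The scene with the most flagged objects, or any single flagged frame, is a reasonable candidate to rerun in isolation.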
