InstantMesh

How to Obtain the 'good_objs' List or Filtering Code for Objaverse Dataset

Open joeybchen opened this issue 1 year ago • 3 comments

Hi,

First of all, thank you for your incredible work on this project! I am working on training the LRM model, but I am having some trouble with the dataset filtering process mentioned in the code.

In src.data.objaverse, you define the ObjaverseData class, which uses a JSON file (valid_paths.json) containing a list of 'good_objs':

class ObjaverseData(Dataset):
    def __init__(self,
        root_dir='objaverse/',
        meta_fname='valid_paths.json',
        input_image_dir='rendering_random_32views',
        target_image_dir='rendering_random_32views',
        input_view_num=6,
        target_view_num=4,
        total_view_n=32,
        fov=50,
        camera_rotation=True,
        validation=False,
    ):
        self.root_dir = Path(root_dir)
        self.input_image_dir = input_image_dir
        self.target_image_dir = target_image_dir

        self.input_view_num = input_view_num
        self.target_view_num = target_view_num
        self.total_view_n = total_view_n
        self.fov = fov
        self.camera_rotation = camera_rotation

        with open(os.path.join(root_dir, meta_fname)) as f:
            filtered_dict = json.load(f)
        paths = filtered_dict['good_objs']
        self.paths = paths
        
        self.depth_scale = 6.0
            
        total_objects = len(self.paths)
        print('============= length of dataset %d =============' % len(self.paths))
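
Judging only from the snippet above, the loader expects a JSON object with a 'good_objs' key that holds a list. Below is a minimal sketch of writing such a file. The helper name write_valid_paths and the example identifiers are placeholders, since the actual entry format (UIDs, relative paths, or something else) is not shown in this thread.

```python
import json
import os


def write_valid_paths(root_dir, good_objs, meta_fname="valid_paths.json"):
    """Write a metadata file shaped like the one ObjaverseData reads.

    Hypothetical helper: only the top-level 'good_objs' key is taken from
    the loader code; the format of each entry is an assumption.
    """
    os.makedirs(root_dir, exist_ok=True)
    meta_path = os.path.join(root_dir, meta_fname)
    with open(meta_path, "w") as f:
        json.dump({"good_objs": list(good_objs)}, f, indent=2)
    return meta_path


def read_good_objs(root_dir, meta_fname="valid_paths.json"):
    """Mirror the loading step from ObjaverseData.__init__."""
    with open(os.path.join(root_dir, meta_fname)) as f:
        filtered_dict = json.load(f)
    return filtered_dict["good_objs"]
```

Calling write_valid_paths('objaverse/', ['example_uid_1', 'example_uid_2']) would then produce a file that the constructor's default arguments can open.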

Would it be possible for you to either:

  • Share the valid_paths.json file (or just the 'good_objs' list)?
  • Or, provide the code you used to generate this list based on the filtering criteria mentioned in the paper?

I am aware of the filtering goals described in the paper (Sec. 3.2), which include:

"The filtering goal is to remove objects that satisfy any of the following criteria: (i) objects without texture maps, (ii) objects with rendered images occupying less than 10% of the view from any angle, (iii) including multiple separate objects, (iv) objects with no caption information provided by the Cap3D dataset, and (v) low-quality objects."

However, I am unsure how to translate these criteria into the correct filtering logic for generating the 'good_objs' list. Any guidance or clarification would be greatly appreciated!
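
To make the question concrete, the five criteria could be turned into a filter roughly like the sketch below. This is not the authors' code. It assumes a preprocessing pass has already produced per-object statistics, and every field name (has_texture, min_view_coverage, num_components, caption, quality_score) as well as the quality threshold is hypothetical. Criterion (ii) is interpreted here as "drop the object if any rendered view covers less than 10% of the image", which is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ObjectStats:
    """Hypothetical per-object statistics; none of these fields come from the repo."""
    uid: str
    has_texture: bool          # (i) object has texture maps
    min_view_coverage: float   # (ii) smallest image fraction covered over all views
    num_components: int        # (iii) number of separate objects in the asset
    caption: Optional[str]     # (iv) Cap3D caption, None or empty if missing
    quality_score: float       # (v) assumed quality score in [0, 1]


def is_good(obj: ObjectStats,
            min_coverage: float = 0.10,
            min_quality: float = 0.5) -> bool:
    """Return True only if the object violates none of the five criteria."""
    if not obj.has_texture:                    # (i)
        return False
    if obj.min_view_coverage < min_coverage:   # (ii)
        return False
    if obj.num_components > 1:                 # (iii)
        return False
    if not obj.caption:                        # (iv)
        return False
    if obj.quality_score < min_quality:        # (v) threshold is a placeholder
        return False
    return True


def select_good_objs(stats: Iterable[ObjectStats]) -> List[str]:
    """Collect the identifiers of all objects that pass the filter."""
    return [obj.uid for obj in stats if is_good(obj)]
```

The resulting list could then be stored under the 'good_objs' key of valid_paths.json. How criteria (iii) and (v) were actually measured is exactly the part that remains unclear.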

Thank you so much for your help!

joeybchen avatar Oct 16 '24 10:10 joeybchen

Hi there! Did you find anything? And what about the rendering_random_32views data?

mohamed2020m avatar Nov 16 '24 15:11 mohamed2020m

https://github.com/Mrguanglei/Instantmesh_scriptData

Mrguanglei avatar Dec 27 '24 08:12 Mrguanglei

https://github.com/Mrguanglei/Instantmesh_scriptData

Thanks, I will try it!

mohamed2020m avatar Dec 27 '24 14:12 mohamed2020m