AnyDoor Question about MVImageNet Dataset (3.4 TB)

Dear authors,

Could you please tell me how many MVImageNet images were used to train AnyDoor?

Their website says: "MVImgNet contains 6.5 million frames from 219,188 videos, the total size is about 3.4 TB." So I am wondering whether you used the full 3.4 TB dataset to train AnyDoor and achieve the reported performance.

Jun 12 '24 15:06 trungpx

No, we only use the subset with segmentation masks.

Jun 13 '24 09:06 XavierCHEN34

Thanks so much for your reply. Could you help clarify the point of confusion below?

In the MVImageNet paper, Table 1 lists 104,261 segmentations.

Figure 1. MVImageNet paper

In the AnyDoor paper, Table 1 lists the following:

Figure 2. AnyDoor paper

This suggests that AnyDoor used all 104,261 segmentations, corresponding to the 219,188 videos. Is that correct? Could you share an estimate of how many videos were used so that I can download the right ones? I looked at the dataset download page, and it contains many very large files, which makes downloading all of them impractical.

Figure 3. Dataset download page

Jun 13 '24 10:06 trungpx

I have the same doubts. Have you solved this problem yet?

Sep 14 '24 14:09 Neves-z

Not yet, I don't know which is the correct one to use.

Sep 14 '24 14:09 trungpx

Same problem here

Oct 01 '24 03:10 mao-code

@XavierCHEN34 Looking forward to the detailed information. Thank you very much!

Oct 02 '24 02:10 mao-code