Scan2Cap

Some puzzles about dataset processing

Open yvfengZhong opened this issue 3 years ago • 5 comments

I once encountered a problem when preprocessing the scannetv2 dataset. I tried to solve this problem, but I'm not sure whether my solution is reasonable. I'd like to discuss it with you.

When I ran the command python batch_load_scannet_data.py, an error occurred.

p1

I read batch_load_scannet_data.py and found that, for each directory name listed in data/scannet/meta_data/scannetv2.txt, it processes the corresponding folder in data/scannet/scans/ and saves the generated results in data/scannet/scannet_data/.

p2

I don't know if my understanding is correct.

Then I read data/scannet/meta_data/scannetv2.txt and found that it lists 806 scenes, while the directory data/scannet/scans/ contains only the 706 scenes for train and val. I think this mismatch between the two is the cause of the error.
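A mismatch like this can be confirmed before running the full preprocessing. The following is only a minimal sketch: find_missing_scenes is a hypothetical helper, not part of the repository, and it assumes the list file holds one scene name per line and that each scene is a folder directly under the scans directory.

```python
import os


def find_missing_scenes(list_path, scans_dir):
    """Return scene names listed in list_path that have no folder in scans_dir.

    Hypothetical helper for illustration; assumes one scene name per line.
    """
    with open(list_path) as f:
        # Ignore blank lines and surrounding whitespace.
        listed = [line.strip() for line in f if line.strip()]
    # Only sub-directories count as available scenes.
    present = {
        name
        for name in os.listdir(scans_dir)
        if os.path.isdir(os.path.join(scans_dir, name))
    }
    # Keep the order of the list file so the report is easy to read.
    return [name for name in listed if name not in present]
```

Calling find_missing_scenes("data/scannet/meta_data/scannetv2.txt", "data/scannet/scans") should return the listed scenes that have no folder on disk. If only the test scenes come back, the list file rather than the download is the likely cause.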

So I copied all the files in data/scannet/scans_test/ into data/scannet/scans/. After that, the command python batch_load_scannet_data.py runs normally.

Is this the right approach? Looking forward to your reply.

yvfengZhong avatar Sep 14 '21 03:09 yvfengZhong

I noticed that your batch_load_scannet_data.py is modified from votenet, so I checked votenet and found that it uses scannet_train.txt.


I have looked at many other codebases that also use scannet_train.txt in batch_load_scannet_data.py, so I think using scannetv2.txt here may be a mistake.

In addition, I found that your OBJ_CLASS_IDS is different from others. Why?

yvfengZhong avatar Sep 20 '21 15:09 yvfengZhong

Had this issue as well. It can be ignored for now. The behavior might change later. OBJ_CLASS_IDS is intentionally different.

ga92xug avatar May 27 '22 13:05 ga92xug

Actually, you can put all the scans, both train-val and test, into the data directory, because the code distinguishes them automatically. For test scans, the code just samples 50k points and does not process the label info, since test scans have no manual annotations. As for the class categories, this dataset takes only part of the ScanNetv2 classes into consideration.
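To illustrate the idea, here is a rough sketch of such a branch. It is not the repository's actual code. It assumes that annotated scans can be recognised by the presence of a <scan_name>.aggregation.json file, and the helper names has_annotations, sample_indices and plan_scan are made up for this example. It uses the standard library's random module instead of NumPy to stay self-contained.

```python
import os
import random

NUM_POINTS = 50000  # the "50k points" mentioned above


def has_annotations(scan_dir, scan_name):
    # Assumption: annotated (train/val) scans ship an aggregation file
    # named <scan_name>.aggregation.json, and test scans do not.
    return os.path.isfile(os.path.join(scan_dir, scan_name + ".aggregation.json"))


def sample_indices(num_vertices, num_points=NUM_POINTS, seed=None):
    """Pick at most num_points vertex indices without replacement."""
    if num_vertices <= num_points:
        # Small scans are kept whole.
        return list(range(num_vertices))
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_vertices), num_points))


def plan_scan(scan_dir, scan_name, num_vertices):
    """Describe what a preprocessing step would do for one scan."""
    return {
        "scan": scan_name,
        # Labels are only loaded when annotation files are present.
        "load_labels": has_annotations(scan_dir, scan_name),
        "num_kept": len(sample_indices(num_vertices)),
    }
```

With this kind of check, a test scan copied into the same directory is still sampled, but its label step is skipped rather than failing on missing files.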

Coobiw avatar Dec 07 '22 17:12 Coobiw

So how should I deal with this issue? I'm quite puzzled by this problem right now.

cactusycy avatar Sep 08 '23 01:09 cactusycy

You can replace the "scannetv2.txt" with "scannetv2_trainval.txt".
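If that list file is not at hand, a possible alternative is to derive a list of only the scenes that exist on disk and point the script at it. This is just a sketch under assumptions: write_available_scene_list is a hypothetical helper, the output file name is up to you, and the list format is assumed to be one scene name per line.

```python
import os


def write_available_scene_list(full_list_path, scans_dir, out_path):
    """Write the scenes from full_list_path that have a folder in scans_dir.

    Hypothetical helper for illustration; returns the kept scene names.
    """
    with open(full_list_path) as f:
        listed = [line.strip() for line in f if line.strip()]
    # Keep only scenes whose folder is actually present.
    kept = [s for s in listed if os.path.isdir(os.path.join(scans_dir, s))]
    with open(out_path, "w") as f:
        for scene in kept:
            f.write(scene + "\n")
    return kept
```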

yvfengZhong avatar Sep 12 '23 07:09 yvfengZhong