
ValueError: It seems this dataset only containes empty input spheres

Jaychouxq opened this issue 2 years ago • 17 comments

After training for 40 epochs, this error occurs: ValueError: It seems this dataset only containes empty input spheres

Jaychouxq avatar May 11 '22 16:05 Jaychouxq

This error is raised here: https://github.com/HuguesTHOMAS/KPConv-PyTorch/blob/3a774ff8d54a4d080fe65093b2299ede35d9735d/datasets/S3DIS.py#L344-L348

It is only raised if the code encounters an input sphere with fewer than 2 points in it (so with only 1 or 0 points) more than 100 times.

If I were you, I would remove isolated points from your dataset before using it in the network, to avoid this error. If you really don't want to clean your data, you can raise the number of "failed attempts" that triggers the error.
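
For reference, the safe-check follows the pattern visible in the Toronto3D traceback quoted at the end of this thread; a minimal sketch (the `100` multiplier is the "failed attempts" threshold you would raise):

```python
# Sketch of the safe-check around datasets/S3DIS.py L344-348 (pattern taken
# from the Toronto3D traceback below; variable names may differ slightly).
failed_attempts += 1
if failed_attempts > 100 * self.config.batch_num:  # raise the 100 to tolerate more empty spheres
    raise ValueError('It seems this dataset only containes empty input spheres')
```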

HuguesTHOMAS avatar May 11 '22 18:05 HuguesTHOMAS

Thank you. I know where the error occurred, but I don't know why it happened. Could you please explain what "clean your dataset from isolated points" means? I don't understand.

Jaychouxq avatar May 11 '22 18:05 Jaychouxq

To create a batch, the loop goes like this:

Repeat until the total number of points in the batch > batch_limit:
1. sample a point from the dataset 
2. find all the points in a sphere of radius `in_radius` (these are the input points)
3. compute everything we need with it (augmentation, etc.)
4. add this input sphere to the list of the batch point clouds

If the sphere contains only 1 point or even zero points, a safe-check (between steps 2 and 3) ensures that we do not add it to the batch, as it is a useless example. However, I decided that if this happens too often, the code raises an error so that a correction can be made: it is not normal to encounter more than a hundred empty spheres before even being able to fill one batch.

The reason you encounter so many empty spheres is surely that your dataset contains many points that are isolated from the others (farther than `in_radius` from any other point). Maybe you chose `in_radius` too small? Or your data has too many outliers that should be cleaned.
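
A minimal Python sketch of this loop, for illustration only (`sample_center`, `radius_search`, and `augment` are placeholder names, not the actual KPConv-PyTorch functions):

```python
batch_spheres = []
failed_attempts = 0
while sum(len(s) for s in batch_spheres) < batch_limit:
    center = sample_center(dataset)                      # 1. sample a point from the dataset
    sphere = radius_search(dataset, center, in_radius)   # 2. all points within in_radius of it
    if len(sphere) < 2:                                  # safe-check: 0 or 1 point is a useless example
        failed_attempts += 1
        if failed_attempts > 100 * batch_num:
            raise ValueError('It seems this dataset only containes empty input spheres')
        continue
    sphere = augment(sphere)                             # 3. augmentation, etc.
    batch_spheres.append(sphere)                         # 4. add the sphere to the batch
```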

HuguesTHOMAS avatar May 11 '22 19:05 HuguesTHOMAS

OK, I understand, thank you very much. But creating a batch sounds complicated; maybe I'll try raising the number of "failed attempts" that triggers the error.

Jaychouxq avatar May 12 '22 02:05 Jaychouxq

This error is raised here:

https://github.com/HuguesTHOMAS/KPConv-PyTorch/blob/3a774ff8d54a4d080fe65093b2299ede35d9735d/datasets/S3DIS.py#L344-L348

It is only raised if the code encounters an input sphere with fewer than 2 points in it (so with only 1 or 0 points) more than 100 times.

If I were you, I would remove isolated points from your dataset before using it in the network, to avoid this error. If you really don't want to clean your data, you can raise the number of "failed attempts" that triggers the error.

I raised the number of "failed attempts" that triggers the error to 1000, but it still only ran 30 epochs. I want to know where the spheres come from: the original_ply folder or the input_0.040 folder? Why do other people not get this error when using the same dataset? I still don't know how to clean my dataset of isolated points. Do I need to program this myself?
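
(For reference, a small script is indeed needed; below is a minimal sketch of one way to drop isolated points using scipy, which is not part of KPConv-PyTorch. It assumes `points` is an (N, 3) array and `in_radius` is the input sphere radius:)

```python
import numpy as np
from scipy.spatial import cKDTree

def remove_isolated_points(points, in_radius):
    """Keep only points that have at least one neighbor within in_radius."""
    tree = cKDTree(points)
    # Number of neighbors within in_radius for every point; each point counts itself once.
    counts = tree.query_ball_point(points, r=in_radius, return_length=True)
    return points[counts > 1]

# Hypothetical usage: pts is an (N, 3) float array loaded from your point cloud.
# cleaned = remove_isolated_points(pts, in_radius=1.5)
```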

Jaychouxq avatar May 12 '22 10:05 Jaychouxq

Are you using S3DIS or your own dataset?

HuguesTHOMAS avatar May 12 '22 11:05 HuguesTHOMAS

Are you using S3DIS or your own dataset?

S3DIS

Jaychouxq avatar May 12 '22 11:05 Jaychouxq

Oh ok, I did not understand. I thought you were using your own dataset. In that case, there might be a problem with the dataset.

I would suggest trying to delete the whole dataset and download it again from scratch... Make sure you use the Stanford3dDataset_v1.2.zip file and not the other one.

HuguesTHOMAS avatar May 12 '22 11:05 HuguesTHOMAS

Oh ok, I did not understand. I thought you were using your own dataset. In that case, there might be a problem with the dataset.

I would suggest trying to delete the whole dataset and download it again from scratch... Make sure you use the Stanford3dDataset_v1.2.zip file and not the other one.

OK, thank you, I'll try it.

Jaychouxq avatar May 12 '22 11:05 Jaychouxq

Oh ok, I did not understand. I thought you were using your own dataset. In that case, there might be a problem with the dataset.

I would suggest trying to delete the whole dataset and download it again from scratch... Make sure you use the Stanford3dDataset_v1.2.zip file and not the other one.

I have downloaded Stanford3dDataset_v1.2.zip again, but the error still exists. I think it's really strange.

Jaychouxq avatar May 17 '22 15:05 Jaychouxq

Could you share the .ply files in your input_0.040 folder? It is where the spheres come from:

  • Area_1.ply
  • Area_2.ply
  • Area_3.ply
  • Area_4.ply
  • Area_5.ply
  • Area_6.ply

And also the files:

  • Area_1_coarse_KDTree.pkl
  • Area_2_coarse_KDTree.pkl
  • Area_3_coarse_KDTree.pkl
  • Area_4_coarse_KDTree.pkl
  • Area_5_coarse_KDTree.pkl
  • Area_6_coarse_KDTree.pkl

These are the files used to sample the sphere centers.

I could help to see if there is a problem with them.
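
(A quick way to inspect those files yourself — a minimal sketch, assuming the pickled object is a scikit-learn KDTree as in the S3DIS pipeline, and that `in_radius` is the 1.5 m S3DIS default; both are assumptions to adjust:)

```python
import pickle
import numpy as np

# Hypothetical path; point it at your own dataset folder.
with open('input_0.040/Area_1_coarse_KDTree.pkl', 'rb') as f:
    tree = pickle.load(f)  # pickled KDTree used to sample sphere centers

pts = np.asarray(tree.data)
# Count candidate centers with no neighbor within in_radius (1.5 assumed here).
counts = tree.query_radius(pts, r=1.5, count_only=True)
print(int((counts <= 1).sum()), 'isolated centers out of', len(pts))
```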

HuguesTHOMAS avatar May 18 '22 11:05 HuguesTHOMAS

Could you share the .ply files in your input_0.040 folder? It is where the spheres come from:

  • Area_1.ply
  • Area_2.ply
  • Area_3.ply
  • Area_4.ply
  • Area_5.ply
  • Area_6.ply

And also the files:

  • Area_1_coarse_KDTree.pkl
  • Area_2_coarse_KDTree.pkl
  • Area_3_coarse_KDTree.pkl
  • Area_4_coarse_KDTree.pkl
  • Area_5_coarse_KDTree.pkl
  • Area_6_coarse_KDTree.pkl

These are the files used to sample the sphere centers.

I could help to see if there is a problem with them.

I have uploaded the input_0.030 folder to this link: https://drive.google.com/drive/folders/1KEk7zCXBXl7_RiJLMBPZ5aeMkmv5eq6Y?usp=sharing. I found that Area_5 has a _proj.pkl file but the other areas do not. Why is that?

Jaychouxq avatar May 18 '22 13:05 Jaychouxq

The _proj.pkl file is for testing purposes, and Area_5 is the test area.

I honestly have no idea why your code is doing this. At this point, I see only two solutions:

  1. Erase everything and start over by cloning the GitHub repo again from scratch.
  2. Just comment out the lines https://github.com/HuguesTHOMAS/KPConv-PyTorch/blob/3a774ff8d54a4d080fe65093b2299ede35d9735d/datasets/S3DIS.py#L344-L351 (but that would only be hiding the problem).

HuguesTHOMAS avatar May 18 '22 15:05 HuguesTHOMAS

OK, thanks a lot. I will try.

Jaychouxq avatar May 19 '22 02:05 Jaychouxq

I still don't know the cause of this problem, so I switched to Linux and it succeeded.

Jaychouxq avatar May 31 '22 07:05 Jaychouxq

This is very strange; maybe some weird stuff was happening because you were not on Linux. But I am glad you succeeded with Linux.

HuguesTHOMAS avatar May 31 '22 17:05 HuguesTHOMAS

I also ran into this problem when using Toronto3D. @HuguesTHOMAS, should I also try what you mentioned above?

```
File ~/KPConv-PyTorch/datasets/Toronto3D.py:354, in Toronto3DDataset.potential_item(self, batch_i, debug_workers)
    352 failed_attempts += 1
    353 if failed_attempts > 100 * self.config.batch_num:
--> 354     raise ValueError('It seems this dataset only containes empty input spheres')
    355 t += [time.time()]
    356 t += [time.time()]

ValueError: It seems this dataset only containes empty input spheres
```

YdeZ030 avatar Oct 25 '23 17:10 YdeZ030