tangent_conv
Test on unclassified data sets
Hello,
Once a model has been trained, is there a way to apply it to unclassified data? As far as I can tell, the configuration file does not differentiate the validation set (which needs to have a valid labeling) from the test set (which, in real-life applications, could be an unclassified set). I have tried to include test sets with all labels set to 0 (unclassified), but in that case precomputing the validation batches does not work.
Did I miss something or is it simply not possible, with the current framework, to classify data sets with unknown labeling?
Hi,
In the current version of the framework there is no 'proper' way to test on unlabeled data, but I think the solution you describe should work. Could you please specify exactly what error you get when you set all labels to 0? I can also suggest setting them to 1 instead, because by default points with label 0 correspond to the background class and may be ignored.
I don't really get any error when I set all labels to 0; the software just gets stuck for a very long time in the function precompute_validation_batches(). If I understand correctly, at some point there is a search for a random point, and the random point is discarded if its label is 0. So with all labels set to 0, I imagine it either enters an infinite loop or a very long search that never finds anything.
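To illustrate the behaviour I suspect, here is a minimal sketch of that kind of rejection sampling. It is not the actual tangent_conv code: the names `sample_labeled_point`, `has_labeled_point`, and `max_attempts` are hypothetical, and I am only assuming that points with label 0 are skipped.

```python
import random

BACKGROUND = 0  # label treated as "unclassified" in this thread


def sample_labeled_point(labels, rng, max_attempts=None):
    """Draw random indices until one has a non-background label.

    With max_attempts=None this mimics an unbounded search, which never
    returns when every label is BACKGROUND. With a cap, it gives up and
    returns None instead.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        idx = rng.randrange(len(labels))
        if labels[idx] != BACKGROUND:
            return idx
        attempts += 1
    return None


def has_labeled_point(labels):
    """Cheap pre-check that makes the unbounded search safe to call."""
    return any(label != BACKGROUND for label in labels)


if __name__ == "__main__":
    rng = random.Random(0)
    all_zero = [BACKGROUND] * 1000
    print(has_labeled_point(all_zero))                          # False
    print(sample_labeled_point(all_zero, rng, max_attempts=100))  # None
```

If the real function works like the uncapped version, an all-zero label file would explain the hang, and a pre-check or an attempt cap would turn it into an explicit failure.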
If I set all labels to 1, am I correct that I should then not use these labels in the validation set? Otherwise the training would wrongly treat these points as ground truth for label 1. Or does it work differently?
I see. I will update the code to support proper testing.
Sure, you should not use those labels in the validation set. Using unlabeled data for validation would not make sense anyway - you need ground truth there.
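Until proper testing is supported, the workaround discussed in this thread can be sketched roughly as follows. This is only an illustration under assumptions: the helper names `placeholder_labels` and `check_split` and the scan names are hypothetical, and nothing here reflects the framework's actual file formats or configuration keys.

```python
PLACEHOLDER = 1  # any non-zero label, so the points are not skipped as background


def placeholder_labels(num_points, placeholder=PLACEHOLDER):
    """Build a dummy label list for an unlabeled scan."""
    return [placeholder] * num_points


def check_split(validation_scans, placeholder_scans):
    """Refuse a split where a scan with dummy labels is used for validation."""
    overlap = set(validation_scans) & set(placeholder_scans)
    if overlap:
        raise ValueError(
            "scans with placeholder labels used for validation: %s"
            % sorted(overlap)
        )


if __name__ == "__main__":
    labels = placeholder_labels(5)
    print(labels)  # [1, 1, 1, 1, 1]
    # Hypothetical scan names: validation uses real ground truth only.
    check_split(["scan_val"], ["scan_unlabeled"])
```

Any accuracy numbers reported for scans with placeholder labels are meaningless and should be ignored; only the predicted labels are of interest there.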