PatchCore_anomaly_detection icon indicating copy to clipboard operation
PatchCore_anomaly_detection copied to clipboard

Thanks for your work. But it's a bit strange here in the coreset sampling part

Open mengxianghan123 opened this issue 3 years ago • 3 comments

https://github.com/hcw-00/PatchCore_anomaly_detection/blob/59b67d9f8dd7a7c9e75203ca57e01cedbb1807ba/sampling_methods/kcenter_greedy.py#L104-L120 I think the wanted logic here should be initialing a centre point by random choice (the first if-branch) and all the entire centre points are chosen by argmax the distance (the second if-branch). But for the "self.already_selected" is initialized as an empty list in line 49(not None), so we'll never get into the first if-branch. As for the centre point initializing process, the expression "np.argmax(self.min_distances)" will return 0 for "np.argmax(None)" will return 0. So, as a result, the program selects index-0-feature as algorithm initialization everytime instead of random selecting one as a common practice.

mengxianghan123 avatar Sep 28 '21 13:09 mengxianghan123

I agree with your point. I think the first if-branch should be modify into "if not self.already_selected: " for more accurate to the kcenter_greedy algorithm. Have you modify the code and test the difference ?

JefferyChiang avatar Oct 04 '21 01:10 JefferyChiang

Yeah I modifyed the code like the following:

for _ in range(N):
            if not self.already_selected:
                # Initialize centers with a randomly selected datapoint
                ind = np.random.choice(np.arange(self.n_obs))
            else:
                ind = np.argmax(self.min_distances)
           
            assert ind not in self.already_selected

            self.update_distances([ind], only_new=True, reset_dist=False)
            new_batch.append(ind)
            print("Maximum distance from cluster centers is %0.2f" % max(self.min_distances))

            self.already_selected.append(ind)

I didn't test the difference with thorough experiments. I conjectured that there will not be significant difference between the two implementation but our modification is more up to standard and more consistent with the intention of the algorithm

mengxianghan123 avatar Oct 09 '21 01:10 mengxianghan123

This point makes total sense. I also modified the code but haven't thoroughly tested.

HoseinHashemi avatar Jan 05 '22 05:01 HoseinHashemi