PatchCore_anomaly_detection
PatchCore_anomaly_detection copied to clipboard
Thanks for your work. But it's a bit strange here in the coreset sampling part
https://github.com/hcw-00/PatchCore_anomaly_detection/blob/59b67d9f8dd7a7c9e75203ca57e01cedbb1807ba/sampling_methods/kcenter_greedy.py#L104-L120 I think the wanted logic here should be initialing a centre point by random choice (the first if-branch) and all the entire centre points are chosen by argmax the distance (the second if-branch). But for the "self.already_selected" is initialized as an empty list in line 49(not None), so we'll never get into the first if-branch. As for the centre point initializing process, the expression "np.argmax(self.min_distances)" will return 0 for "np.argmax(None)" will return 0. So, as a result, the program selects index-0-feature as algorithm initialization everytime instead of random selecting one as a common practice.
I agree with your point. I think the first if-branch should be modify into "if not self.already_selected: " for more accurate to the kcenter_greedy algorithm. Have you modify the code and test the difference ?
Yeah I modifyed the code like the following:
for _ in range(N):
if not self.already_selected:
# Initialize centers with a randomly selected datapoint
ind = np.random.choice(np.arange(self.n_obs))
else:
ind = np.argmax(self.min_distances)
assert ind not in self.already_selected
self.update_distances([ind], only_new=True, reset_dist=False)
new_batch.append(ind)
print("Maximum distance from cluster centers is %0.2f" % max(self.min_distances))
self.already_selected.append(ind)
I didn't test the difference with thorough experiments. I conjectured that there will not be significant difference between the two implementation but our modification is more up to standard and more consistent with the intention of the algorithm
This point makes total sense. I also modified the code but haven't thoroughly tested.