
Unstable results with LRCos

Open ekaf opened this issue 5 years ago • 2 comments

Re-running benchmarks of LRCos produces differences of 1 or 2 correct answers out of the 50 questions in some individual BATS tests, so the benchmark accuracy varies by up to 4 percentage points on the same data, which is a concern for the reproducibility of experiments. The differences tend to average out across test categories, so the overall difference is smaller, though still problematic.

It seems plausible that the randomization used in the LogisticRegression from sklearn.linear_model could cause this problem. But seeding the random number generator with np.random.seed(1), random.seed(1), or calling LogisticRegression with random_state=1 does not help.

ekaf avatar Jun 24 '19 09:06 ekaf

This is most likely because of the random choice of negative samples for the classifier. I am not sure why seeding the random number generator does not fix it, but I would not call that a "fix" anyway; it would only hide the fact that there is some inherent randomness.

A good solution would be to have a deterministic algorithm for choosing negative samples, ideally one that yields better results than random choice.
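As a rough illustration of what deterministic negative sampling could look like, here is a minimal sketch. The function name deterministic_negatives and its parameters are hypothetical and not part of vecto; ranking candidates by a SHA-256 digest is just one arbitrary stable ordering, and nothing here suggests it yields better results than random choice.

```python
import hashlib


def deterministic_negatives(vocabulary, positives, n):
    """Pick n negative samples without using any random number generator.

    Hypothetical helper, not vecto's API: candidates are ranked by a stable
    digest of the word, so the result depends only on the set of words and
    not on their input order, the seed, or the interpreter run.
    """
    positives = set(positives)
    # Deduplicate and drop the positive examples.
    candidates = {w for w in vocabulary if w not in positives}
    # Sort by digest: looks arbitrary, but is identical on every run.
    ranked = sorted(
        candidates,
        key=lambda w: hashlib.sha256(w.encode("utf-8")).hexdigest(),
    )
    return ranked[:n]


vocabulary = ["king", "queen", "man", "woman", "apple", "pear", "car"]
positives = ["king", "queen"]
negatives = deterministic_negatives(vocabulary, positives, 3)
print(negatives)
```

Because the ordering is derived from the words themselves, two embeddings with the same vocabulary would get the same negatives, which avoids one seed being lucky for one embedding and unlucky for another.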

undertherain avatar Jun 24 '19 10:06 undertherain

Yes, a random.seed(1) call at the top of solvers.py stabilizes LRCos, but it does not guarantee fair comparisons, since any arbitrary choice of seed can be lucky for one embedding and unlucky for another. Deterministic negative samples would certainly be fairer, and while waiting for a better solution, I am considering simply dropping the random noise in gen_vec_single. This makes comparisons both fair and reproducible, though it hurts overall performance a little.

The reason seeding the random number generator did not work for me before was that I use set.intersection() for vocabulary filtering, which is much quicker than the for loop in vocabulary._populate_from_source_and_wordlist(self, source, wordlist). The problem was that the iteration order of sets is not deterministic across runs, so the word order changed each time. Sorting lst_words made the order deterministic again: self.lst_words=sorted(list(set(source.lst_words).intersection(set(wordlist))))
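The same idea as a standalone sketch, outside the vocabulary class. The function name filter_vocabulary and the sample words are made up for illustration; only the sorted set intersection comes from the line above.

```python
def filter_vocabulary(source_words, wordlist):
    """Return the words present in both inputs, in a reproducible order."""
    # Set intersection is fast, but iterating over a set of strings can give
    # a different order in each interpreter run (string hashing is randomized
    # unless PYTHONHASHSEED is fixed), so the result is sorted before use.
    return sorted(set(source_words).intersection(wordlist))


source_words = ["king", "queen", "man", "woman", "apple"]
wordlist = ["woman", "king", "pear", "man"]
filtered = filter_vocabulary(source_words, wordlist)
print(filtered)  # ['king', 'man', 'woman']
```

With the word order fixed, a seeded random number generator draws the same samples on every run.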

ekaf avatar Jun 25 '19 06:06 ekaf