Conceptual error: Independent dictionaries are used in eval_trec.py for Train and Test data

Open niravbhan opened this issue 6 years ago • 0 comments

There is a logical error in eval_trec.py, which evaluates skip-thoughts on classifying questions into one of 6 question types. The program uses a label-to-number mapping for classification, but these mappings (i.e. dictionaries) are generated fully independently for training and for testing. To be precise, the lines of code below are executed for both the training and the test data:


    d = {}
    count = 0
    setlabels = set(labels)
    for w in setlabels:
        d[w] = count
        count += 1
    idxlabels = np.array([d[w] for w in labels])

This means:

  1. If the test set has a different set of labels than the training set, because one or more labels are not present, the two dictionaries will almost certainly assign different numbers to the same label, and the evaluation fails. This can even lead to 0% classification accuracy.
  2. If the test set has the full set of labels, the program happens to work, but only by coincidence: in practice, Python iterates over two sets holding the same elements in the same order within a single run, which makes the training and test dictionaries coincide. However, the 'set' data structure does not guarantee any ordering, so it is poor programming practice to rely on this.
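A minimal sketch of the first case, assuming made-up label lists (the six strings are only illustrative) and a hypothetical helper `build_mapping` that reproduces the logic of the snippet above. The two dictionaries are built independently, so nothing ties a label's index in one to its index in the other:

```python
def build_mapping(labels):
    """Hypothetical helper: same logic as the snippet in eval_trec.py."""
    d = {}
    count = 0
    for w in set(labels):  # iteration order of a set is not guaranteed
        d[w] = count
        count += 1
    return d


# Illustrative data: the test split happens to contain no "ABBR" questions.
train_labels = ["ABBR", "DESC", "ENTY", "HUM", "LOC", "NUM"]
test_labels = ["DESC", "ENTY", "HUM", "LOC", "NUM"]

train_map = build_mapping(train_labels)  # indices 0..5
test_map = build_mapping(test_labels)    # indices 0..4, built from scratch

# Labels whose index differs between the two independently built mappings.
# Which labels end up here can vary from run to run.
mismatched = [w for w in test_map if test_map[w] != train_map[w]]
print("train mapping:", train_map)
print("test mapping: ", test_map)
print("labels with conflicting indices:", mismatched)
```

Any label listed as conflicting would be scored against the wrong class, since the classifier's predictions use the training indices while the gold labels use the test indices.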

FIX: The dictionary should be learnt only once, during training, and re-used for testing. The dictionary ought to be treated as a learned parameter, along with the Logistic Regression coefficients.
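One way this fix could look, as a sketch rather than a patch against the actual file: the helper names `fit_label_mapping` and `encode_labels` and the label lists are assumptions made for illustration. The mapping is built once from the training labels (sorted, so it does not depend on set iteration order) and then reused unchanged for the test labels:

```python
def fit_label_mapping(train_labels):
    """Learn the label-to-index dictionary from the training labels only.

    Sorting the unique labels makes the mapping reproducible instead of
    relying on set iteration order.
    """
    return {w: i for i, w in enumerate(sorted(set(train_labels)))}


def encode_labels(labels, mapping):
    """Convert labels to indices using an already learned mapping."""
    unknown = sorted(set(labels) - set(mapping))
    if unknown:
        # A label never seen in training cannot be scored meaningfully.
        raise ValueError("labels not seen during training: %s" % unknown)
    # eval_trec.py wraps this list in np.array; a plain list is used here
    # to keep the sketch dependency-free.
    return [mapping[w] for w in labels]


# Illustrative data: the test split has no "ABBR" questions.
train_labels = ["HUM", "LOC", "ABBR", "DESC", "NUM", "ENTY", "HUM"]
test_labels = ["NUM", "DESC", "HUM"]

mapping = fit_label_mapping(train_labels)          # learned once
train_idx = encode_labels(train_labels, mapping)
test_idx = encode_labels(test_labels, mapping)     # reused, not rebuilt

print(mapping)   # {'ABBR': 0, 'DESC': 1, 'ENTY': 2, 'HUM': 3, 'LOC': 4, 'NUM': 5}
print(test_idx)  # [5, 1, 3]
```

Because the test labels are encoded with the training dictionary, a missing label in the test set no longer shifts the indices of the remaining ones, and the mapping can be stored alongside the classifier like any other learned parameter.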

niravbhan, Jun 25 '18 23:06