
Why are there false positives with transitive closure?

Open se4u opened this issue 6 years ago • 6 comments

Hi,

I am working with the wordnet dataset from this repo and I noticed something odd with the data.

Fitting the transitive closure baseline on the training data dataset/contrastive_trans.t7 also produces false positives on the test data!! My understanding is that the WordNet train/test datasets were generated by simply splitting the transitive closure of WordNet, so the transitive closure baseline should never produce any false positives.

As far as I can tell the training data does not contain any noise, so false positives should not arise from the closure of the training data. Can you suggest reasons why the transitive closure of the training data might generate false positives?
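For anyone wanting to reproduce the check, here is a minimal sketch of the transitive-closure baseline on toy data (representing pairs as (hyponym, hypernym) tuples is an assumption; the real data comes from contrastive_trans.t7):

```python
def transitive_closure(pairs):
    """Repeatedly add (a, c) whenever (a, b) and (b, c) are both present."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

# Toy hypernym edges: dog -> canine -> animal
train = {("dog", "canine"), ("canine", "animal")}
closure = transitive_closure(train)

# The baseline predicts "is-a" for exactly the pairs in the closure.
assert ("dog", "animal") in closure      # implied by transitivity
assert ("animal", "dog") not in closure  # the closure never inverts an edge
```

If the train/test split really is a split of WordNet's transitive closure, every test positive should lie in the closure of the training edges, which is why false positives from this baseline are surprising.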

se4u avatar Dec 05 '18 00:12 se4u

I'm getting the same issue; for example, contrastive_trans.t7 says that (adventure.n.01, cognition.n.01) is a valid hypernym pair which makes no sense. It also says that (ballplayer.n.01, wrongdoer.n.01) is a valid hypernym pair which I really hope isn't true because I used to be a ballplayer.

KevLuo avatar Dec 28 '18 05:12 KevLuo

In case this is helpful to anyone, a few things I noticed:

a) Lua is 1-indexed, so if you are using Python you definitely need to offset the indices by 1 to match them with the correct labels.

b) In the data, the hypo and hyper attributes contain negative samples as well, as shown by the targets field. The hypernyms field contains the actual hypo and hyper pairs, stacked together as 2 columns.
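To make point (a) concrete, a small sketch of the off-by-one conversion (the toy vocabulary and ids here are made up for illustration, not taken from the dataset):

```python
import numpy as np

# Lua tensors are 1-indexed, so ids loaded from the .t7 file must be
# shifted down by one before they index a 0-based Python list.
vocab = ["entity.n.01", "dog.n.01", "canine.n.02"]  # toy vocabulary
lua_ids = np.array([1, 3, 2])                       # as stored by a Lua script
words = [vocab[i - 1] for i in lua_ids]
assert words == ["entity.n.01", "canine.n.02", "dog.n.01"]
```

Forgetting the `- 1` shifts every label by one position, which is exactly the kind of error that produces nonsense pairs like (ballplayer.n.01, wrongdoer.n.01).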

andreasgrv avatar May 15 '19 11:05 andreasgrv

I believe that part a) of what @andreasgrv mentioned was the source of the errors I was getting. I had a brief correspondence with one of the authors of the paper and he suggested the same possibility as well.

KevLuo avatar May 15 '19 22:05 KevLuo

I also noticed the 1-based indexing in the createDatasets.lua.

@andreasgrv what do you mean by "hypo and hyper attributes have negative samples as well, as shown from the targets field. The hypernyms field contains the actual hypo and hyper but as 2 columns stacked together"? Can you please explain? /cc @ivendrov

nandana avatar May 17 '19 19:05 nandana

@nandana The code below demonstrates what I mean; it has a dependency on the torchfile package.

import torchfile


if __name__ == "__main__":

    tf = torchfile.load('dataset/contrastive_trans.t7')

    for part in ['train', 'val', 'test']:
        print('========== %s shapes ===========' % part)
        print(tf[part]['hypernyms'].shape)
        # hypo/hyper/target also contain negative samples, as explained in the paper
        print(tf[part]['hypo'].shape)
        print(tf[part]['hyper'].shape)
        print(tf[part]['target'].shape)
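Building on that, one way to keep only the true pairs is to mask on the target field (the toy arrays below stand in for tf['train']['hypo'] etc., and the assumption that a target of 1 marks a positive pair follows the usual contrastive convention; check the actual values in your copy of the file):

```python
import numpy as np

# Toy stand-ins for tf['train']['hypo'] / ['hyper'] / ['target'].
hypo = np.array([10, 11, 12, 13])
hyper = np.array([20, 21, 22, 23])
target = np.array([1, 0, 1, 0])  # assumed: 1 = true pair, 0 = negative sample

mask = target == 1
# Keep the positives only, also fixing the Lua 1-indexing at the same time.
positives = list(zip(hypo[mask] - 1, hyper[mask] - 1))
assert positives == [(9, 19), (11, 21)]
```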

andreasgrv avatar May 17 '19 19:05 andreasgrv

Got it, thanks a lot @andreasgrv! This really helped with exploring the generated datasets!

nandana avatar May 17 '19 22:05 nandana