Replicate the experiments in "Importance of Semantic Representation: Dataless Classification"

Open ZeweiChu opened this issue 6 years ago • 1 comments

I am trying to replicate the experiments in the paper "Importance of Semantic Representation: Dataless Classification" (https://cogcomp.org/page/publication_view/178). However, I cannot find the exact definition of the experiment "binary classification with Yahoo Answers dataset". Could the authors of this GitHub repository help clarify this?

ZeweiChu, Mar 19 '18 03:03

On the 2nd page of the paper, there's a description of the Yahoo Answers Dataset. The corresponding experimental setup is outlined on the 3rd page.

The dataset is available here: https://cogcomp.org/page/resource_view/89

From the paper itself:

For the Yahoo! Answers dataset, we generated 20 random binary classification problems at the subcategory level. Some of these problems are shown in Table 4.

Based on Table 4, these binary classification problems look like:

  • Health Diet Fitness vs Health Allergies
  • Sports Mexican Football Soccer vs Social Science Dream Interpretation
  • Health Injuries vs Sports Brazilian Football Soccer
  • ..
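
A minimal sketch of how such problems could be generated, assuming each problem is an unordered pair of distinct subcategories drawn uniformly at random without repeats. The paper excerpt above does not state the exact sampling procedure, so this is an assumption, and the function name `sample_binary_problems` and the `seed` parameter are hypothetical. The list below contains only the subcategories named in the Table 4 examples; the real dataset has many more, which is why the demo draws 5 problems rather than 20.

```python
import itertools
import random


def sample_binary_problems(subcategories, n_problems, seed=0):
    """Draw n_problems distinct unordered pairs of subcategories.

    Assumption: uniform sampling without replacement over all pairs.
    The paper does not confirm that this is how its 20 problems were drawn.
    """
    # Enumerate every unordered pair of distinct subcategories.
    all_pairs = list(itertools.combinations(sorted(set(subcategories)), 2))
    if n_problems > len(all_pairs):
        raise ValueError(
            f"only {len(all_pairs)} distinct pairs available, "
            f"cannot draw {n_problems}"
        )
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(all_pairs, n_problems)


# Only the subcategories that appear in the examples listed above.
SUBCATEGORIES = [
    "Health Diet Fitness",
    "Health Allergies",
    "Sports Mexican Football Soccer",
    "Social Science Dream Interpretation",
    "Health Injuries",
    "Sports Brazilian Football Soccer",
]

for left, right in sample_binary_problems(SUBCATEGORIES, n_problems=5):
    print(f"{left} vs {right}")
```

With the full subcategory list from the dataset, calling the same function with `n_problems=20` would yield a set of 20 problems in the same form as the pairs above, though not necessarily the same pairs used in the paper.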

If you can be a bit more specific about your confusion/question, I can try to address it.

shatu, Mar 29 '18 17:03