
releasing other text classification models, datasets & unlabeled corpus

Open Atomu2014 opened this issue 4 years ago • 4 comments

Hi, thanks for releasing the paper & code.

I have tried the IMDb text classification task, and UDA achieved quite promising improvements. Will you release your models and datasets for the other text classification tasks, especially the unlabeled corpora? That would make your work easier to follow and give it a larger impact.

Thanks

Atomu2014 avatar Sep 18 '19 20:09 Atomu2014

Hi, you can use the current code directly for other datasets; we used similar hyperparameters for them. You can get the supervised data for other datasets from here. The labeled examples for semi-supervised learning are chosen randomly. Here is the unsupervised data for Yelp and Amazon.

michaelpulsewidth avatar Sep 19 '19 15:09 michaelpulsewidth
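
For readers who want to reproduce the random choice of labeled examples mentioned in the reply above, here is a minimal sketch. It is not taken from the UDA repository: the function name `sample_labeled_subset`, the `(text, label)` tuple layout, and the fixed seed are all assumptions made for illustration, and the thread does not say whether the authors sampled uniformly or per class.

```python
import random


def sample_labeled_subset(examples, num_labeled, seed=0):
    """Pick `num_labeled` examples uniformly at random, without replacement.

    `examples` is assumed to be a list of (text, label) tuples; this layout
    is hypothetical and not the format used by the UDA code.
    """
    if num_labeled > len(examples):
        raise ValueError("num_labeled exceeds the number of available examples")
    # A dedicated RNG with a fixed seed keeps the split reproducible
    # without touching the global random state.
    rng = random.Random(seed)
    return rng.sample(examples, num_labeled)


if __name__ == "__main__":
    # Toy data standing in for a real supervised training set.
    toy = [("review %d" % i, i % 2) for i in range(100)]
    labeled = sample_labeled_subset(toy, num_labeled=20)
    print(len(labeled))
```

Using the same seed returns the same subset, which makes it easier to compare runs across datasets.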

Thanks! That's awesome!

Atomu2014 avatar Sep 19 '19 18:09 Atomu2014

Thanks for your reply. One more small question: how much unlabeled data did you use for the Yelp and Amazon reviews? I assume you used all of the unlabeled reviews, based on the description in the paper.

Thanks!

chencjGene avatar Sep 21 '19 08:09 chencjGene

Yes, we used all unlabeled reviews. Sorry for the late reply!

michaelpulsewidth avatar Sep 26 '19 04:09 michaelpulsewidth