uda releasing other text classification models, datasets & unlabeled corpus

Open Atomu2014 opened this issue 6 years ago • 4 comments

Hi, thanks for releasing the paper & code.

I have tried the IMDb text classification task, and UDA achieved quite promising improvements. Will you release your models and datasets for the other text classification tasks, especially the unlabeled corpus? That would make your work easier to follow and give it a larger impact.

Thanks

Sep 18 '19 20:09 Atomu2014

Hi, you can use the current code directly for other datasets; we used similar hyperparameters for them. You can get the supervised data for other datasets from here. The labeled examples for semi-supervised learning are chosen randomly. Here is the unsupervised data for Yelp and Amazon.
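For readers who want to reproduce the "chosen randomly" step on another dataset, here is a minimal sketch of one way it could be done. It is not taken from the UDA code: the function name `sample_labeled_subset`, the `(text, label)` tuple layout, and the fixed seed are all assumptions made for illustration, and the thread does not say whether the original sampling was balanced per class.

```python
import random


def sample_labeled_subset(examples, num_labeled, seed=0):
    """Pick `num_labeled` examples uniformly at random, without replacement.

    `examples` is assumed to be a list of (text, label) tuples; this layout
    is hypothetical and not the format used by the UDA repository.
    """
    if num_labeled > len(examples):
        raise ValueError("num_labeled exceeds the number of available examples")
    rng = random.Random(seed)  # local RNG so the split is reproducible
    labeled_idx = set(rng.sample(range(len(examples)), num_labeled))
    labeled = [ex for i, ex in enumerate(examples) if i in labeled_idx]
    remaining = [ex for i, ex in enumerate(examples) if i not in labeled_idx]
    return labeled, remaining


if __name__ == "__main__":
    # Toy data standing in for a real supervised training set.
    data = [("review %d" % i, i % 2) for i in range(100)]
    labeled, remaining = sample_labeled_subset(data, num_labeled=20, seed=1)
    print(len(labeled), len(remaining))
```

Fixing the seed keeps the labeled subset identical across runs, which makes comparisons between hyperparameter settings easier to interpret.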

Sep 19 '19 15:09 michaelpulsewidth

Thanks! That's awesome!

Sep 19 '19 18:09 Atomu2014

Thanks for your reply. One more small question: how much unlabeled data did you use for the Yelp and Amazon reviews? Based on the description in the paper, I assume you used all of the unlabeled reviews.

Thanks!

Sep 21 '19 08:09 chencjGene

Yes, we used all unlabeled reviews. Sorry for the late reply!

Sep 26 '19 04:09 michaelpulsewidth
