
How to train with semi-supervised learning

Open chiyuzhang94 opened this issue 2 years ago • 7 comments

Hi @yueyu1030 ,

Thanks for your great work.

I wonder how to use your code for semi-supervised learning. I want to provide a small amount of gold data and a large amount of unlabelled data and use COSINE for semi-supervised learning. Which training mode should I use, clean or selftrain? And how should I set up the training data?

Thanks. Chiyu

chiyuzhang94 avatar Dec 22 '21 01:12 chiyuzhang94

Hi Chiyu,

To adapt COSINE for semi-supervised learning, I suggest you first train your model with clean labels (the same as the 'init' step in the paper), then use self-training to bootstrap over all data (labeled + unlabeled).
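A toy sketch of this two-stage recipe, for readers unfamiliar with self-training: stage 1 fits on the gold labels only (the 'init' step), and each self-training round pseudo-labels the unlabeled pool with the current model and refits on gold plus pseudo-labeled data. The 1-D nearest-centroid "model" and the data are stand-ins for illustration, not part of the COSINE codebase.

```python
def fit_centroids(points, labels):
    """Stage 1 ('init'): fit one centroid per class on clean labels."""
    cents = {}
    for y in set(labels):
        xs = [x for x, l in zip(points, labels) if l == y]
        cents[y] = sum(xs) / len(xs)
    return cents

def predict(cents, x):
    # nearest centroid wins
    return min(cents, key=lambda y: abs(x - cents[y]))

def self_train(gold_x, gold_y, unlabeled_x, rounds=3):
    cents = fit_centroids(gold_x, gold_y)            # init on gold data
    for _ in range(rounds):
        # pseudo-label the unlabeled pool with the current model
        pseudo_y = [predict(cents, x) for x in unlabeled_x]
        # bootstrap: refit on gold + pseudo-labeled data together
        cents = fit_centroids(gold_x + unlabeled_x, gold_y + pseudo_y)
    return cents

gold_x, gold_y = [0.0, 1.0], [0, 1]
unlabeled_x = [0.1, 0.2, 0.9, 0.8]
model = self_train(gold_x, gold_y, unlabeled_x)
print(predict(model, 0.15), predict(model, 0.85))  # -> 0 1
```

In COSINE itself the "model" is a fine-tuned transformer and the pseudo-labels are filtered by confidence, but the control flow is the same.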

Best, Yue

yueyu1030 avatar Dec 23 '21 07:12 yueyu1030

Hi @yueyu1030 ,

Thanks for your reply. After training on the clean data, do I need to apply the model to the unlabelled data, or save its predictions to another file? I noticed that your unlabelled samples contain labels.

Best, Chiyu

chiyuzhang94 avatar Dec 23 '21 08:12 chiyuzhang94

Hi Chiyu,

Yes, you need to save the model pre-trained on the clean data, which serves as the initialization for self-training.

For unlabeled data, here we keep the labels to make the format of labeled & unlabeled data the same, but we never use them in training.
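A sketch of what "keeping the labels" can look like in practice: give every unlabeled example a placeholder label (here 0) so it parses with the same loader as the labeled split, while the trainer simply never reads that column during self-training. The two-column TSV layout below is illustrative, not necessarily COSINE's exact file format.

```python
import csv, io

labeled = [("great movie", 1), ("terrible plot", 0)]
unlabeled_texts = ["saw it twice", "fell asleep"]

buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t")
for text, y in labeled:
    writer.writerow([text, y])
for text in unlabeled_texts:
    writer.writerow([text, 0])  # dummy label: kept for format, ignored in training

# both splits now parse with one loader
rows = list(csv.reader(io.StringIO(buf.getvalue()), delimiter="\t"))
print(rows)  # every row has the same two-column shape
```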

Best, Yue

yueyu1030 avatar Dec 27 '21 03:12 yueyu1030

Thanks, @yueyu1030 !

I have another question. Could you tell me the hyperparameters used for semi-supervised learning: learning rate, batch size, and the number of training epochs/steps? I could not find them in the paper.

Thanks!

chiyuzhang94 avatar Dec 28 '21 20:12 chiyuzhang94

I also wonder about the data loader when the training mode is "clean".

In the trainer script, I think we should use only the gold data when the training mode is "clean", but this line combines the gold data and the unlabelled data. Could you check this?

Thanks. Chiyu

chiyuzhang94 avatar Dec 29 '21 02:12 chiyuzhang94

Hi,

For the parameter setting: since COSINE is not mainly designed for semi-supervised learning, I haven't tuned the hyperparameters carefully. Overall, I believe the learning rate can be searched over {1e-5, 2e-5, 5e-5} for fine-tuning and {1e-6, 5e-6, 1e-5} for self-training; the batch size for labeled data can be {4, 8, 16}; and the number of epochs could be around 10 for initialization, with {1000, 2000, 3000} steps for self-training.
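For anyone running this search, the suggested ranges enumerate to a small grid. The values below are exactly the ones from this thread; the key names and anything downstream (trainer, metric) are hypothetical.

```python
from itertools import product

# search space suggested in this thread (values verbatim; names illustrative)
space = {
    "lr_finetune": [1e-5, 2e-5, 5e-5],
    "lr_selftrain": [1e-6, 5e-6, 1e-5],
    "batch_size": [4, 8, 16],
    "selftrain_steps": [1000, 2000, 3000],
}

# full grid: one dict per candidate configuration
configs = [dict(zip(space, vals)) for vals in product(*space.values())]
print(len(configs))  # 81 candidate configurations
```

In practice you would evaluate each config on a held-out dev split rather than exhausting all 81 combinations.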

For training with clean labeled data, we need to use both the labeled and the unlabeled data, since this mode is mainly used for estimating the model's performance with fully clean data. So we need to combine the gold data and the unlabelled data.
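In other words, "clean" mode treats the labels kept on the unlabeled split as if they were gold, to get a fully supervised reference point. A minimal sketch of that loader behavior, with illustrative names rather than the actual trainer code:

```python
def build_clean_mode_dataset(gold, unlabeled):
    # "clean" mode: the labels stored with the unlabeled split are trusted
    # as gold, so the two splits are concatenated into one supervised pool,
    # giving an upper-bound estimate with fully clean data
    return gold + unlabeled

gold = [("good", 1), ("bad", 0)]
unlabeled = [("fine", 1), ("awful", 0)]  # labels kept in the file for this mode
train = build_clean_mode_dataset(gold, unlabeled)
print(len(train))  # 4 examples, all treated as labeled
```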

I am traveling, so my responses might be a little slow. Sorry for that, and feel free to ask me other questions about the experiments.

Best, Yue

yueyu1030 avatar Dec 30 '21 08:12 yueyu1030

Hi,

Thanks. Hope you have a nice trip.

I still don't understand why you combine the gold data and the unlabelled data when training in clean mode. I thought you would first train a model on the gold data and evaluate on the gold data, and then do self-training using both the gold data and the unlabelled data.

Another question: when the training mode is "selftrain", I see that your code already has two steps (initial training and self-training). I guess I don't need a separate step to train an initial model on the gold data, and can just pass the gold data as training data and train in "selftrain" mode. What do you think?

Best, Chiyu

chiyuzhang94 avatar Dec 31 '21 00:12 chiyuzhang94