
Transfer learning

alex-dixon opened this issue 6 years ago • 1 comment

I've been trying to implement transfer learning with MaxentTagger. The basic approach I've taken so far adds a new method that takes a trainFile, the existing model's configuration, and the output path for the new model. It loads the existing model into a new MaxentTagger instance with readModelAndInit, changes the model field to the new output path, and starts training the same as trainAndSaveModel.
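A minimal sketch of what I mean, written as if the method were added inside MaxentTagger itself (so its private fields are reachable). The method name `retrainFromExisting` is mine, and the exact signatures of `readModelAndInit`, `trainAndSaveModel`, and the `TaggerConfig` constructor are assumptions that should be checked against the CoreNLP source:

```java
// Hypothetical sketch: assumes this is added to
// edu.stanford.nlp.tagger.maxent.MaxentTagger, and that readModelAndInit
// and trainAndSaveModel have roughly these shapes.
public static void retrainFromExisting(String existingModelPath,
                                       String trainFile,
                                       String newModelPath) throws Exception {
  // TaggerConfig extends Properties; start from the existing model's config
  TaggerConfig config = new TaggerConfig("-model", existingModelPath,
                                         "-trainFile", trainFile);

  // Load the existing model's weights, dictionary, and tag set
  MaxentTagger tagger = new MaxentTagger();
  tagger.readModelAndInit(config, existingModelPath, false);

  // Point the output at the new model file, then train as usual
  config.setProperty("model", newModelPath);
  trainAndSaveModel(config);
}
```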

Expecting this to be enough seems increasingly naive. Early on, I was able to output and load a model trained this way, but I ran into exceptions when running a parse. I've done a couple of passes at merging the tags and dict produced by loading the existing model with those prepared by the training run on the input dataset. If this is the correct approach, I think something similar would also need to be done for MaxentTagger.lambda, the data held by TaggerFeatures and TaggerExperiments, and a few other things. That may require access to a fair number of fields that are currently marked private.

Is there any interest from your side on adding this as a feature? Should I even think this is possible to do? Any general advice?

alex-dixon avatar Jul 13 '19 18:07 alex-dixon

It sounds like you might have better luck coding your new training method into MaxentTagger itself and recompiling. Another alternative would be recompiling with those fields and methods as package-private (the default access level) and then subclassing MaxentTagger.
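The subclassing route might look roughly like this, assuming the relevant members were recompiled as package-private and the subclass is placed in the same package. The class and method names here are illustrative, not real CoreNLP API:

```java
// Hypothetical sketch: must live in the same package as MaxentTagger for
// package-private (default) access to apply after recompilation.
package edu.stanford.nlp.tagger.maxent;

public class TransferTagger extends MaxentTagger {

  public void retrain(TaggerConfig config, String newModelPath) {
    // With package-private access, fields like lambda and the structures
    // behind TaggerFeatures/TaggerExperiments become reachable here, so
    // they can be merged or reinitialized before training resumes.
    config.setProperty("model", newModelPath);
    // ... call into the (now package-private) training machinery ...
  }
}
```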

Either way, I would say that in general we would be interested in hearing about methods for training a new tagger from an existing one.

John


AngledLuffa avatar Jul 13 '19 18:07 AngledLuffa