sacremoses

Possible to retrain/keep training an existing model?

Open petulla opened this issue 5 years ago • 5 comments

Hi

Given a loaded model, is it possible to train it with more data?

petulla avatar Jun 22 '20 21:06 petulla

Which model? Do you mean the truecasing model? Other than that, there's no real model training in sacremoses; it's mostly writing and testing regex rules =)

alvations avatar Jun 23 '20 02:06 alvations

I meant a model already trained with sacremoses. In other words, can you load an existing model and keep training it (add more rules)?

petulla avatar Jun 23 '20 13:06 petulla

May I ask which preprocessing task in sacremoses you are referring to? The truecaser?

For other tasks, there's no training involved and the rules are manually defined 😅

alvations avatar Jun 23 '20 15:06 alvations

Yep, the truecaser.

To clarify:

Let's say I load some text into sacremoses and train a truecasing model on it.

Then, two days later, I have some new text and want to update the model.

I want to keep training the existing model on the new text rather than start from scratch.

petulla avatar Jun 23 '20 15:06 petulla

P/S: I'm thinking about how to put this feature in. It's not hard, but I have to think a little about the user's usage logic =)

I'm a little busy for the next couple of days. But please keep this issue open; I'll look into it because I think it's worth a try.

alvations avatar Jul 09 '20 04:07 alvations
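
For readers wondering what "keep training" could mean for a truecaser, here is a minimal sketch of the general idea. It is not the sacremoses API and not the maintainer's planned design: the `CasingCounts` class and its methods are hypothetical names made up for illustration. It assumes a count-based truecaser, where the model is just a table of how often each surface form of a token was seen, so updating with new text amounts to adding more counts to the existing table.

```python
from collections import Counter, defaultdict


class CasingCounts:
    """Hypothetical incremental truecasing statistics (NOT the sacremoses API)."""

    def __init__(self):
        # lowercased token -> Counter of the surface forms seen for it
        self.forms = defaultdict(Counter)

    def update(self, sentences):
        """Add casing counts from new whitespace-tokenized sentences."""
        for sentence in sentences:
            tokens = sentence.split()
            # Skip the first token: sentence-initial capitalization
            # says little about a word's usual casing.
            for token in tokens[1:]:
                self.forms[token.lower()][token] += 1

    def truecase(self, sentence):
        """Replace each known token with its most frequent surface form."""
        out = []
        for token in sentence.split():
            counts = self.forms.get(token.lower())
            out.append(counts.most_common(1)[0][0] if counts else token)
        return " ".join(out)


# Initial "training" on the first batch of text.
model = CasingCounts()
model.update(["We met in paris last year .", "The paris office is small ."])

# Two days later: fold in new text without starting from scratch.
model.update([
    "She moved to Paris .",
    "I love Paris in spring .",
    "Flights to Paris are cheap .",
])

print(model.truecase("i visited paris"))
```

Because the state is only counts, a second `update` call is equivalent to having trained on both batches at once; persisting the model between sessions would just mean saving and reloading the count table.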