
support Active Learning procedure?

Open ericxsun opened this issue 1 year ago • 2 comments

Is your feature request related to a problem? Please describe.

Is it possible to do active learning with the current master branch? Any pointers would be highly appreciated.

ericxsun avatar Jul 14 '22 14:07 ericxsun

Hi @ericxsun, thanks for asking this question. Ludwig has a mechanism for training models incrementally on batches of data. We call it train_online, and it is documented here: https://ludwig-ai.github.io/ludwig-docs/0.5/user_guide/api/LudwigModel/#train_online . The difference from the regular train is that it runs only once on the batch of data provided. This makes it useful for implementing an active learning loop, which may look like this:

from ludwig.api import LudwigModel

model = LudwigModel(...)
model.train(my_data)
for i in range(active_loop_steps):  # active_loop_steps: number of active learning rounds
  new_data = get_new_data()
  predictions, _ = model.predict(new_data)  # predict returns (predictions, output_directory)
  most_valuable_data = active_learning(new_data, predictions)  # need to implement it yourself or use an active learning library for this
  model.train_online(most_valuable_data)

You may also want to train on old data points to avoid catastrophic forgetting, and perhaps adjust the learning rate and other hyperparameters for fine-tuning when using train_online, but this is a sketch of how you can use it.
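As a rough illustration of the two pieces left open above, here is a minimal sketch in plain Python. It is not part of Ludwig: `select_most_uncertain` and `mix_with_replay` are hypothetical helper names, least-confidence sampling is just one common way to pick the "most valuable" rows, and the `replay_fraction` default is an arbitrary assumption.

```python
import random
from typing import List, Sequence


def select_most_uncertain(probabilities: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k rows whose top class probability is lowest.

    Hypothetical helper: least-confidence sampling, one of several possible
    selection strategies for active learning.
    """
    # Least-confidence score: 1 - max probability; a higher score means more uncertain.
    scores = [1.0 - max(row) for row in probabilities]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def mix_with_replay(new_rows: list, old_rows: list,
                    replay_fraction: float = 0.5, seed: int = 0) -> list:
    """Combine newly selected rows with a random sample of old rows.

    Hypothetical helper: replaying some old data alongside the new batch is
    one simple way to reduce catastrophic forgetting.
    """
    rng = random.Random(seed)
    # Number of old rows to replay, capped by how many old rows exist.
    n_replay = min(len(old_rows), int(len(new_rows) * replay_fraction))
    batch = list(new_rows) + rng.sample(list(old_rows), n_replay)
    rng.shuffle(batch)
    return batch
```

In the loop, the per-class probabilities would come from the model's predictions (the exact column name depends on your output feature), the selected indices would be used to pick rows from new_data and get them labeled, and the mixed batch would then be passed to train_online.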

Does this help?

w4nderlust avatar Jul 21 '22 20:07 w4nderlust

That's awesome. Thanks a lot. I'll try it.

ericxsun avatar Jul 22 '22 01:07 ericxsun