Exercise about sufficient amount of train data (learning curves)

Open · juhoinkinen opened this issue 3 years ago · 1 comment

A common question in the tutorial sessions has been "How many documents do I need for training a model?" We could have an optional exercise that shows how increasing the `--docs-limit` value when training a model affects the model's evaluation results. A simple way to plot the results as a learning curve would also be nice.
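One possible shape for such an exercise, as a rough sketch only: generate a series of increasing `--docs-limit` values, build one `annif train` command per value, and draw the evaluation scores as a plain-text learning curve. The helper names (`docs_limit_schedule`, `train_command`, `text_learning_curve`) are hypothetical and not part of Annif, the project ID and corpus path are placeholders, and the scores have to be collected by hand from each evaluation run.

```python
import shlex


def docs_limit_schedule(start, maximum, factor=2):
    """Return increasing --docs-limit values: start, start*factor, ..., maximum."""
    if start < 1 or maximum < start or factor < 2:
        raise ValueError("need 1 <= start <= maximum and factor >= 2")
    limits = []
    value = start
    while value < maximum:
        limits.append(value)
        value *= factor
    limits.append(maximum)  # always finish with the full training set size
    return limits


def train_command(project_id, train_path, docs_limit):
    """Build (but do not run) a training command line for one docs-limit value.

    Assumption: the project is trained with `annif train <project> <corpus>`;
    project_id and train_path are placeholders supplied by the caller.
    """
    return shlex.join(
        ["annif", "train", project_id, train_path, "--docs-limit", str(docs_limit)]
    )


def text_learning_curve(results, width=40):
    """Render (docs_limit, score) pairs as a text bar chart, one row per run.

    Scores are assumed to lie between 0 and 1 (e.g. an F1-type metric) and
    must come from real evaluation runs; nothing is computed here.
    """
    rows = []
    for limit, score in sorted(results):
        bar = "#" * round(score * width)
        rows.append(f"{limit:>8} | {bar} {score:.3f}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Print the commands for a hypothetical project and corpus path.
    for limit in docs_limit_schedule(100, 1000):
        print(train_command("my-project", "path/to/train-corpus", limit))
```

After each training run, the model would be evaluated (for example with `annif eval`) and the chosen metric noted next to the `--docs-limit` value used; passing those pairs to `text_learning_curve` then gives a quick view of whether the curve is still rising or has flattened out.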

juhoinkinen, Jul 06 '22 09:07

As a first step I added an extra section to the MLLM exercise: https://github.com/NatLibFi/Annif-tutorial/blob/master/exercises/05_mllm_project.md#extra-experiment-with-different-amounts-of-training-data

osma, Aug 23 '22 13:08