Multiclass Classifier Consumes Large Memory
Describe the bug
After training a multiclass classifier, VW produced a model file of 78 MB. When the model is later loaded for testing and prediction, it consumes around 1 GB of memory.
The high memory usage occurs with both the Python bindings and the VW daemon. It happens only with multiclass models, even when the number of classes is just 2; binary scalar models do not exhibit this behaviour.
How to reproduce
The following command line was used:
vw -d train.vw -f model.model -c --holdout_after 671358 --oaa 2 --probabilities --sgd -b 28 --decay_learning_rate 0.960291948391061 -l 0.058964302633529454 --l1 7.3193537905481405e-06 --l2 1.472820769966616e-07 --loss_function logistic --passes 60 --power_t 0.013904659435534318 --random_seed 17
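A minimal way to observe the footprint from Python (a sketch: `model.model` is the file produced above, and the input line is a placeholder, not real data). Note that `-b 28` asks VW for a table of 2^28 weights, which is plausibly allocated densely at load time, so the resident size can dwarf the on-disk model, which stores only non-zero weights:

```python
import resource
from vowpalwabbit import Workspace

# Load the saved multiclass model in test-only mode (file name from the
# report above; the example line below is hypothetical).
vw = Workspace("-i model.model --testonly --quiet")
print(vw.predict("| feature_a:1 feature_b:0.5"))

# ru_maxrss is the process's peak resident set size, in kB on Linux.
print("peak RSS:", resource.getrusage(resource.RUSAGE_SELF).ru_maxrss, "kB")
```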
Number of examples: 600K
Version
9.8.0
OS
Linux
Language
Python, CLI
Additional context
No response
Can you quantify how much memory the model is using as a function of the number of classes?
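One way to measure this (a sketch under assumptions: tiny synthetic data, hypothetical file names, and a child process per measurement so each `ru_maxrss` reading is isolated from the others):

```python
import subprocess
import sys
from vowpalwabbit import Workspace

# Child process: load one model and report its own peak RSS (kB on Linux).
CHILD = (
    "import resource, sys\n"
    "from vowpalwabbit import Workspace\n"
    "vw = Workspace('-i ' + sys.argv[1] + ' --quiet')\n"
    "print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)\n"
)

for k in (2, 4, 8, 16):
    model = f"oaa_{k}.model"  # hypothetical file name
    vw = Workspace(f"--oaa {k} -b 28 -f {model} --quiet")
    for i in range(2000):  # tiny synthetic training set
        vw.learn(f"{i % k + 1} | f{i % 50}:1")
    vw.finish()  # flush and write the model file

    rss = subprocess.run(
        [sys.executable, "-c", CHILD, model],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    print(f"--oaa {k}: peak RSS {rss} kB")
```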
If only a small number of the parameters are non-zero, you can use the sparse representation (--sparse_weights).
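For example, repeating the measurement above with the sparse representation (a sketch; it assumes `--sparse_weights` can be combined with `-i` when loading the model from the report):

```python
import resource
from vowpalwabbit import Workspace

# Load the existing model into the sparse weight representation instead of
# a dense 2^28-entry table; only non-zero weights should then take space.
vw = Workspace("-i model.model --sparse_weights --testonly --quiet")
print(vw.predict("| feature_a:1 feature_b:0.5"))  # placeholder input line
print("peak RSS:", resource.getrusage(resource.RUSAGE_SELF).ru_maxrss, "kB")
```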
Closing for now, but reopen if you want to pursue.