vowpal_wabbit
Allow the user to pass a separate file for validation (rather than using heldout parameter)
Short description
Allow the vw command line to accept a separate held-out input file during training, rather than sampling the held-out set from the training data.
How this suggestion will help you/others
Suppose you have a training data file with labeled examples ordered by time. In some cases, data points related to the same user appear across many of those examples. This makes the validation process useless during training, since there is a high chance that samples from the same user also "fall" into the held-out data. By allowing the user to specify an external file for the holdout process, the user would have much more control over the training process.
@JohnLangford correct me if I'm wrong, but I believe the standard way to achieve this would be to:
- Train and save a model with your input file using --holdout_off and --final_regressor out.vw
- Load this model and process your test set in test-only mode using --initial_regressor out.vw --testonly
Thanks for the answer.
The point is that I would like to see the performance achieved during training (in a multipass setting) rather than only at the end of training. It seems that the --save_per_pass option lets me achieve what I need, since I can evaluate the model on the test set after each pass.
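A rough sketch of that per-pass evaluation, again only building command lines. It assumes that --save_per_pass writes one snapshot per pass named after the final regressor with the zero-based pass index appended (out.vw.0, out.vw.1, ...); that naming, the file names, and the function names are assumptions to check against your vw version.

```python
def multipass_train_command(train_file, model_file, passes):
    """Build a multipass vw training call that saves a model snapshot after every pass."""
    return [
        "vw", "-d", train_file,
        "-c",                      # multiple passes need a cache file
        "--passes", str(passes),
        "--holdout_off",
        "--save_per_pass",
        "--final_regressor", model_file,
    ]


def per_pass_test_commands(test_file, model_file, passes):
    """Build one test-only vw call per snapshot.

    Assumption: snapshots are named '<model_file>.<pass index>', starting at 0.
    """
    return [
        ["vw", "-d", test_file, "--initial_regressor", f"{model_file}.{p}", "--testonly"]
        for p in range(passes)
    ]


if __name__ == "__main__":
    # train.vw and test.vw are placeholder file names.
    print(" ".join(multipass_train_command("train.vw", "out.vw", 3)))
    for cmd in per_pass_test_commands("test.vw", "out.vw", 3):
        print(" ".join(cmd))
```

Comparing the average loss that vw reports for each test-only run gives a per-pass validation curve on the external file.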
Closing this as it is currently not on our roadmap. Feel free to reopen.