
Quantile Loss for support vector regression or at least for usual linear regression

Open Sandy4321 opened this issue 4 years ago • 16 comments

Description

A brief description of the error, missing documentation or what you would like added: it is not clear where to find an explanation or example of how to code quantile loss in Python for support vector regression, or at least for ordinary linear regression.

Link to Documentation Page

Where is the documentation in question? https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Loss-functions

Sandy4321 avatar Jun 11 '21 15:06 Sandy4321

Hi @Sandy4321 thanks for filing this issue. Could you expand a bit on what you would like to see here? Is the question about how to enable "quantile" loss when using VW in Python, or is it something else?

lokitoth avatar Jun 11 '21 17:06 lokitoth

I am asking for a description and more details on how to do quantile loss for support vector regression, or at least for ordinary linear regression,

or at least a code example for Python, please.

Sandy4321 avatar Jun 13 '21 16:06 Sandy4321

Sorry, I am still a bit confused about the specific question here. Switching the loss function to "quantile" (or others) in Python is done the same way as setting any command-line argument:

model = pyvw.vw(loss_function="quantile")

Is the question about better documentation for how to configure various vw options in Python? Or is it about how to think about Quantile Regression in general?
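On the second reading of the question (how to think about quantile regression in general), a minimal sketch of the quantile ("pinball") loss may help. It uses only the standard library and is not VW's implementation; the names `pinball_loss` and `mean_pinball_loss` are made up for illustration. The point it shows is that the constant prediction minimizing this loss is the requested quantile of the labels. In VW the quantile level is, as far as I know, set with `--quantile_tau` (default 0.5, i.e. the median).

```python
def pinball_loss(y_true, y_pred, tau):
    """Quantile (pinball) loss for one example, with 0 < tau < 1."""
    diff = y_true - y_pred
    # Under-prediction (diff > 0) costs tau per unit,
    # over-prediction (diff < 0) costs (1 - tau) per unit.
    return tau * diff if diff >= 0 else (tau - 1) * diff


def mean_pinball_loss(ys, pred, tau):
    """Average pinball loss of one constant prediction over all labels."""
    return sum(pinball_loss(y, pred, tau) for y in ys) / len(ys)


ys = [1.0, 2.0, 3.0, 4.0, 100.0]

# Try each observed label as a constant prediction and keep the best one.
best_median = min(ys, key=lambda c: mean_pinball_loss(ys, c, 0.5))
best_q90 = min(ys, key=lambda c: mean_pinball_loss(ys, c, 0.9))

print(best_median)  # 3.0: tau=0.5 recovers the median, ignoring the outlier's size
print(best_q90)     # 100.0: tau=0.9 is pulled toward the upper tail
```

With `tau=0.5` the loss is half the absolute error, which is why the median comes out; other values of `tau` tilt the penalty toward one side.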

lokitoth avatar Jun 15 '21 16:06 lokitoth

Yes, the question is about better documentation for how to configure various VW options in Python.

It would be great to have a full start-to-finish example of quantile regression in Python. For example: given such a data file, the Python code to use is: .....

predicted data is:....

mean absolute error is .... confidence intervals are: ....
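As an illustration of the shape such an example could take, here is a sketch in plain Python that does not use VW at all. It fits three quantile lines by subgradient descent on the pinball loss, then reports the mean absolute error of the median line and how often the labels fall between the 10% and 90% lines. The synthetic data, the helper `fit_quantile_line`, the step-size schedule and the step count are all assumptions chosen for the sketch. Note that the band between two quantile lines is a prediction interval rather than a confidence interval in the strict sense, and that the numbers are computed on the training data.

```python
import random


def fit_quantile_line(xs, ys, tau, steps=1000):
    """Fit y ~ w*x + b by subgradient descent on the mean pinball loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for t in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            # Subgradient of the pinball loss with respect to the prediction.
            g = -tau if y > w * x + b else 1.0 - tau
            grad_w += g * x
            grad_b += g
        lr = 0.5 / (1 + t) ** 0.5  # decaying step size
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Synthetic data: y = 2x + 1 plus uniform noise in [-1, 1].
random.seed(0)
n = 200
xs = [-1 + 2 * i / (n - 1) for i in range(n)]
ys = [2 * x + 1 + random.uniform(-1, 1) for x in xs]

# One (slope, intercept) pair per quantile level.
lines = {tau: fit_quantile_line(xs, ys, tau) for tau in (0.1, 0.5, 0.9)}


def predict(tau, x):
    w, b = lines[tau]
    return w * x + b


# Mean absolute error of the median line.
mae = sum(abs(y - predict(0.5, x)) for x, y in zip(xs, ys)) / n
# Fraction of labels inside the 10%-90% band (about 0.8 if the fit is good).
coverage = sum(predict(0.1, x) <= y <= predict(0.9, x) for x, y in zip(xs, ys)) / n

print("median line (w, b):", lines[0.5])
print("mean absolute error:", mae)
print("10%-90% interval coverage:", coverage)
```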

since a general description always leaves something unclear.

Some effort has been made in this direction, for example https://vowpalwabbit.org/docs/vowpal_wabbit/python/latest/vowpalwabbit.pyvw.html

from vowpalwabbit import pyvw
vw1 = pyvw.vw('--audit')
vw2 = pyvw.vw(audit=True, b=24, k=True, c=True, l2=0.001)
vw3 = pyvw.vw("--audit", b=26)
vw4 = pyvw.vw(q=["ab", "ac"])
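The calls above suggest a simple convention for turning keyword arguments into command-line flags. The sketch below spells out that convention as I understand it; `kwargs_to_cli` is a hypothetical helper written for this illustration, not the library's actual code. `quantile_tau` is assumed to correspond to VW's `--quantile_tau` flag.

```python
def kwargs_to_cli(**options):
    """Hypothetical helper: render keyword options as a VW-style command line."""
    parts = []
    for name, value in options.items():
        # One-letter names take a single dash (-b), longer names take two (--l2).
        flag = ("-" if len(name) == 1 else "--") + name
        if isinstance(value, bool):
            if value:  # True means "pass the bare flag"; False means "omit it"
                parts.append(flag)
        elif isinstance(value, (list, tuple)):
            for item in value:  # a list repeats the flag once per item
                parts.append(f"{flag} {item}")
        else:
            parts.append(f"{flag} {value}")
    return " ".join(parts)


print(kwargs_to_cli(audit=True, b=24, k=True, c=True, l2=0.001))
# --audit -b 24 -k -c --l2 0.001
print(kwargs_to_cli(q=["ab", "ac"]))
# -q ab -q ac
print(kwargs_to_cli(loss_function="quantile", quantile_tau=0.9))
# --loss_function quantile --quantile_tau 0.9
```

Read this way, any option on the command-line arguments wiki page can be tried as a keyword argument with the leading dashes removed.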

but it would be really great to have a full Python code example.

thanks a lot for taking care

Sandy4321 avatar Jun 15 '21 18:06 Sandy4321

I was able to find only very limited examples in Python; this one, https://vowpalwabbit.org/tutorials/python_first_steps.html, is very concise.

Sandy4321 avatar Jun 15 '21 18:06 Sandy4321

At least something like this, but for quantile regression:

https://pypi.org/project/vowpalwabbit/

import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from vowpalwabbit.sklearn_vw import VWClassifier

# generate some data
X, y = datasets.make_hastie_10_2(n_samples=10000, random_state=1)
X = X.astype(np.float32)

# split train and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=256)

# build model
model = VWClassifier()
model.fit(X_train, y_train)

# predict model
y_pred = model.predict(X_test)

# evaluate model
model.score(X_train, y_train)
model.score(X_test, y_test)

Sandy4321 avatar Jun 15 '21 18:06 Sandy4321

By the way, at the bottom of this page, https://pypi.org/project/vowpalwabbit/, there is the line "python/examples : example python code and jupyter notebooks to demonstrate functionality".

Could you clarify how to find this folder?

Sandy4321 avatar Jun 15 '21 18:06 Sandy4321

All right, I found this folder.

Then it would be great to share an example for quantile regression in this style: https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/python/examples/poisson_regression.ipynb

Sandy4321 avatar Jun 15 '21 18:06 Sandy4321

Yes, it is not entirely clear. It is referring to the directories at the same location as that text file in the repository, which makes it extra confusing in the pypi.org documentation. For a slightly better experience, you can see those docs over here: https://github.com/VowpalWabbit/vowpal_wabbit/tree/master/python

The readme is in vowpal_wabbit/python/README.rst https://github.com/VowpalWabbit/vowpal_wabbit/tree/master/python

The python/examples would be in vowpal_wabbit/python/examples https://github.com/VowpalWabbit/vowpal_wabbit/tree/master/python/examples

Tests: https://github.com/VowpalWabbit/vowpal_wabbit/tree/master/python/tests

lalo avatar Jun 15 '21 18:06 lalo

We also have these autogen docs: https://vowpalwabbit.org/docs/ https://vowpalwabbit.org/docs/vowpal_wabbit/python/latest/

lalo avatar Jun 15 '21 18:06 lalo

In the SciKit case, the configuration options are passed in the same way as for pyvw:

So, if you want to have VWClassifier run with quantile loss, you would specify:

classifier_model = VWClassifier(loss_function='quantile')

#or

regressor_model = VWRegressor(loss_function='quantile')

Here is a deep link for VWClassifier and one for VWRegressor to the class documentation

I suspect that we probably will not make a specific tutorial for just quantile loss because it seems like there would be a lot of tutorials that only differ from one-another by the specific combination of options they use. Would a general tutorial about how to pass options to VW when using in Python in pyvw / scikit modes make sense here @Sandy4321, or alternatively a tutorial that explores the various things you can do in the context of regression specifically?

lokitoth avatar Jun 15 '21 19:06 lokitoth

At this link, https://vowpalwabbit.org/docs/vowpal_wabbit/python/latest/vowpalwabbit.sklearn.html#vowpalwabbit.sklearn_vw.VWRegressor, I see no example for the regressor, though there is an example for the classifier.

Sandy4321 avatar Jun 15 '21 19:06 Sandy4321

> Would a general tutorial about how to pass options to VW when using in Python in pyvw / scikit modes make sense here @Sandy4321, or alternatively a tutorial that explores the various things you can do in the context of regression specifically?

Yes, it would be great to have one.

For example, in https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/python/tests/test_sklearn_vw.py

def test_lrq(self):
    X = ['1 |user A |movie 1',
         '2 |user B |movie 2',
         '3 |user C |movie 3',
         '4 |user D |movie 4',
         '5 |user E |movie 1']
    model = VW(convert_to_vw=False, lrq='um4', lrqdropout=True, loss_function='quantile')
    assert getattr(model, 'lrq') == 'um4'
    assert getattr(model, 'lrqdropout')
    model.fit(X)
    prediction = model.predict([' |user C |movie 1'])
    assert np.allclose(prediction, [3.], atol=1)
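The strings in the test above are examples in VW's text input format: an optional label, then one `|namespace features...` group per namespace. A small sketch of how such lines could be built from Python data follows; `to_vw_line` is a hypothetical helper for illustration only and covers just this simple case (no feature values, importance weights or tags).

```python
def to_vw_line(label, namespaces):
    """Hypothetical helper: build one VW text-format example.

    namespaces maps a namespace name to a list of feature tokens.
    Pass label=None for an unlabeled (prediction-only) example.
    """
    head = "" if label is None else str(label)
    groups = []
    for name, features in namespaces.items():
        tokens = " ".join(str(f) for f in features)
        groups.append(f"|{name} {tokens}")
    return head + " " + " ".join(groups)


print(repr(to_vw_line(1, {"user": ["A"], "movie": [1]})))
# '1 |user A |movie 1'
print(repr(to_vw_line(None, {"user": ["C"], "movie": [1]})))
# ' |user C |movie 1'
```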

It is not at all clear what lrq='um4' means: why um4, what um4 is, and how people who are not familiar with VW and are only starting to learn it can find answers to questions like this.

It is difficult to search Google for the meaning of lrq, since it is only three letters.
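For what it is worth, my understanding is that `--lrq` stands for "low-rank quadratic" interactions and that its argument is read as two namespace letters followed by a rank, so `um4` would pair the namespaces starting with `u` (user) and `m` (movie) at rank 4. The sketch below only illustrates that reading; `parse_lrq` is a hypothetical helper, not part of VW.

```python
def parse_lrq(spec):
    """Split an --lrq argument such as 'um4' into (namespace_a, namespace_b, rank)."""
    if len(spec) < 3 or not spec[2:].isdigit():
        raise ValueError(f"expected two namespace letters and a rank, got {spec!r}")
    # First character: first namespace; second character: second namespace;
    # the remaining digits: rank of the low-rank interaction.
    return spec[0], spec[1], int(spec[2:])


print(parse_lrq("um4"))  # ('u', 'm', 4)
print(parse_lrq("um7"))  # ('u', 'm', 7)
```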

Sandy4321 avatar Jun 15 '21 19:06 Sandy4321

Even Stack Overflow cannot help: https://stackoverflow.com/questions/44298795/one-time-vs-iteration-model-in-vowpal-wabbit-with-lrq-option

Sandy4321 avatar Jun 15 '21 19:06 Sandy4321

> why um4 , what is it um4 and how to find answer on this kind of questions for people who is not familiar with VW but only starting to learn VW

The command-line options documentation is fairly sparse here, but here are some links to get you started with LRQ:

Putting together a more coherent list of issues that can be addressed from this:

  • [ ] Tutorial/documentation focusing on configuring VW under the Python interface (and how things map from the command-line arguments wiki page)
  • [ ] Index of samples/example code/in-repo tutorials, ideally in a way that people can browse by "toolkit part" (we need a better term for this, but generally some way to get to a sample from a given portion of the command line argument)

lokitoth avatar Jun 16 '21 14:06 lokitoth

Great, thanks for the help.

So lrq='um4' is not related to loss_function='quantile' in model = VW(convert_to_vw=False, lrq='um4', lrqdropout=True, loss_function='quantile').

My guess is that regularization (L1 or L2) could also be added to this line, model = VW(convert_to_vw=False, lrq='um4', lrqdropout=True, loss_function='quantile'), similar to the use of --l2 in:

```
@${VW} --loss_function quantile -l 0.1 -b 24 --passes 100 \
  -k --cache_file [email protected] -d $(word 2,$+) --holdout_off \
  --power_t 0.333 --l2 1.25e-7 --lrq um7 --adaptive --invariant -f [email protected]
```
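One way to check that guess is to read the flags of that command as keyword arguments, in the same spirit as the `l2=0.001` call in the pyvw documentation quoted earlier in this thread. The sketch below is only an illustration: `cli_to_kwargs` is a hypothetical helper, the Makefile-specific parts (`$(word 2,$+)`, the cache and model file names) are left out, and all values stay strings.

```python
import shlex


def cli_to_kwargs(command_line):
    """Hypothetical helper: turn VW-style flags into a keyword-argument dict.

    Simplification: a token starting with '-' is always treated as a flag,
    so negative numeric values are not handled.
    """
    tokens = shlex.split(command_line)
    kwargs = {}
    i = 0
    while i < len(tokens):
        name = tokens[i].lstrip("-")
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
        if has_value:
            kwargs[name] = tokens[i + 1]  # flag followed by its value
            i += 2
        else:
            kwargs[name] = True  # bare flag such as --adaptive
            i += 1
    return kwargs


options = cli_to_kwargs(
    "--loss_function quantile -l 0.1 -b 24 --passes 100 --holdout_off "
    "--power_t 0.333 --l2 1.25e-7 --lrq um7 --adaptive --invariant"
)
print(options)
```

The resulting dictionary contains `l2` and `lrq` side by side with `loss_function`, which suggests the regularization option is just one more keyword next to the others.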

In general, VW is a really great package! But for Python users it would be crucial to have start-to-finish examples coded in Python, starting from reading data from a file and ending with a demonstration of performance quality,

since for a Python coder, having to understand a Makefile like https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/demo/movielens/Makefile makes the task impossible.

Sandy4321 avatar Jun 16 '21 21:06 Sandy4321