xlearn
Weighted data support
Hello. Thanks for such a great library! Are you going to add weighted input for training?
@timofeev1995 Hi, I think weighted input would not be very hard to add to xLearn. Could you please give me a more detailed description of this feature?
Thanks for your response! Imagine that you have data for a CTR prediction problem like:
feature1 | feature2 | ... | clicks | impressions |
---|---|---|---|---|
You may have hundreds or thousands (or even more) of impressions for a given feature set (for example, feature1 = 0, feature2 = 1, etc.). So if you want to use FFM with the libffm data format, you have to transform the single row described above into thousands of identical lines like
feature1 | feature2 | ... | clicks | impressions |
---|---|---|---|---|
0 | 1 | ... | 1 | 1 |
... | ... | ... | ... | ... |
0 | 1 | ... | 1 | 1 |
where clicks = 1 is your target
and thousands of identical lines like
feature1 | feature2 | ... | clicks | impressions |
---|---|---|---|---|
0 | 1 | ... | 0 | 1 |
... | ... | ... | ... | ... |
0 | 1 | ... | 0 | 1 |
for impressions where no click happened.
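To make the duplication concrete, here is a minimal sketch of that expansion. The `expand_row` helper and the `(field, index, value)` feature encoding are hypothetical and only for illustration; I am also assuming that `impressions` is the total count and `clicks` is the number of those impressions that were clicked.

```python
def expand_row(features, clicks, impressions):
    """Expand one aggregated row into `impressions` libffm-style lines.

    features: list of (field, index, value) triples (hypothetical encoding).
    Assumes `clicks` of the `impressions` were clicked; the rest were not.
    """
    body = " ".join(f"{field}:{index}:{value}" for field, index, value in features)
    positives = [f"1 {body}"] * clicks                  # label 1: clicked
    negatives = [f"0 {body}"] * (impressions - clicks)  # label 0: not clicked
    return positives + negatives


# One aggregated row with 2 clicks out of 5 impressions becomes 5 lines.
lines = expand_row([(0, 0, 1), (1, 1, 1)], clicks=2, impressions=5)
for line in lines:
    print(line)
```

With real traffic the counts are in the thousands, so the file grows by the same factor even though it only contains two distinct lines per feature set.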
Instead of duplicating rows like this, you could put weights into the loss function and the parameter updates (liblinear, for example, has this option) and use a dataset with only 2 lines:
feature1 | feature2 | ... | clicks | impressions |
---|---|---|---|---|
0 | 1 | ... | 0 | x |
0 | 1 | ... | 1 | y |
where x and y are exactly the weights I'm talking about.
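A rough sketch of why the two datasets are equivalent, assuming a plain logistic loss (this is not xLearn's actual code; the function names and the example numbers are made up for illustration): multiplying each row's loss by its weight gives the same total as summing over the duplicated rows.

```python
import math


def log_loss(label, score):
    """Logistic loss for a label in {0, 1} and a raw model score."""
    p = 1.0 / (1.0 + math.exp(-score))
    return -math.log(p) if label == 1 else -math.log(1.0 - p)


def weighted_loss(rows, score):
    """rows: list of (label, weight) pairs that share the same features."""
    return sum(weight * log_loss(label, score) for label, weight in rows)


score = 0.3  # arbitrary model output for this feature set (illustrative)
x, y = 3, 2  # 3 impressions without a click, 2 with a click (illustrative)

expanded = [(0, 1)] * x + [(1, 1)] * y  # one unit-weight row per impression
compact = [(0, x), (1, y)]              # two weighted rows

assert math.isclose(weighted_loss(expanded, score), weighted_loss(compact, score))
```

Since the gradient is linear in the loss, the same weight would scale each row's gradient in the parameter update.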