
comparison with SGDClassifier

Open amueller opened this issue 12 years ago • 13 comments

Hey. Did you compare with SGDClassifier? The results should be quite close to yours.

amueller avatar Jul 30 '13 16:07 amueller

I will compare. The original paper did some comparisons with SGD (not sklearn's implementation) and found that the projection step and adaptive learning rate improved performance.

ejlb avatar Jul 30 '13 17:07 ejlb

The SGD in scikit-learn actually has an adaptive learning rate - it can even be set to be the same as pegasos, I believe. For the projection step, the claims are much milder in the journal version of the paper and in the source code they provide it is commented out. I have not seen a careful analysis of the projection step, though, and would be quite interested in that.

amueller avatar Jul 30 '13 17:07 amueller

After looking it up again, I think you need to set power_t=1 to get the pegasos schedule.
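For reference, a quick sketch showing why `power_t=1` gives the Pegasos schedule: sklearn's `'invscaling'` rate is `eta0 / t**power_t`, which with `power_t=1` and `eta0 = 1/lambda` coincides with the Pegasos rate `1/(lambda*t)`. (The value of `lam` below is hypothetical, purely for illustration.)

```python
lam = 0.0001        # regularisation strength (hypothetical value)
eta0 = 1.0 / lam    # choose eta0 so the two schedules coincide
power_t = 1

for t in range(1, 6):
    eta_invscaling = eta0 / t ** power_t   # sklearn 'invscaling' rate
    eta_pegasos = 1.0 / (lam * t)          # Pegasos rate
    assert abs(eta_invscaling - eta_pegasos) < 1e-9
```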

amueller avatar Jul 30 '13 17:07 amueller

Here are some benchmarks with identical learning rates:

https://raw.github.com/ejlb/pegasos/master/benchmarks/benchmarks.png

Pegasos seems to be slightly more accurate (by about 1%). The only two differences I know of are:

  1. pegasos projection
  2. pegasos trains on randomly drawn examples, so it may get a better generalisation error.

Due to point 2) it is hard to compare speed across iterations.
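For point 1), the projection step is just a rescaling of `w` onto the ball of radius `1/sqrt(lambda)`; a minimal sketch (function name is mine, not from either library):

```python
import numpy as np

def pegasos_project(w, lam):
    """Optional Pegasos projection: rescale w onto the ball of
    radius 1/sqrt(lam). A no-op when w is already inside the ball."""
    radius = 1.0 / np.sqrt(lam)
    norm = np.linalg.norm(w)
    if norm > radius:
        w = w * (radius / norm)
    return w

w = np.array([3.0, 4.0])          # ||w|| = 5
w = pegasos_project(w, lam=1.0)   # radius = 1, so w is rescaled to norm 1
```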

ejlb avatar Aug 06 '13 09:08 ejlb

Wow that looks quite good. I'm quite surprised your implementation is significantly faster than sklearn. Do you have any idea where that could come from? Also, could you please share your benchmark script?

cc @pprett @larsmans

amueller avatar Aug 06 '13 09:08 amueller

You say that training on random samples makes it hard to compare speeds. How so? One iteration of SGD is n_samples many updates, which you should compare against n_samples many updates in pegasos. Or did you compare against single updates here?

amueller avatar Aug 06 '13 09:08 amueller

@amueller SGDClassifier trains on the whole data set at each iteration, I assume? That is probably where the speed increase comes from.

edit: yes true, that would be a good comparison. Will upload the benchmark script

ejlb avatar Aug 06 '13 09:08 ejlb

Ok, but then the plot doesn't make sense. You should rescale it such that the number of weight updates is the same.
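Concretely, the rescaling could look something like this (a sketch with hypothetical variable names): one SGDClassifier iteration performs n_samples updates, so a single-sample-per-step Pegasos run needs n_iter * n_samples steps to cover the same number of weight updates.

```python
# Sketch: equalising total weight updates between the two implementations.
n_samples = 10000   # size of the training set (hypothetical)
sgd_n_iter = 5      # passes over the data for SGDClassifier

# Each SGD pass is n_samples updates, so give Pegasos the same total.
pegasos_steps = sgd_n_iter * n_samples
```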

amueller avatar Aug 06 '13 09:08 amueller

Yeah, will run some with equal weight updates

ejlb avatar Aug 06 '13 10:08 ejlb

Yes, SGDClassifier does

for i in xrange(n_iter):
    shuffle(dataset)
    for x in X:
        update()

It also wastes a little bit of time in each update, checking whether it should do a PA update or a vanilla additive one.

larsmans avatar Aug 06 '13 10:08 larsmans

this makes much more sense:

https://raw.github.com/ejlb/pegasos/master/benchmarks/weight_updates/benchmarks.png

Perhaps batching the pegasos weight updates would retain the slight accuracy boost and improve the training time.
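A rough sketch of what such a mini-batch step could look like, averaging the hinge subgradient over the batch as in the mini-batch variant of the original Pegasos paper (illustration only, not the code in this repo: labels in {-1, +1}, no projection step):

```python
import numpy as np

def pegasos_minibatch_step(w, X_batch, y_batch, lam, t):
    """One mini-batch Pegasos step: average the hinge subgradient over
    the batch, then apply the 1/(lam*t) learning rate. Sketch only."""
    eta = 1.0 / (lam * t)
    k = len(y_batch)
    margins = y_batch * (X_batch @ w)
    active = margins < 1.0  # examples violating the margin
    # Subgradient: lam*w minus the batch-average of y_i * x_i over violators.
    grad = lam * w - (y_batch[active, None] * X_batch[active]).sum(axis=0) / k
    return w - eta * grad

w = np.zeros(2)
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, -1.0])
w = pegasos_minibatch_step(w, X, y, lam=1.0, t=1)
```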

ejlb avatar Aug 06 '13 10:08 ejlb

Yeah, that looks more realistic ;) How did you set alpha and did you set eta0 in the SGD?

amueller avatar Aug 06 '13 21:08 amueller

I used this: SGDClassifier(power_t=1, learning_rate='invscaling', n_iter=sample_coef, eta0=0.01). The full benchmark is here: https://github.com/ejlb/pegasos/blob/master/benchmarks/weight_updates/benchmark.py

ejlb avatar Aug 07 '13 08:08 ejlb