[WIP] Add GLM example with the Negative Binomial distribution.

Open geektoni opened this issue 4 years ago • 15 comments

This example is related to issue #386.

geektoni avatar Aug 16 '20 20:08 geektoni

Nice! can you explain somewhere why negative binomial is appropriate for this data?

jasmainak avatar Aug 16 '20 20:08 jasmainak

> Nice! can you explain somewhere why negative binomial is appropriate for this data?

Sure. I'll expand the description of the example.

geektoni avatar Aug 16 '20 20:08 geektoni

I think this is clearer now.

geektoni avatar Aug 18 '20 09:08 geektoni

Excellent! I am going to try the code out later this afternoon :-)

jasmainak avatar Aug 18 '20 16:08 jasmainak

@geektoni I tried it. Something is off. When I plot the convergence:

>>> glm_neg_bino.plot_convergence()

I see this plot:

[Image: convergence plot]

it should go down monotonically

this could explain why we don't see comparable results

jasmainak avatar Aug 18 '20 22:08 jasmainak

> it should go down monotonically
>
> this could explain why we don't see comparable results

i saw the same issue in convergence plots when i tried to reproduce the R example. had to tweak (increase) learning rates to get it to go down monotonically.

pavanramkumar avatar Aug 19 '20 02:08 pavanramkumar

showing poisson vs neg bin fits in our example could be useful. i also like this example: https://data.library.virginia.edu/getting-started-with-negative-binomial-regression-modeling/

it talks about where poisson assumptions are lacking and how neg bin could be a better choice.
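A quick way to see the point about Poisson assumptions is to compare the sample variance of the counts with their mean: a Poisson model assumes the two are roughly equal, while a ratio well above 1 (overdispersion) is the usual motivation for a negative binomial model. The sketch below is only an illustration; the two count lists are made-up toy data, not the data used in this PR or in the linked article, and `dispersion_ratio` is a hypothetical helper name.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Return the sample variance divided by the sample mean of the counts."""
    return variance(counts) / mean(counts)


# Hypothetical toy counts, made up for illustration only.
# Both lists have a mean of 5, but very different spreads.
roughly_poisson = [2, 5, 4, 9, 5, 3, 8, 6, 2, 6]
overdispersed = [0, 0, 1, 0, 12, 2, 0, 25, 1, 9]

ratio_poisson = dispersion_ratio(roughly_poisson)
ratio_overdispersed = dispersion_ratio(overdispersed)

# A ratio near 1 is compatible with Poisson; a ratio far above 1 is not.
print(f"roughly Poisson: variance / mean = {ratio_poisson:.2f}")
print(f"overdispersed:   variance / mean = {ratio_overdispersed:.2f}")
```

A ratio this crude is not a formal test, but it is a cheap first check before choosing between the two models.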

pavanramkumar avatar Aug 19 '20 02:08 pavanramkumar

> i saw the same issue in convergence plots when i tried to reproduce the R example. had to tweak (increase) learning rates to get it to go down monotonically.

that's a bit odd. I can imagine non-monotonic with larger learning rates but I don't understand why it would be the case for smaller learning rates?
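For reference, the usual expectation (a step size that is too large breaks monotonic descent, a small one does not) can be reproduced on a toy problem. This is a minimal sketch using plain gradient descent on f(w) = w**2; it is not pyglmnet's solver, and the function names and learning rates are chosen only for illustration.

```python
def gradient_descent_losses(learning_rate, w0=1.0, n_iter=20):
    """Run plain gradient descent on f(w) = w**2 and return the loss trace."""
    w = w0
    losses = [w ** 2]
    for _ in range(n_iter):
        grad = 2.0 * w                  # derivative of w**2
        w = w - learning_rate * grad    # gradient step
        losses.append(w ** 2)
    return losses


def is_non_increasing(values):
    """Return True if every value is <= the one before it."""
    return all(b <= a for a, b in zip(values, values[1:]))


# Illustrative learning rates: each step multiplies w by (1 - 2 * learning_rate).
small_step = gradient_descent_losses(learning_rate=0.1)   # factor 0.8, shrinks
large_step = gradient_descent_losses(learning_rate=1.1)   # factor -1.2, grows

print("small step monotone:", is_non_increasing(small_step))
print("large step monotone:", is_non_increasing(large_step))
```

On this convex toy problem only the large step fails the check, which is why non-monotonic convergence at small learning rates is surprising and may point to something else, such as the gradient or loss computation.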

jasmainak avatar Aug 19 '20 02:08 jasmainak

@geektoni I pushed a bunch of fixes. However, I noticed that the link function they use is log which is different from the one we use. So we can't really expect equivalent solutions. I'm wondering if we should wait on #390 to be merged and try with a log link function (should be a couple of lines of code). I think it would be a good validation exercise.
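To make the link-function point concrete: two models that share the same coefficients but use different inverse links predict different means, so fitted coefficients are not directly comparable. The sketch below assumes, for illustration only, that the alternative to the log link is a softplus-style inverse link; the thread does not name the link actually used, and the function names here are hypothetical.

```python
import math


def inverse_log_link(z):
    """Mean under a log link: mu = exp(z)."""
    return math.exp(z)


def inverse_softplus_link(z):
    """Mean under an assumed softplus inverse link: mu = log(1 + exp(z)).

    Written in a numerically stable form that avoids overflow for large |z|.
    """
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# The same linear predictor values mapped through both inverse links.
linear_predictors = [-2.0, 0.0, 2.0]
means_log = [inverse_log_link(z) for z in linear_predictors]
means_softplus = [inverse_softplus_link(z) for z in linear_predictors]

for z, m_log, m_soft in zip(linear_predictors, means_log, means_softplus):
    print(f"z = {z:+.1f}: log link mean = {m_log:.3f}, softplus mean = {m_soft:.3f}")
```

Because exp(z) grows much faster than log(1 + exp(z)), a fit under one link has to use different coefficients to reach the same predicted means as a fit under the other.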

jasmainak avatar Aug 19 '20 03:08 jasmainak

> showing poisson vs neg bin fits in our example could be useful.

this is a great idea too!

jasmainak avatar Aug 19 '20 03:08 jasmainak

> @geektoni I pushed a bunch of fixes. However, I noticed that the link function they use is log which is different from the one we use. So we can't really expect equivalent solutions. I'm wondering if we should wait on #390 to be merged and try with a log link function (should be a couple of lines of code). I think it would be a good validation exercise.

I guess we could add another example (besides this one) in which we show how to tweak an existing model to use a different link function. However, I remember that, at first, we tried coding the negative binomial with a log link function but we ran into numerical instability (e.g., NaNs when computing the gradients).

I think it is fine if pyglmnet offers a model with a different link function than other libraries. We might not be able to compare everything one-to-one, but as long as the other tests/examples pass I think we can be fairly sure they do the same thing.

geektoni avatar Aug 19 '20 08:08 geektoni

> showing poisson vs neg bin fits in our example could be useful.
>
> this is a great idea too!

I've added the Poisson regression to the example. As usual, it plots the learned betas and the convergence plot.

geektoni avatar Aug 19 '20 08:08 geektoni

@geektoni it might be really helpful to reproduce exactly an example from another source (i.e., not pyglmnet). I tried with log link by rebasing this branch against the distribution branch locally. But it didn't change much for me. You could also try the same.

jasmainak avatar Aug 20 '20 22:08 jasmainak

> showing poisson vs neg bin fits in our example could be useful. i also like this example: https://data.library.virginia.edu/getting-started-with-negative-binomial-regression-modeling/
>
> it talks about where poisson assumptions are lacking and how neg bin could be a better choice.

@jasmainak @pavanramkumar I have added another example taken from the link above. This time it seems to me that we are getting the same results!

geektoni avatar Oct 14 '20 15:10 geektoni

Sounds promising, I'm going to look more closely tomorrow :)

jasmainak avatar Oct 15 '20 03:10 jasmainak