
Pull request to support type="link" for nnet::multinom models

Open tomwenseleers opened this issue 1 year ago • 26 comments

This adds type="link" for nnet::multinom models (as provided by predict.mblogit, with the logits of the reference category normalized to zero and dropped). It also fixes a bug for type="probs": nnet:::predict.multinom was sometimes dropping the reference category, but not always, so that was clearly a bug. And type="latent" is now derived in a more elegant way, by row-centering the logits obtained with type="link". Some tests and comparisons with emmeans results and binomial GLMs are given in sandbox/test get_predict multinom.R, including some code showing how to obtain CIs using the delta method or, even better, by backtransformation from the link scale. If backtransformed CIs were later supported natively, it could be easier if I made type="link" for nnet::multinom and mclogit::mblogit return the reference response level as an explicit zero column, so that all response levels would come out for all prediction types; backtransformation for both type="link" and type="latent" would then merely require taking the softmax transform (by rowid). I didn't do this for now, but I'm happy to change it later if the need arises.
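To illustrate the relationship between these prediction types, here is a rough sketch derived directly from predicted probabilities (illustration only, not the code in this PR; it just spells out the row-centering and softmax identities described above):

```r
library(nnet)

fit <- multinom(Species ~ Sepal.Length, data = iris, trace = FALSE)
p <- predict(fit, newdata = iris[c(1, 60, 120), ], type = "probs")

# type = "link": logits relative to the reference level ("setosa" here),
# with the reference column normalized to zero and dropped
link <- log(p[, -1] / p[, 1])

# type = "latent": add back an explicit zero column for the reference
# level, then row-center the logits
lat <- cbind(0, link) - rowMeans(cbind(0, link))

# backtransformation: a row-wise softmax recovers the probabilities
softmax <- function(x) exp(x) / sum(exp(x))
all.equal(unname(t(apply(lat, 1, softmax))), unname(p))  # TRUE
```

Because a softmax is invariant to adding a constant to each row, the same backtransform works whether the logits are on the link scale (with a zero column added) or on the row-centered latent scale.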

This solves issue https://github.com/vincentarelbundock/marginaleffects/issues/477

tomwenseleers avatar Sep 03 '22 02:09 tomwenseleers

Thanks!

Do you have an example of the different behaviors for that "probs" bug?

I'll look into this as soon as possible, but it's going to take me a little while. Thanks for your patience.

vincentarelbundock avatar Sep 03 '22 02:09 vincentarelbundock

No worries, take your time, there's no rush! Below is an example showing that with some datasets nnet:::predict.multinom with type="probs" drops the reference level, while for others it returns all outcome levels. One other thing I wasn't 100% sure of is whether predict methods are supposed to accept factors coded as character. Right now I believe I made it assume that factors are already correctly coded in whatever newdata one passes, while datagridcf, I believe, returns a data frame with factors coded as character (though maybe that behaviour should be changed, or character columns could be converted to factor before anything is passed to get_predict?).

library(nnet)
library(splines)  # needed for ns() in the model formula below
library(mclogit)

# EXAMPLE 1

# Dataset: covid variant frequencies ("variant") through time ("collection_date_num") ####
# "count" = actual count per week
dat = read.csv("https://www.dropbox.com/s/u27cn44p5srievq/dat.csv?dl=1")
# simplify example to just 3 categories:
dat = dat[dat$variant %in% c("Alpha", "Beta", "Other"),] 
dat$variant = factor(dat$variant, levels=c("Other", "Beta", "Alpha"))


set.seed(1)
fit_multinom = nnet::multinom(variant ~ ns(collection_date_num, df=2), 
                              weights=count, data=dat)

# here predict.multinom returns predictions for all 3 response levels
nnet:::predict.multinom(fit_multinom, type="probs")[160,]
#      Other       Beta      Alpha 
# 0.86628084 0.01791583 0.11580333 

# EXAMPLE 2

data(mtcars)
dat = mtcars
dat$cyl = factor(dat$cyl)
dat$am = factor(dat$am)

fit1_multinom = multinom(am ~ mpg + cyl, data = dat)

newdata = data.frame(mpg = c(21), cyl = factor(c(4), levels=levels(dat$cyl)))

# here predict of nnet::multinom with type="probs" drops the reference level
nnet:::predict.multinom(fit1_multinom, newdata=newdata, type="probs")
# 1 
# 0.36026 

# mclogit:::predict.mblogit with type="response" always returns all outcome levels
# I fixed get_predict.multinom to display the same behaviour for nnet::multinom models

tomwenseleers avatar Sep 03 '22 08:09 tomwenseleers

My intuition is that we should expect the columns in newdata to be of the same class as the columns in the original data used to fit the model. How does the original predict method behave?

Also, the tests should not be stored in sandbox. They have to be added to the end of inst/tinytest/test-pkg-net.R.

They should not just print the results; they should use tinytest expectations (you'll see examples in the file; but it's easy). That way, every time a change is made, the tests will be run and checked automatically.

vincentarelbundock avatar Sep 03 '22 11:09 vincentarelbundock

I would not call your example a "bug". With a binary outcome, multinom just returns one probability. Since there is only one other possibility, and probabilities sum to 1, the probability for the other category is obviously the complement, and it adds no information.

vincentarelbundock avatar Sep 03 '22 11:09 vincentarelbundock

Given that, let's use the default `predict` method for "probs"

vincentarelbundock avatar Sep 03 '22 11:09 vincentarelbundock

Well, but the reference category probability is always 1 minus the sum of the probabilities for the other categories, by definition, no matter whether you have 2 or >2 outcome levels. And both predict.mblogit and emmeans always return all outcome levels, whether you have 2 or >2. Plus, I think mblogit and multinom models should behave the same way, and multinom models should also behave consistently, not dropping the reference level when one has 2 outcome levels but keeping it when one has >2. Well, that's my feeling anyway...

tomwenseleers avatar Sep 03 '22 11:09 tomwenseleers

And yes, sorry, I just temporarily stored those quick tests in the sandbox to show you that they work. Do you actually need additional tests on top of the ones you already had in inst/tinytest/test-pkg-net.R? Maybe no extra tests are needed... It's also hard to say what the correct result should be anyway: backtransformation should be best, and I showed in that script how to do it, but this has not been implemented in other software like emmeans (which just uses naive symmetric delta-method CIs).

tomwenseleers avatar Sep 03 '22 11:09 tomwenseleers

You make a good point about consistency. I mostly fear that our method will not be as robust as the default method, because nnet:::predict.multinom does a lot more pre- and post-processing than we do. So I'm afraid that we'll hit some corner cases that we can't anticipate (e.g. related to variable classes). I much prefer the "guarantee" of correct results that we get by relying on long-standing functions from the nnet package over a consistency that only adds redundant information (P(Y=1) + P(Y=0) = 1).

Yes, I'd eventually like to reach 100% test coverage, so all new functionality should have tests:

- binomial vs multinomial outcome
- one vs. multi-row `newdata`
- model with factors/characters and numerics
- model in-formula transformation
- comparisons to `emmeans` for `type="link"` and `type="latent"` in both `multinom` and `mblogit`

Edit: if emmeans does not do backtransforms, then we don't include that in the tests. But the naive CI should be compared.

vincentarelbundock avatar Sep 03 '22 11:09 vincentarelbundock

One reason I would prefer to have all outcome levels returned is that transforming between the latent and response scales is a lot easier if they all come out, and always come out the same way (as in emmeans). That was in fact also the reason I was tempted to make type="link" return an explicit zero column for the reference level (that type is not supported by emmeans btw; emmeans only has latent and response for multinomial models). So this partly depends on your plans further down the road for supporting automatic transformation between the link/latent scales and the response scale for multinomial models (and other models like ordinal polr models; emmeans actually defines a lot of extra types for those that are not in the default predict methods).

I could perhaps have another look at what exactly predict.nnet still does on top of what I do?

I should say I have near-zero experience with unit tests (I had to add ".//inst//tinytest//" to the testing_path paths to get the current tests to run, which is probably just down to my lack of experience with how this is supposed to be run), but I'll try to take a look...

tomwenseleers avatar Sep 03 '22 12:09 tomwenseleers

Right. I'm not saying to drop the level for latent or link. I'm saying: use the original "probs". Users should never have to transform that output type.

vincentarelbundock avatar Sep 03 '22 12:09 vincentarelbundock

Just open the inst/tinytest/test-pkg-net.R file. It's super simple.

vincentarelbundock avatar Sep 03 '22 12:09 vincentarelbundock

Yes, I ran inst/tinytest/test-pkg-net.R, but as I mentioned, I could only run it after adding ".//inst//tinytest//" in front of all the testing_path paths; not sure why that was... And OK, I'll leave type="probs" as it is then... The reference level is currently dropped for type="latent" - should I change that and make it return an explicit zero column for the reference level? I think that would be better and would make it easier to transform to the response scale using a softmax transform, without having to add a zero at that stage...

tomwenseleers avatar Sep 03 '22 13:09 tomwenseleers

You probably just need to setwd() but that's not a big deal.

"latent" is a whole new type, so feel free to do whatever you think makes the most sense there.

Thanks!

vincentarelbundock avatar Sep 03 '22 13:09 vincentarelbundock

Ha, OK, I thought that for a package one always assumed the project root directory to be the working directory... If we would like the reference level to come out as zero for type="link", this would mean changing predict.mblogit, where it is now dropped. Would that be OK with you? We are working with a wrapper predict method anyway, so is it a big deal to redefine some of these predict methods? latent is already OK - there all levels come out...

tomwenseleers avatar Sep 03 '22 15:09 tomwenseleers

You just want to add a 0 column to the pred matrix before reshaping?

What does emmeans return with type="link"? Maybe I'm still confused by link vs latent.

vincentarelbundock avatar Sep 03 '22 15:09 vincentarelbundock

Yes indeed, because then one can just take the softmax transform of the predicted values (by rowid) to get back to the response scale. emmeans doesn't have type="link"; it only has type="latent" and type="response". latent is just the row-centered version of link (but with the 0 for the reference level explicitly included).

tomwenseleers avatar Sep 03 '22 15:09 tomwenseleers

Great, thanks. Adding 0 makes sense, then.

vincentarelbundock avatar Sep 03 '22 15:09 vincentarelbundock

For type="probs" I can maybe do something similar: use the original multinom predict method, but add the probability for the reference level if it's dropped? Then it would behave the same as predict.mblogit, and we would have all response levels coming out for all predict types... Maybe I could also rename it to type="response" while I'm at it?

tomwenseleers avatar Sep 03 '22 16:09 tomwenseleers

No thanks. I prefer to keep the original type names, because this is what I do for all other models.

vincentarelbundock avatar Sep 03 '22 16:09 vincentarelbundock

Also, it's trivial to do `function(x) exp(x) / sum(exp(c(0, x)))`, so the convenience argument does not seem very strong. And we could eventually implement a shortcut in transform_post to make this all automatic anyway.
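Written out in full (with a hypothetical name, and treating transform_post support as speculative), that convenience function would look something like:

```r
# Softmax with an implicit zero logit for the reference level (sketch only)
softmax0 <- function(x) {
  z <- c(0, x)           # prepend the reference level's zero logit
  exp(z) / sum(exp(z))   # probabilities for all levels, summing to 1
}

softmax0(c(1, 2))  # three probabilities, reference level first; they sum to 1
```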

vincentarelbundock avatar Sep 03 '22 16:09 vincentarelbundock

OK, so what do you prefer: that I add a zero column for type="link" or not? I feel it's more elegant to do it at that stage than after the fact, as the softmax backtransform would then be the same for latent and link. But up to you...

tomwenseleers avatar Sep 03 '22 17:09 tomwenseleers

Yep, that's fine

vincentarelbundock avatar Sep 03 '22 17:09 vincentarelbundock

To give you a bit of an update on this: in the terminology used in compositional data analysis, a multinomial model is fitted on the additive logratio scale (with one outcome level coded as the reference level), and emmeans uses the centered logratio scale (logits for all outcome levels, centered so that they add up to zero; emmeans calls this "latent"). Aside from those, there are two other scales one could use: the isometric logratio scale (an orthogonalized version of the centered logratio scale, projected onto the simplex, which resolves the collinearity one ends up with on the centered logratio scale, i.e. if the proportions for one category go up, the others go down; https://stats.stackexchange.com/questions/259208/how-to-perform-isometric-log-ratio-transformation) and the logit scale (used for regular binomial GLMs). Apparently the 95% confidence intervals calculated on the isometric logratio scale and then backtransformed to the response scale behave best: they closely match the ones I obtained through simulation, by resampling from the multivariate normal distribution of the parameters fitted on the additive logratio scale. Theoretically that's apparently expected: https://www.dropbox.com/s/saiv4wlkxu0sq91/fox_andersen_2016_effect%20plots%20multinomial%20delta%20method%20on%20logit%20scale.pdf?dl=1. The 95% confidence intervals calculated on a logit scale and then backtransformed also seem to behave reasonably well, and the forward and backward transformations there are much easier (a bit of a pain for the isometric logratio, though doable). The effects package apparently does it that way (see the same Fox & Andersen reference above).
To accommodate these different possibilities I have now incorporated type="response" (probability scale), type="clr" (centered logratio, = "latent" in emmeans), type="ilr" (isometric logratio) and type="logit" (= type="link" in the effects package). But intuitively I didn't understand why the logit scale behaved better than, e.g., the centered logratio scale. So it's all a little more complicated than I thought. I should probably first consult with some statisticians specialised in compositional data analysis. I also still need to write those unit tests.
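For reference, the logratio scales mentioned above relate to each other as in this sketch (illustrative only; the type names follow the naming proposed in this comment, and the ilr contrast basis shown is just one conventional choice):

```r
p <- c(Other = 0.70, Beta = 0.10, Alpha = 0.20)  # one row of predicted probabilities

alr <- log(p[-1] / p[1])      # additive logratio: reference level normalized out
clr <- log(p) - mean(log(p))  # centered logratio ("latent" in emmeans)

# isometric logratio: clr expressed in an orthonormal basis of the
# zero-sum subspace that clr lives in
D <- length(p)
V <- qr.Q(qr(rbind(diag(D - 1), -1)))  # a D x (D-1) orthonormal contrast basis
ilr <- drop(crossprod(V, clr))

# backtransformation to the response scale is a softmax in each case
softmax <- function(x) exp(x) / sum(exp(x))
all.equal(softmax(clr), p)                      # TRUE
all.equal(softmax(drop(V %*% ilr)), unname(p))  # TRUE
```

The practical point is that predictions (and CIs) can be computed on any of these scales and mapped back to probabilities with the same softmax; only the behaviour of the confidence intervals differs between scales.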

tomwenseleers avatar Sep 19 '22 14:09 tomwenseleers

Thanks a lot for all your work and research on this issue!

So it's all a little more complicated than I thought. I should probably consult first with some statisticians specialised in compositional data analysis. Also still need to write those unit tests.

Frankly, this makes me very nervous. I do not currently have the expertise, or more importantly the time to invest to check the statistical validity of this work.

The main reason I did not want to add new predict types is that there are major benefits in terms of correctness in delegating this part of the implementation to the subject-matter experts who wrote the original modeling packages.

At this point, the best way forward might actually be for me to prepare a robust and well-documented extension mechanism. Authors like you could then write a simple function to add new support, and publish it on GitHub or elsewhere.

I could maintain a list of links for users interested in using those extensions.

vincentarelbundock avatar Sep 19 '22 14:09 vincentarelbundock

Well, it's just that other packages do it in different ways: the effects package, for example, constructs confidence intervals on the logit scale and then backtransforms for multinomial models, while the emmeans package constructs naive delta-method confidence intervals on the response scale. I did verify that the results I obtained matched those results... The confidence intervals calculated on the logit scale and then backtransformed would in any case already be much better than the current delta-method confidence intervals on the response scale produced by marginaleffects, so in that sense it would definitely be progress... I am also 100% sure that the predictions I calculate on these different scales are correct (this is very straightforward); the only thing I am not 100% sure about right now is what scale confidence intervals should ideally be constructed on (I think isometric logratio, with logit as second best). What you suggest, allowing users to register their own methods/extensions, I like. That's what the emmeans package does: it has a nicely documented and straightforward way for users to write their own extensions, enabling package authors to support emmeans without changes to the source code of emmeans itself. Something similar would be nice for marginaleffects too: https://cran.r-project.org/web/packages/emmeans/vignettes/xtending.html (other packages like broom etc. also went in that direction).

tomwenseleers avatar Sep 20 '22 21:09 tomwenseleers

@tomwenseleers

I see that you're still pushing commits. Maybe you want to hold off a bit. I'm thinking about an extension mechanism that will make all of our lives much easier.

vincentarelbundock avatar Sep 26 '22 16:09 vincentarelbundock

Thanks again for your work on this! See here: https://github.com/vincentarelbundock/marginaleffects/issues/490

I propose that we create a "library" of useful, user-supplied extensions in the "extensions" vignette. The code in this PR would be a really nice fit to get us started on this project. Let me know what you think in this other thread.

vincentarelbundock avatar Oct 20 '22 12:10 vincentarelbundock