margins icon indicating copy to clipboard operation
margins copied to clipboard

Columns overwritten summarizing margins w/ 'at'

Open leeper opened this issue 6 years ago • 4 comments
trafficstars

Please specify whether your issue is about:

  • [x] a possible bug
  • [ ] a question about package functionality
  • [ ] a suggested code or documentation change, improvement to the code, or feature request

If you are reporting (1) a bug or (2) a question about code, please supply:

  • a fully reproducible example using a publicly available dataset (or provide your data)
  • if an error is occurring, include the output of traceback() run immediately after the error occurs
  • the output of sessionInfo()

Put your code here:

library(margins)                                                                                                                                                                                                                                                                                
library(ggplot2)
library(microbenchmark)
                                                                                                                                                                                                                                                                                                                                
model <- lm(price ~ x + y + z + depth, data = diamonds)
summary(margins(model, at = prediction::mean_or_mode(diamonds[c("x","y","z","depth")])))
 factor      x      y       z   depth       AME      SE      p     lower     upper
  depth 5.7312 5.7345  4.1749 61.7494   27.0247  6.4732 0.0000   14.3375   39.7120
      x 5.7312 5.7345 65.3856 61.7494 2850.4446 43.5944 0.0000 2765.0012 2935.8881
      y 5.7312 5.7345  7.2645 61.7494  230.2577 31.6964 0.0000  168.1339  292.3815
      z 5.7312 5.7345  1.9961 61.7494  110.2523 55.2342 0.0459    1.9953  218.5094

Note how we lose the z statistic column b/c z is one of the at variables. We should cbind() to avoid this.

> sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] microbenchmark_1.4-6 ggplot2_3.1.0        margins_0.3.25      

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0        bindr_0.1.1       magrittr_1.5      MASS_7.3-51.1     tidyselect_0.2.5 
 [6] munsell_0.5.0     colorspace_1.4-0  R6_2.3.0          rlang_0.3.1       plyr_1.8.4       
[11] dplyr_0.7.8       tools_3.5.2       grid_3.5.2        data.table_1.12.0 gtable_0.2.0     
[16] withr_2.1.2       lazyeval_0.2.1    assertthat_0.2.0  tibble_2.0.1      prediction_0.3.10
[21] crayon_1.3.4      bindrcpp_0.2.2    purrr_0.3.0       glue_1.3.0        compiler_3.5.2   
[26] pillar_1.3.1      scales_1.0.0      pkgconfig_2.0.2

leeper avatar Feb 18 '19 17:02 leeper

  • [ ] Fix problem (probably here: https://github.com/leeper/margins/blob/master/R/summary.R#L14-L25)
  • [ ] Add warning about possible misbehavior
  • [ ] Note in documentation that variables corresponding to any of those columns should be avoided if possible
  • [ ] Add a test to ensure correct behavior

leeper avatar Feb 20 '19 11:02 leeper

I have tried trouble shooting the lines you suggested but can't seem to solve it. I was wondering if it isn't a problem with L27 in summary.R? Specifically:

out[, c("factor", at_names, names(out)[!names(out) %in% c("factor", at_names)]

Is this not excluding columns from out that have the same names as columns in at_names i.e. z in the example?

JohnRoelofse avatar Feb 22 '19 21:02 JohnRoelofse

Ooh, you're probably right. That's going to be tricky because it's hard to extract duplicate-named columns from a data frame.

leeper avatar Feb 24 '19 09:02 leeper

Should I add stop() with an error message? Or add in some logic that alters the model variable z to z1?

JohnRoelofse avatar Feb 26 '19 05:02 JohnRoelofse