
Variance-covariance matrix for a multivariate estimate

Open krivit opened this issue 5 years ago • 8 comments

survey::svymean() returns an object of class svystat, which has a vcov() method to obtain not only the variances of the estimates but also their covariances. Is there a way to do that in srvyr?

As far as I can tell, the nearest replacement, srvyr::survey_mean() in summarize(), can take a matrix as its first argument, but it returns a tbl_df and does not provide a way to obtain covariances. Is there some other way to do that?
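For reference, the survey-side behaviour in question can be sketched as follows. This is a minimal example, assuming the `apistrat` data that ships with survey and a stratified design chosen only for illustration; the object names `design`, `m`, `V`, and `se_from_vcov` are hypothetical.

```r
library(survey)

data(api, package = "survey")

# Illustrative stratified design (an assumption, not taken from the question).
design <- svydesign(ids = ~1, strata = ~stype, weights = ~pw, data = apistrat)

# svymean() on a two-variable formula returns a single svystat object.
m <- svymean(~api99 + api00, design)

# vcov() gives the full 2 x 2 matrix: variances on the diagonal,
# the covariance between the two estimated means off the diagonal.
V <- vcov(m)

# The standard errors reported by SE() are the square roots of the diagonal.
se_from_vcov <- sqrt(diag(V))
```

The off-diagonal entry of `V` is the piece that has no counterpart in the `tbl_df` returned by `summarize()`.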

krivit avatar Apr 14 '20 20:04 krivit

There is a mention of the difference in the vignette (https://cran.r-project.org/web/packages/srvyr/vignettes/srvyr-vs-survey.html), but does that mean that the only way to obtain covariances is to go back to survey?

krivit avatar Apr 14 '20 20:04 krivit

Yes, that is currently true. I'm leaving this open because I can imagine revisiting it once I get my head around the advanced dplyr 1.0 features for returning multiple values, but this is a tricky issue.

gergness avatar Jul 18 '20 19:07 gergness

Struggling with what the return would look like. I can imagine a wrapper function that made this easier, but I don't think it's the right return. Something about covariance matrices just doesn't seem to fit within a data.frame to me.

library(srvyr)
data(api, package = "survey")
dstrata <- apistrat %>% as_survey(strata = stype, weights = pw)


dstrata %>%
  summarize(
    api99_mn = survey_mean(api99),
    api00_mn = survey_mean(api00),
    api_cov = list(vcov(survey::svymean(cur_svy()$variables[, c("api99", "api00")], cur_svy())))
  )
#>   api99_mn api99_mn_se api00_mn api00_mn_se
#> 1 629.3948    10.09699 662.2874    9.536132
#>                                   api_cov
#> 1 101.94914, 94.28401, 94.28401, 90.93782
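The wrapper idea could be sketched roughly like this. `survey_vcov_list()` is a hypothetical helper, not part of srvyr; it only factors out the `cur_svy()` call used above and assumes that `cur_svy()` is still resolvable when called from a helper inside `summarize()`.

```r
library(srvyr)

data(api, package = "survey")
dstrata <- apistrat %>% as_survey(strata = stype, weights = pw)

# Hypothetical helper (not a srvyr function): returns the covariance matrix
# of the estimated means of `vars`, wrapped in a list so that it can sit in
# a single cell of the summarize() result.
survey_vcov_list <- function(vars) {
  svy <- cur_svy()                       # survey object for the current group
  x <- svy$variables[, vars, drop = FALSE]
  list(vcov(survey::svymean(x, svy)))
}

res <- dstrata %>%
  summarize(
    api99_mn = survey_mean(api99),
    api00_mn = survey_mean(api00),
    api_cov  = survey_vcov_list(c("api99", "api00"))
  )

# The matrix is recovered from the list column with [[ ]].
api_cov <- res$api_cov[[1]]
```

This keeps the call site short, but the return shape is unchanged: the matrix still lives in a list column.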

gergness avatar Nov 29 '20 17:11 gergness

The way to prevent data.frame() from mangling lists of matrices is to enclose them in I(). For example,

data.frame(mat = I(list(diag(3), diag(2))))
#>            mat
#> 1 1, 0, 0,....
#> 2   1, 0, 0, 1

Alternatively, tibble doesn't mangle by default:

library(tibble)
tibble(mat = list(diag(3),diag(2)))
#> # A tibble: 2 x 1
#>   mat              
#>   <list>           
#> 1 <dbl[,3] [3 × 3]>
#> 2 <dbl[,2] [2 × 2]>
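Getting the matrices back out of such a list column is a matter of `[[ ]]` indexing. A small sketch, where the `id` column and the names `tbl`, `m1`, and `sizes` are made up for illustration:

```r
library(tibble)

# A list column holding one matrix per row; `id` is an illustrative key.
tbl <- tibble(id = c("a", "b"), mat = list(diag(3), diag(2)))

# [[ ]] on the list column returns the stored matrix unchanged.
m1 <- tbl$mat[[1]]

# Per-row summaries of the stored matrices, e.g. their number of rows.
sizes <- vapply(tbl$mat, nrow, integer(1))
```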

Is this what you are looking for?

krivit avatar Nov 29 '20 21:11 krivit

No, but like, isn't it weird to have a matrix stuffed in a column like that? There's nothing attaching the rows/columns of the matrix to the data.frame.

gergness avatar Nov 30 '20 01:11 gergness

Doesn't strike me as particularly weird. It was always possible, if a bit awkward, to put complex objects, including other data frames, into cells of a data frame.

krivit avatar Nov 30 '20 04:11 krivit

Can you place it in the attr() of that tibble/data frame?

skolenik avatar Dec 18 '21 14:12 skolenik

Theoretically yes, but without some design they'd be hard to access and wouldn't behave as you expect.
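A bare-bones sketch of the attribute approach, using the estimates and covariances printed earlier in this thread. The attribute name `"vcov"` and the object names are arbitrary choices for illustration, not a srvyr convention:

```r
# Point estimates, copied from the summarize() output earlier in the thread.
est <- data.frame(api99_mn = 629.3948, api00_mn = 662.2874)

# Covariance matrix, also copied from that output.
V <- matrix(
  c(101.94914, 94.28401,
     94.28401, 90.93782),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("api99", "api00"), c("api99", "api00"))
)

# Attach the matrix as an attribute (the name "vcov" is an arbitrary choice).
attr(est, "vcov") <- V

# Retrieval requires knowing the attribute name; nothing in the printed
# data frame hints that it is there.
V_back <- attr(est, "vcov")
```

Whether the attribute is still present after further data manipulation depends on the operation, so it would need to be checked at each step.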

Do you have code samples of what you currently do with the survey package and/or what you wish srvyr did?

gergness avatar Dec 18 '21 17:12 gergness