Allowing .diagonal argument to vary in `colpair_map`

Open luifrancgom opened this issue 11 months ago • 0 comments

I know `colpair_map()` is a solution in relation to #42. With `cor`, `colpair_map()` gives the following:

```r
library(corrr)
colpair_map(.data = mtcars, .f = cor, .diagonal = NA)
#> # A tibble: 11 × 12
#>    term     mpg    cyl   disp     hp    drat     wt    qsec     vs      am
#>    <chr>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>
#>  1 mpg   NA     -0.852 -0.848 -0.776  0.681  -0.868  0.419   0.664  0.600 
#>  2 cyl   -0.852 NA      0.902  0.832 -0.700   0.782 -0.591  -0.811 -0.523 
#>  3 disp  -0.848  0.902 NA      0.791 -0.710   0.888 -0.434  -0.710 -0.591 
#>  4 hp    -0.776  0.832  0.791 NA     -0.449   0.659 -0.708  -0.723 -0.243 
#>  5 drat   0.681 -0.700 -0.710 -0.449 NA      -0.712  0.0912  0.440  0.713 
#>  6 wt    -0.868  0.782  0.888  0.659 -0.712  NA     -0.175  -0.555 -0.692 
#>  7 qsec   0.419 -0.591 -0.434 -0.708  0.0912 -0.175 NA       0.745 -0.230 
#>  8 vs     0.664 -0.811 -0.710 -0.723  0.440  -0.555  0.745  NA      0.168 
#>  9 am     0.600 -0.523 -0.591 -0.243  0.713  -0.692 -0.230   0.168 NA     
#> 10 gear   0.480 -0.493 -0.556 -0.126  0.700  -0.583 -0.213   0.206  0.794 
#> 11 carb  -0.551  0.527  0.395  0.750 -0.0908  0.428 -0.656  -0.570  0.0575
#> # ℹ 2 more variables: gear <dbl>, carb <dbl>
colpair_map(.data = mtcars, .f = cor, .diagonal = 1)
#> # A tibble: 11 × 12
#>    term     mpg    cyl   disp     hp    drat     wt    qsec     vs      am
#>    <chr>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>
#>  1 mpg    1     -0.852 -0.848 -0.776  0.681  -0.868  0.419   0.664  0.600 
#>  2 cyl   -0.852  1      0.902  0.832 -0.700   0.782 -0.591  -0.811 -0.523 
#>  3 disp  -0.848  0.902  1      0.791 -0.710   0.888 -0.434  -0.710 -0.591 
#>  4 hp    -0.776  0.832  0.791  1     -0.449   0.659 -0.708  -0.723 -0.243 
#>  5 drat   0.681 -0.700 -0.710 -0.449  1      -0.712  0.0912  0.440  0.713 
#>  6 wt    -0.868  0.782  0.888  0.659 -0.712   1     -0.175  -0.555 -0.692 
#>  7 qsec   0.419 -0.591 -0.434 -0.708  0.0912 -0.175  1       0.745 -0.230 
#>  8 vs     0.664 -0.811 -0.710 -0.723  0.440  -0.555  0.745   1      0.168 
#>  9 am     0.600 -0.523 -0.591 -0.243  0.713  -0.692 -0.230   0.168  1     
#> 10 gear   0.480 -0.493 -0.556 -0.126  0.700  -0.583 -0.213   0.206  0.794 
#> 11 carb  -0.551  0.527  0.395  0.750 -0.0908  0.428 -0.656  -0.570  0.0575
#> # ℹ 2 more variables: gear <dbl>, carb <dbl>
```

Created on 2024-02-27 with reprex v2.1.0

While this approach makes sense for the sample correlation, it's not suitable for the sample covariance (cov).

```r
library(corrr)
colpair_map(.data = mtcars, .f = cov, .diagonal = NA)
#> # A tibble: 11 × 12
#>    term      mpg     cyl   disp      hp     drat      wt     qsec       vs
#>    <chr>   <dbl>   <dbl>  <dbl>   <dbl>    <dbl>   <dbl>    <dbl>    <dbl>
#>  1 mpg     NA     -9.17  -633.  -321.     2.20    -5.12    4.51     2.02  
#>  2 cyl     -9.17  NA      200.   102.    -0.668    1.37   -1.89    -0.730 
#>  3 disp  -633.   200.      NA   6721.   -47.1    108.    -96.1    -44.4   
#>  4 hp    -321.   102.    6721.    NA    -16.5     44.2   -86.8    -25.0   
#>  5 drat     2.20  -0.668  -47.1  -16.5   NA       -0.373   0.0871   0.119 
#>  6 wt      -5.12   1.37   108.    44.2   -0.373   NA      -0.305   -0.274 
#>  7 qsec     4.51  -1.89   -96.1  -86.8    0.0871  -0.305  NA        0.671 
#>  8 vs       2.02  -0.730  -44.4  -25.0    0.119   -0.274   0.671   NA     
#>  9 am       1.80  -0.466  -36.6   -8.32   0.190   -0.338  -0.205    0.0423
#> 10 gear     2.14  -0.649  -50.8   -6.36   0.276   -0.421  -0.280    0.0766
#> 11 carb    -5.36   1.52    79.1   83.0   -0.0784   0.676  -1.89    -0.464 
#> # ℹ 3 more variables: am <dbl>, gear <dbl>, carb <dbl>
```
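
The difference comes down to what each function returns when a column is paired with itself. A quick check on `mtcars`, using only base R (the names `self_cor` and `self_cov` are just illustrative):

```r
# Pair every mtcars column with itself.
self_cor <- sapply(mtcars, function(x) cor(x, x))
self_cov <- sapply(mtcars, function(x) cov(x, x))

# cor(x, x) is 1 for every column, so one constant can fill the diagonal.
all.equal(unname(self_cor), rep(1, ncol(mtcars)))

# cov(x, x) is var(x), which differs from column to column.
all.equal(self_cov, sapply(mtcars, var))
range(self_cov)
```

Because the self-covariances span very different magnitudes, no single value passed to `.diagonal` can stand in for them.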

Having NA or a constant value on the diagonal is not ideal because the desired information here is the sample variance, which is specific to each variable. Instead, users expect a complete sample covariance matrix, similar to the output obtained using:

```r
cov(mtcars)
#>              mpg         cyl        disp          hp         drat          wt
#> mpg    36.324103  -9.1723790  -633.09721 -320.732056   2.19506351  -5.1166847
#> cyl    -9.172379   3.1895161   199.66028  101.931452  -0.66836694   1.3673710
#> disp -633.097208 199.6602823 15360.79983 6721.158669 -47.06401915 107.6842040
#> hp   -320.732056 101.9314516  6721.15867 4700.866935 -16.45110887  44.1926613
#> drat    2.195064  -0.6683669   -47.06402  -16.451109   0.28588135  -0.3727207
#> wt     -5.116685   1.3673710   107.68420   44.192661  -0.37272073   0.9573790
#> qsec    4.509149  -1.8868548   -96.05168  -86.770081   0.08714073  -0.3054816
#> vs      2.017137  -0.7298387   -44.37762  -24.987903   0.11864919  -0.2736613
#> am      1.803931  -0.4657258   -36.56401   -8.320565   0.19015121  -0.3381048
#> gear    2.135685  -0.6491935   -50.80262   -6.358871   0.27598790  -0.4210806
#> carb   -5.363105   1.5201613    79.06875   83.036290  -0.07840726   0.6757903
#>              qsec           vs           am        gear        carb
#> mpg    4.50914919   2.01713710   1.80393145   2.1356855 -5.36310484
#> cyl   -1.88685484  -0.72983871  -0.46572581  -0.6491935  1.52016129
#> disp -96.05168145 -44.37762097 -36.56401210 -50.8026210 79.06875000
#> hp   -86.77008065 -24.98790323  -8.32056452  -6.3588710 83.03629032
#> drat   0.08714073   0.11864919   0.19015121   0.2759879 -0.07840726
#> wt    -0.30548161  -0.27366129  -0.33810484  -0.4210806  0.67579032
#> qsec   3.19316613   0.67056452  -0.20495968  -0.2804032 -1.89411290
#> vs     0.67056452   0.25403226   0.04233871   0.0766129 -0.46370968
#> am    -0.20495968   0.04233871   0.24899194   0.2923387  0.04637097
#> gear  -0.28040323   0.07661290   0.29233871   0.5443548  0.32661290
#> carb  -1.89411290  -0.46370968   0.04637097   0.3266129  2.60887097
```

Would it be possible to display the diagonal elements of the covariance matrix by adding an option to the `.diagonal` argument?
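
As a stopgap, one possible workaround is to fill the diagonal after the fact by applying the same function to each column paired with itself. The following is a minimal sketch, not a proposal for the actual implementation: `fill_diagonal` is a hypothetical helper that is not part of corrr, and the `pairs` table is a base R stand-in that only assumes the `colpair_map()` layout shown above (a `term` column followed by one numeric column per variable).

```r
# Hypothetical helper (not part of corrr): fill the diagonal of a
# term-by-variable table by applying .f to each column paired with itself.
fill_diagonal <- function(pairs, data, .f) {
  for (v in pairs$term) {
    pairs[[v]][pairs$term == v] <- .f(data[[v]], data[[v]])
  }
  pairs
}

# Stand-in for the colpair_map(mtcars, cov) result, built with base R so the
# sketch runs without corrr: pairwise covariances with NA on the diagonal.
m <- cov(mtcars)
diag(m) <- NA
pairs <- data.frame(term = rownames(m), m,
                    row.names = NULL, stringsAsFactors = FALSE)

# cov(x, x) equals var(x), so the diagonal now holds each sample variance.
filled <- fill_diagonal(pairs, mtcars, cov)
```

Under the same layout assumption, `fill_diagonal(colpair_map(mtcars, cov), mtcars, cov)` should give the complete covariance table, though that call has not been verified here.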

luifrancgom, Feb 27 '24 05:02