
Mention compound keys in "how-to-dm-df" tutorial

Open IndrajeetPatil opened this issue 2 years ago • 0 comments

The tutorial should explain why compound keys are needed when no single column qualifies as a primary key (PK) candidate.

Motivating example:

``` r
library(dm)
library(nycflights13)

x <- new_dm(tibble::lst(weather, airports))
dm_enum_pk_candidates(x, "weather")
#> # A tibble: 15 × 3
#>    columns    candidate why                                                     
#>    <keys>     <lgl>     <chr>                                                   
#>  1 origin     FALSE     has duplicate values: JFK (8706), LGA (8706), EWR (8703)
#>  2 year       FALSE     has duplicate values: 2013 (26115)                      
#>  3 month      FALSE     has duplicate values: 5 (2232), 7 (2228), 3 (2227), 1 (…
#>  4 day        FALSE     has duplicate values: 3 (864), 7 (864), 8 (864), 9 (864…
#>  5 hour       FALSE     has duplicate values: 1 (1093), 5 (1092), 8 (1092), 11 …
#>  6 temp       FALSE     has duplicate values: 37.94 (521), 73.94 (509), 73.04 (…
#>  7 dewp       FALSE     has duplicate values: 28.94 (505), 53.96 (490), 32.00 (…
#>  8 humid      FALSE     has duplicate values: 100.00 (286), 54.51 (59), 93.08 (…
#>  9 wind_dir   FALSE     has duplicate values: 310 (1341), 0 (1256), 320 (1204),…
#> 10 wind_speed FALSE     has duplicate values: 9.20624 (2335), 8.05546 (2312), 6…
#> 11 wind_gust  FALSE     has 20778 missing values, and duplicate values: 23.0156…
#> 12 precip     FALSE     has duplicate values: 0.00 (24366), 0.01 (454), 0.02 (2…
#> 13 pressure   FALSE     has 2729 missing values, and duplicate values: 1016.2 (…
#> 14 visib      FALSE     has duplicate values: 10 (21847), 9 (808), 8 (581), 7 (…
#> 15 time_hour  FALSE     has duplicate values: 2013-01-01 01:00:00 (3), 2013-01-…
```

Created on 2022-07-08 by the reprex package (v2.0.1)
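
A possible follow-up for the tutorial, offered only as a sketch: since no single column of `weather` is a PK candidate, a combination of columns can take that role. The choice of `origin` plus `time_hour` below is an assumption for illustration (the output above does not establish it), so the code first counts duplicates of that pair before registering it as a compound key. The names `dup_keys` and `x_pk` are made up for this example.

``` r
library(dm)
library(nycflights13)

x <- new_dm(tibble::lst(weather, airports))

# Assumption: (origin, time_hour) identifies one weather observation.
# Count how often each pair occurs and keep pairs that appear more than once;
# an empty result means the pair is unique across the table.
dup_keys <- dplyr::filter(dplyr::count(weather, origin, time_hour), n > 1)
nrow(dup_keys)

# Register the column pair as a compound primary key.
# check = TRUE asks dm to verify uniqueness and fail otherwise.
x_pk <- dm_add_pk(x, weather, c(origin, time_hour), check = TRUE)

# List the primary keys now defined in the dm object.
dm_get_all_pks(x_pk)
```

If `dup_keys` has rows, the chosen pair is not a valid key and `dm_add_pk()` with `check = TRUE` is expected to raise an error; in that case a different or wider column combination would be needed.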

IndrajeetPatil • Jul 08 '22 07:07