passport
non-logical datasets shouldn't start with "is_"
Currently is_developed and is_independent return character vectors:
unique(passport:::countries$is_developed)
#> [1] NA "Developed" "Developing"
unique(passport:::countries$is_independent)
#> [1] NA "Yes"
#> [3] "Territory of GB" "International"
#> [5] "Territory of US" "Part of NL"
#> [7] "Part of FI" "Part of FR"
#> [9] "Territory of NO" "Territory of AU"
#> [11] "Associated with NZ" "In contention"
#> [13] "Part of DK" "Crown dependency of GB"
#> [15] "Part of CN" "Commonwealth of US"
#> [17] "Territory of FR" "Territory of NZ"
#> [19] "Territories of US"
If they're going to start with is_, they should really return logical vectors. To address the issue, they could
- drop information to actually return a logical
- get renamed
- be split in two, e.g. is_independent and dependency_status
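As a rough sketch of the third option, the split could look something like the following. The sample vector is made up for illustration, the "Independent" label is an assumption, and dependency_status is only the placeholder name suggested above, not an existing column:

```r
# Hypothetical sample of the values currently stored in is_independent
status <- c(NA, "Yes", "Territory of GB", "Part of NL", "In contention")

# Logical flag: TRUE only for "Yes", FALSE for everything else, NA stays NA
is_independent <- status == "Yes"

# Keep the detail in a separate character vector (name and the
# "Independent" label are assumptions, not part of the package)
dependency_status <- ifelse(is_independent, "Independent", status)

data.frame(is_independent, dependency_status)
```

The first option (dropping information) is just the is_independent line on its own; the split keeps the detail available for anyone who needs it.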
None of these options is really ideal, as the expectation of as_country_code and as_country_name is usually to return a character vector or factor. They are not the only exceptions:
code_types <- sapply(passport:::countries, typeof)
code_types[code_types != 'character']
#> gaul un_region_code
#> "double" "integer"
#> un_subregion_code un_intermediate_region_code
#> "integer" "integer"
#> m49 ldc
#> "integer" "logical"
#> lldc sids
#> "logical" "logical"
Numeric country codes (gaul, un_*_code, m49) are a different issue. Perhaps they should be strings, as they should not be operated upon, but converting them to factors is potentially very confusing and may merit a warning or message.
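To illustrate why factors are confusing here, a small base R sketch with a made-up vector of integer codes (not pulled from the package data). The zero-padding to three digits is an assumption about how string codes might be formatted:

```r
# Hypothetical integer codes standing in for a numeric code column
codes <- c(4L, 8L, 248L)

# As strings, zero-padded to three digits (padding width is an assumption)
codes_chr <- sprintf("%03d", codes)

# As a factor, as.integer() returns the level positions, not the codes
codes_fct <- factor(codes)
level_index <- as.integer(codes_fct)

# Recovering the original codes requires going through character first
recovered <- as.integer(as.character(codes_fct))
```

Here level_index is 1, 2, 3 rather than 4, 8, 248, which is the kind of silent surprise a warning or message would guard against.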
Country groupings (ldc, lldc, sids, un_*_code) will be addressed by #1 (though they face the same type issue).
is_developed and is_independent (plus a lot more) should be split into a separate set of country attributes (#3), but the issue will still have to be addressed within that dataset.
This will be a breaking change, but integrating the change with #3 will minimize disruption.