countrycode
deal with tibbles and 1 column data frames gracefully
We could add here something like...
if (mode(sourcevar) == "list" && length(sourcevar) == 1) {
  sourcevar <- sourcevar[[1]]
}
to essentially cast a single-column tibble or data frame (or single-element list) to a vector and avoid the following warning:
sourcevar must be a character or numeric vector. This error often arises when users pass a tibble (e.g., from dplyr) instead of a column vector from a data.frame (i.e., my_tbl[, 2] vs. my_df[, 2] vs. my_tbl[[2]])
Not 100% sure that's the best thing, so leaving this here simply as an idea.
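For reference, here is a minimal base-R sketch of what that cast would do. The helper name `cast_single_column` is hypothetical and only wraps the snippet above for illustration; it is not part of countrycode.

```r
# Hypothetical helper: unwrap a one-element list (including a one-column
# data frame) into the vector it contains; leave everything else untouched.
cast_single_column <- function(sourcevar) {
  if (mode(sourcevar) == "list" && length(sourcevar) == 1) {
    sourcevar <- sourcevar[[1]]
  }
  sourcevar
}

df <- data.frame(iso = c("CAN", "USA"), stringsAsFactors = FALSE)

is.character(df)                      # FALSE: a data frame is a list of columns
is.character(cast_single_column(df))  # TRUE: the column vector was extracted
cast_single_column(c("CAN", "USA"))   # plain vectors pass through unchanged
```

Note that a data frame with two or more columns is left as-is, so it would still hit the existing check.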
This seems like a good thing overall. Two questions:
- How can we be as explicit as possible w.r.t. tibbles?
- Is it desirable to output data in the same format that we got it in, or would it be better for countrycode to settle on a single output format that is well documented? If the former, do we also insert some code at the end to reformat?
- "tbl_df" %in% class(sourcevar) maybe? Not 100% explicit, but pretty close (I suppose if someone created a class on top of tbl_df and added other characteristics, this would still assume it's a tbl_df).
- I think it would be best to always return a single vector, which can easily be extended by the user for whatever structure they need. This proposal would simply work around an unintended passing of a sourcevar that doesn't quite conform to what's necessary, but can be obviously cast into something appropriate... rather than trying to open up a whole new realm of converting various different data structures.
It could also throw a mandatory warning that tells the user that it's converting sourcevar, so eventually they might learn that they should probably do it themselves first.
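A rough sketch of the "cast, but always warn" idea, combined with the class check discussed above. The function name `cast_with_warning` and the warning text are assumptions for illustration only. To avoid depending on the tibble package, the example fakes a tibble by setting the class vector by hand.

```r
# Hypothetical helper: convert a single-column input to a vector and warn.
cast_with_warning <- function(sourcevar) {
  # For S3 objects this is essentially "tbl_df" %in% class(sourcevar);
  # it is also TRUE for any class built on top of tbl_df.
  is_tibble <- inherits(sourcevar, "tbl_df")
  if (is.list(sourcevar) && length(sourcevar) == 1) {
    kind <- if (is_tibble) {
      "tibble"
    } else if (is.data.frame(sourcevar)) {
      "data frame"
    } else {
      "list"
    }
    warning("sourcevar is a single-column ", kind,
            "; converting it to a vector. ",
            "Consider passing the column itself, e.g. x[[1]].",
            call. = FALSE)
    sourcevar <- sourcevar[[1]]
  }
  sourcevar
}

# Stand-in for a one-column tibble (class vector set manually).
fake_tbl <- structure(list(iso = c("CAN", "USA")),
                      class = c("tbl_df", "tbl", "data.frame"),
                      row.names = c(NA, -2L))

out <- suppressWarnings(cast_with_warning(fake_tbl))
is.character(out)  # TRUE: a plain character vector comes back
```

The output is always a plain vector, in line with the suggestion to settle on a single return format.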
That all sounds very good.
I've now dog-fooded the package a fair amount, and I'm thinking about preparing a release for CRAN using the major improvements we've implemented.
Do you think we should sneak this change in before I hit "GO"? Any other easy changes to make before release?
There are a few regex improvements that I could probably get around to in the next week or so.
Cool. No rush.
My thinking on this has evolved.
I now believe that we should not convert inputs automagically, but that we should rather build in input checks and exit gracefully with a warning and instructions on common fixes.
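As a sketch of that direction: check the input up front and stop with a message that points to the common fix, without converting anything. The name `check_sourcevar` and the exact wording are hypothetical, and using `stop()` rather than `warning()` is an assumption about what "exit gracefully" should mean here.

```r
# Hypothetical input check: no automatic conversion, just a clear failure.
# Note that this sketch also rejects factors and lists of any length.
check_sourcevar <- function(sourcevar) {
  if (!(is.character(sourcevar) || is.numeric(sourcevar))) {
    stop("sourcevar must be a character or numeric vector. ",
         "If you passed a data frame or tibble, extract the column first, ",
         "e.g. x[[1]] or x$name.",
         call. = FALSE)
  }
  invisible(sourcevar)
}

df <- data.frame(iso = c("CAN", "USA"), stringsAsFactors = FALSE)

# Capture the error message instead of halting the script.
msg <- tryCatch(check_sourcevar(df), error = function(e) conditionMessage(e))

check_sourcevar(df[[1]])  # passes silently: the column is a character vector
```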