datar
`KeyError: 0` when using `filter()` on dataframe with one column
I am trying to iterate over all columns of a dataframe and keep only the values that meet a condition, but I get a `KeyError: 0`.
Code to replicate:
mpg_location = 0
mtcars >> select(f[mpg_location]) >> filter(f[mpg_location]>21)
I am actually a little hesitant to allow selecting columns by index in an evaluating context.
Indices wrapped by f in an evaluating context are expanded into sequences (e.g. f[:3] becomes [0, 1, 2]), so a condition containing somefunc(f[:3]) would be ambiguous: is somefunc() handling the first 3 columns of the data frame, or the list [0, 1, 2]?
For a single number wrapped by f, rather than a slice, people may expect it to select a column instead of being evaluated (since you could write 0 directly instead of f[0]). However, that would be confusing as well: why would f[1] select a column when f[:1] does not?
This confusion does not exist in a selecting context: for example, df >> select(f[:3]) will always select the first 3 columns.
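To make the ambiguity concrete, here is a toy model of the two contexts. It is not datar's implementation; COLUMNS, expand(), selecting_context() and evaluating_context() are hypothetical names invented for this sketch.

```python
# Toy model only: NOT datar's implementation. All names are hypothetical.
COLUMNS = ["mpg", "cyl", "disp", "hp"]


def expand(index: slice) -> list:
    """Expand a slice into positions, e.g. [:3] -> [0, 1, 2]."""
    return list(range(*index.indices(len(COLUMNS))))


def selecting_context(index: slice) -> list:
    """In a selecting context, positions can only refer to columns."""
    return [COLUMNS[i] for i in expand(index)]


def evaluating_context(index: slice) -> list:
    """In an evaluating context, the expanded positions are plain numbers."""
    return expand(index)


first_three = slice(None, 3)  # what [:3] passes to __getitem__
print(selecting_context(first_three))   # ['mpg', 'cyl', 'disp']
print(evaluating_context(first_three))  # [0, 1, 2]
```

The same slice yields column names in one context and bare integers in the other, which is why a function receiving it in an evaluating context cannot tell which one the caller meant.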
See also dplyr's docs for select() and filter() on the two contexts (tidy-select and data-masking).
The best practice is to always use column names in an evaluating context.
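If you only have the position, one way to follow this advice is to resolve it to a column name first and use that name everywhere. The sketch below assumes a plain pandas data frame with made-up values standing in for mtcars; mpg_name is a name introduced for illustration, and the datar pipeline in the comment is the form I would expect to work, not something tested here.

```python
import pandas as pd

# Made-up stand-in for mtcars, just enough to show the idea.
df = pd.DataFrame({"mpg": [21.0, 22.8, 18.7], "cyl": [6, 4, 8]})

mpg_location = 0
mpg_name = df.columns[mpg_location]  # resolve the position to a name once

# With datar, the name would then be used in both contexts, e.g.
#   df >> select(f[mpg_name]) >> filter(f[mpg_name] > 21)
# Plain-pandas equivalent of that pipeline:
result = df[[mpg_name]][df[mpg_name] > 21]
print(result)
```

Because the name is looked up outside the pipeline, the same expression works when looping over df.columns, which is what the original question was trying to do.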