datar
`KeyError: 0` when using `filter()` on dataframe with one column
I am trying to iterate over all columns of a dataframe and keep only
the values that meet a condition, but I get a `KeyError: 0`.
Code to replicate
mpg_location = 0
mtcars >> select(f[mpg_location]) >> filter(f[mpg_location]>21)
I am actually a little hesitant to allow selecting columns by index in an evaluating context.
Because indices wrapped by `f` in an evaluating context are expanded into sequences (e.g. `f[:3]` becomes `[0, 1, 2]`), a condition that contains something like `somefunc(f[:3])` would be ambiguous: is `somefunc()` handling the first 3 columns of the data frame, or the list `[0, 1, 2]`?
For a single number wrapped by `f`, rather than a slice, people may think that it should select a column instead of being evaluated (since you could write `0` directly instead of `f[0]`). However, this could bring confusion as well: why would `f[1]` select a column when `f[:1]` does not?
This confusion does not exist in a selecting context. For example, `df >> select(f[:3])` will always select the first 3 columns.
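To make the difference between the two contexts concrete, here is a minimal pure-Python sketch. It is a toy model only: `expand`, `select_context`, `evaluate_context` and the small `table` are hypothetical names made up for illustration, not datar's actual implementation.

```python
# Toy model only: these helpers are hypothetical, not datar's implementation.
table = {
    "mpg": [21.0, 22.8, 18.7],
    "cyl": [6, 4, 8],
    "disp": [160.0, 108.0, 360.0],
    "hp": [110, 93, 175],
}


def expand(index, n_columns):
    """Expand an int or a slice into a list of positions, e.g. [:3] -> [0, 1, 2]."""
    if isinstance(index, slice):
        return list(range(*index.indices(n_columns)))
    return [index]


def select_context(table, index):
    """Selecting context: positions can only refer to columns."""
    names = list(table)
    return {names[i]: table[names[i]] for i in expand(index, len(names))}


def evaluate_context(table, index):
    """Evaluating context: the expanded positions are passed on as plain values."""
    return expand(index, len(table))


selected = select_context(table, slice(None, 3))
evaluated = evaluate_context(table, slice(None, 3))

print(list(selected))  # ['mpg', 'cyl', 'disp']
print(evaluated)       # [0, 1, 2]
```

In the selecting case there is only one sensible reading of the positions. In the evaluating case a downstream function receives `[0, 1, 2]` and cannot tell whether columns or numbers were meant.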
See also dplyr's docs for `select()` and `filter()` about the two contexts (tidy-select and data-masking).
The best practice is to always use column names in an evaluating context.
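If only a position is known, one way to follow this advice is to resolve the position to a column name once and use the name from then on. The sketch below shows the idea with plain Python data. The `columns` and `rows` values are an invented stand-in for a data frame, not real `mtcars` data.

```python
# Toy stand-in for a data frame; the values are made up for illustration.
columns = ["mpg", "cyl", "disp"]
rows = [
    {"mpg": 21.0, "cyl": 6, "disp": 160.0},
    {"mpg": 22.8, "cyl": 4, "disp": 108.0},
    {"mpg": 18.7, "cyl": 8, "disp": 360.0},
]

mpg_location = 0
mpg_name = columns[mpg_location]  # resolve the position to a name once

# Select the column by name, then filter on the same name.
# The comparison is strict, so a value of exactly 21 is dropped.
kept = [{mpg_name: row[mpg_name]} for row in rows if row[mpg_name] > 21]

print(kept)  # [{'mpg': 22.8}]
```

With datar, the analogous step would presumably be to look the name up from the data frame's columns (for example `mtcars.columns[mpg_location]`, assuming the pandas backend) and then refer to it as `f[mpg_name]` in both `select()` and `filter()`.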