`KeyError: 0` when using `filter()` on dataframe with one column

Open · rleyvasal opened this issue 2 years ago · 1 comment

I am trying to iterate over all columns of a dataframe and keep only the values that meet a condition, but I get a `KeyError: 0`.

Code to replicate:

mpg_location = 0
mtcars >> select(f[mpg_location]) >> filter(f[mpg_location]>21)

rleyvasal, Mar 15 '22 02:03

I am actually a little hesitant to allow selecting columns by index in an evaluating context.

In an evaluating context, indices wrapped by f are expanded into sequences (e.g. f[:3] becomes [0, 1, 2]). That makes a condition such as somefunc(f[:3]) ambiguous: is somefunc() handling the first 3 columns of the data frame, or the list [0, 1, 2]?

For a single number wrapped by f, rather than a slice, people may expect it to select a column instead of being evaluated (since you could write 0 directly instead of f[0]). However, this would bring confusion as well: why would f[1] select a column when f[:1] does not?

This confusion does not exist in a selecting context: for example, df >> select(f[:3]) will always select the first 3 columns. See also the dplyr docs for select() and filter() on the two contexts (tidyr-select and data-mask).

The best practice is to always use column names in an evaluating context.
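One way to follow that advice when only a position is known is to translate the index into a column name first and use the name from then on. The snippet below is a minimal sketch in plain pandas, not datar's own API: `df` is a small made-up stand-in for mtcars, and `col` and `result` are names introduced here for illustration.

```python
import pandas as pd

# Small stand-in for mtcars (illustrative values only, not the real dataset).
df = pd.DataFrame({"mpg": [21.0, 22.8, 18.7], "cyl": [6, 4, 8]})

mpg_location = 0

# Resolve the position to a column name once, outside the pipeline.
col = df.columns[mpg_location]

# Plain-pandas counterpart of select(<col>) followed by filter(<col> > 21):
# keep only that column, and only the rows where it exceeds 21.
result = df.loc[df[col] > 21, [col]]
print(result)
```

With datar, the resolved name could presumably be used the same way in both verbs, e.g. select(f[col]) >> filter(f[col] > 21), so the evaluating context only ever sees a column name.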

pwwang, Mar 15 '22 22:03