explorer
Introduce `_with` APIs
The goal is to introduce `filter_with`, `summarize_with`, `mutate_with`, `arrange_with`, and `distinct_with`.
Attack plan
- [x] Support `filter_with` with row-based series operations
- [x] Support `summarize_with` with aggregation-based series operations
- [x] Support `mutate_with` with row, group, and aggregation-based series operations
- [ ] Support `arrange_with`
- [ ] Support `distinct_with`
- [ ] Decide on #224
This will unblock us to fully tackle #223, #227, and #245.
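To make "row-based series operations" concrete, here is a minimal sketch of the callback shape. It does not use Explorer: `FilterSketch` is a hypothetical module, and the "data frame" is assumed to be a plain map of column name to list of values. The only point is that the callback receives the frame and returns one boolean per row.

```elixir
defmodule FilterSketch do
  # Hypothetical stand-in: a "data frame" is a map of column name => list of values.
  def filter_with(df, fun) when is_function(fun, 1) do
    # The callback returns one boolean per row (a row-based operation).
    mask = fun.(df)

    Map.new(df, fn {name, values} ->
      # Keep only the values whose mask entry is exactly `true`.
      kept = for {value, true} <- Enum.zip(values, mask), do: value
      {name, kept}
    end)
  end
end

df = %{"a" => [1, 2, 3], "b" => ["x", "y", "z"]}
filtered = FilterSketch.filter_with(df, fn df -> Enum.map(df["a"], &(&1 > 1)) end)
IO.inspect(filtered)
# prints %{"a" => [2, 3], "b" => ["y", "z"]}
```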
Complications
`arrange`/`distinct` introduce one particular issue. We have added the `_with` suffix to disambiguate the macro API from the non-macro API. This was easy because the non-macro API for `mutate`/`summarize`/`filter` is function based. However, `arrange`/`distinct` already have a non-macro API that is not function based, for example:
arrange(df, desc: "my_field")
But we also want to support this:
arrange(df, desc: my_field)
We have three choices:
- Keep `arrange(df, desc: "my_field")` and `arrange(df, desc: my_field)` under the same function/arity. This may be doable but it may also raise ambiguities. For example, should we allow `arrange(df, desc: my_field, asc: "another-field")`?
- Move the non-macro API to `arrange_with`, which will support keywords or functions, such as `arrange_with(df, desc: "my_field")`.
- Remove the `arrange(df, desc: "my_field")` version. People can either use `arrange(df, desc: my_field)` or `arrange_with(df, fn df -> [desc: df["my_field"]] end)`.
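For reference, the function-based form from the last choice could behave roughly like the sketch below. It is only an illustration under stated assumptions: `ArrangeSketch` is a hypothetical module, the "data frame" is a plain map of column name to list of values rather than an Explorer data frame, and only a single sort key is handled. It relies on `Enum.sort_by/3` accepting `:asc`/`:desc` (Elixir 1.10+).

```elixir
defmodule ArrangeSketch do
  # Hypothetical stand-in: a "data frame" is a map of column name => list of values.
  def arrange_with(df, fun) when is_function(fun, 1) do
    # The callback receives the frame and returns [direction: column_values].
    # Only one sort key is supported in this sketch.
    [{direction, keys}] = fun.(df)

    # Compute the row order from the sort key, then apply it to every column.
    order =
      keys
      |> Enum.with_index()
      |> Enum.sort_by(fn {key, _idx} -> key end, direction)
      |> Enum.map(fn {_key, idx} -> idx end)

    Map.new(df, fn {name, values} ->
      {name, Enum.map(order, &Enum.at(values, &1))}
    end)
  end
end

df = %{"my_field" => [2, 3, 1], "other" => ["b", "c", "a"]}
sorted = ArrangeSketch.arrange_with(df, fn df -> [desc: df["my_field"]] end)
IO.inspect(sorted)
# prints %{"my_field" => [3, 2, 1], "other" => ["c", "b", "a"]}
```

Because the callback returns values derived from the frame instead of column names, the string-versus-identifier ambiguity from the first choice does not come up in this form.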
EDIT: `distinct` has further complications, because the columns are passed as options and we will have to revisit that.
@josevalim I'm going to start `summarize_with` operations. I believe `filter_with` can be considered done. WDYT?
cc/ @cigrainger