Michael Chow

Results 189 issues of Michael Chow

Based on the pandas issue posed in [this tweet](https://twitter.com/BenjaminWolfe/status/1293179672172339200). Here's a quick version of fill.

```python
from pandas.core.groupby import DataFrameGroupBy
from pandas import DataFrame
from siuba.dply.verbs import singledispatch2

@singledispatch2((DataFrame, DataFrameGroupBy))...
```

type:feature
dplyr:parity
api:verb

Right now, a major limitation of joins is that they use a dictionary format behind the scenes. E.g.

```python
{"left_key": "right_key", "left_key2": "right_key2"}
```

This works fine for 99% of...
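
To make the dictionary format concrete, here is a minimal sketch of how a `{left_key: right_key}` mapping could be handed to pandas. It is not siuba's actual implementation: `join_with_key_map` is a hypothetical helper, and the only real API it relies on is `DataFrame.merge` with `left_on` / `right_on`.

```python
import pandas as pd


def join_with_key_map(left, right, on, how="inner"):
    """Join two frames using a {left_key: right_key} mapping (hypothetical helper)."""
    # Dict keys name columns in the left frame, values name columns in the right frame.
    left_keys = list(on.keys())
    right_keys = list(on.values())
    return left.merge(right, how=how, left_on=left_keys, right_on=right_keys)


left = pd.DataFrame({"left_key": [1, 2, 3], "x": ["a", "b", "c"]})
right = pd.DataFrame({"right_key": [2, 3, 4], "y": [20, 30, 40]})

out = join_with_key_map(left, right, {"left_key": "right_key"})
print(out)
```

Because a dict holds each key only once, a mapping like this can pair a given left column with just one right column.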

type:feature
dplyr:parity
api:verb

This is an interesting case study, since

* it doesn't have a grouped method
* the ungrouped method largely operates on numpy array, BUT
* it does a costly check...

type:feature
be:pandas-fast

See PR #238 with current work. Merging it into master, so I (or anyone interested ;) can work on translating the tidyr vignette (#291), at the same time as this...

type:feature
.needs-research
core:siuba
api:verb

Datasets todo:

- [x] : relig_income
- [x] : billboard
- [x] : who
- https://tidyr.tidyverse.org/articles/pivot.html#multiple-observations-per-row-1
- anscombe
- fish_encounters (note, the rest of the vignette data is for pivot_wider)

type:documentation
time:5
type:tests

E.g.

```python
from siuba.data import mtcars
from siuba import _, rename

mtcars >> rename(_.upper())

# equivalent to
mtcars.rename(columns = lambda _: _.upper())
```

The one challenge is deciding whether it...

.epic
.needs-research
api:verb
.api-user

* column operations
  - use pandas series methods
  - transformed for tools like SQL
* verbs
  - singledispatch2 (rename to verb_dispatch?)
  - work on grouped and ungrouped data
  - don't...

type:documentation
.epic
impact:5

Working on better backend documentation. Autodoc does a good job pulling out type signatures for a singledispatch function, but I also want to concatenate all the docs for the different...
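
A rough sketch of the doc-concatenation idea, using the standard library's `functools.singledispatch` as a stand-in. Whether siuba's `singledispatch2` exposes the same `.registry` mapping is an assumption, and `describe` and `combined_doc` are hypothetical names used only for illustration.

```python
import inspect
from functools import singledispatch


@singledispatch
def describe(data):
    """Describe a data object (generic fallback)."""
    raise NotImplementedError(type(data))


@describe.register(list)
def _describe_list(data):
    """list: report the number of elements."""
    return f"list of {len(data)}"


@describe.register(dict)
def _describe_dict(data):
    """dict: report the number of keys."""
    return f"dict with {len(data)} keys"


def combined_doc(dispatcher):
    """Concatenate the docstrings of every registered implementation."""
    sections = []
    # .registry maps each registered class to its implementation function.
    for cls, func in dispatcher.registry.items():
        doc = inspect.getdoc(func)
        if doc:
            sections.append(f"{cls.__name__}\n{doc}")
    return "\n\n".join(sections)


doc_text = combined_doc(describe)
print(doc_text)
```

The combined text could then be assigned to the dispatcher's `__doc__` before the documentation build runs, so autodoc picks up one merged docstring.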

type:documentation
time:5

Currently join has a parameter named "on", which corresponds to the pandas merge method. However, dplyr uses the name "by". I am not against aligning with pandas methods, but this spot...

time:1
dplyr:parity

I'm pretty sure rowwise -> mutate is one of the most important patterns siuba could implement. This is because, unlike in R, Python functions are rarely vectorized. This means that...
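
For illustration, here is roughly what the pattern looks like in plain pandas today, assuming a non-vectorized function that only handles one value at a time. This is not a proposed siuba API; `word_count` and the example frame are made up for the sketch.

```python
import pandas as pd


def word_count(text):
    # A plain Python function: works on a single string, not on a whole Series.
    return len(text.split())


df = pd.DataFrame({"id": [1, 2, 3], "text": ["a b", "c", "d e f"]})

# Apply the function once per row, then attach the result as a new column.
# This is roughly the work a rowwise -> mutate pattern would do.
df = df.assign(n_words=df.apply(lambda row: word_count(row["text"]), axis=1))
print(df)
```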

.epic
api:verb