Michael Chow

Results 189 issues of Michael Chow

in siuba.dply.verbs, lots of this repeated right now...

```
@unite.register(DataFrameGroupBy)
def _unite_gdf(__data, *args, **kwargs):
    # TODO: consolidate these trivial group by dispatched funcs
    groupings = __data.grouper.groupings
    df = __data.obj
    f_unite...
```

time:1
.user-experience
.api-developer

Methods like fast_mutate, fast_summarize, and fast_filter offer a simple way for users to perform operations on a grouped DataFrame. Importantly, these work at around the same speed it would take...

.epic

**Note: if implemented, this change seems impactful enough that an ADR should be written for it** For example, in the example below I create a LazyTbl for mtcars data. However,...

.epic
type:research
core:siu

Currently, tidyselection in siuba works by returning a dictionary mapping : . [See code](https://github.com/machow/siuba/blob/main/siuba/dply/verbs.py#L556-L566) However, this creates a number of issues: * Duplicate names make excluding a column impossible:...
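A rough illustration of the duplicate-name problem in plain Python, assuming the selection is keyed by column name (the exact mapping is not shown in the excerpt above). The column labels and the position-keyed alternative are hypothetical and are not siuba's implementation.

```python
# Hypothetical column labels containing a duplicate name.
column_names = ["a", "a", "b"]

# A selection keyed by name collapses the two "a" columns into one entry,
# so there is no way to refer to just one of them.
by_name = {name: name for name in column_names}

# Keying by position keeps every column distinct (an illustrative alternative).
by_position = {pos: name for pos, name in enumerate(column_names)}

# With positions, excluding only the first "a" column is straightforward.
remaining = [name for pos, name in by_position.items() if pos != 0]

print(len(by_name), len(by_position), remaining)
```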

.epic
time:5
impact:3
core:siuba

Seems likely that if we want to reduce the size of SQL syntax generated, then subsequent queries will need to know if the prev uses distinct. (Alternatively, we can probably...

.help wanted
be:sql
type:research

e.g.

```
from siuba.data import mtcars
from siuba import *

mtcars >> summarize(avg_mpg = _.mpg.mean(), avg_kpg = _.avg_mpg * 1.6)
```
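A minimal sketch of the requested behavior in plain Python, not siuba's implementation: each summary is evaluated in keyword order and can read the results computed before it. The helper `sequential_summarize` and the sample values are hypothetical.

```python
from statistics import mean


def sequential_summarize(columns, **summaries):
    """Evaluate summaries in keyword order; later ones can see earlier results."""
    results = {}
    for name, func in summaries.items():
        # Each function receives the raw columns plus the results so far.
        results[name] = func(columns, results)
    return results


# Hypothetical sample values, not the real mtcars data.
columns = {"mpg": [20.0, 22.0, 24.0]}

out = sequential_summarize(
    columns,
    avg_mpg=lambda cols, prev: mean(cols["mpg"]),
    # Refers to the summary computed on the previous line.
    avg_kpg=lambda cols, prev: prev["avg_mpg"] * 1.6,
)
print(out)
```

This relies on keyword arguments preserving their order, which Python guarantees for `**kwargs`.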

type:feature
time:5
dplyr:parity
api:verb

This module implements convenience functions for working with pandas factors. As such, all of its behaviors are useful outside of siuba. Its responsibilities look like... **outside siuba**: convert Series/Cats (or...

.epic
core:siu
type:refactor

e.g.

```
df >> mutate(a = 1, a = 2, a = 3)
```

In Python this produces a SyntaxError (meaning this code fails at parse time). To work around,...
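The failure can be reproduced without siuba, since it comes from Python's rule against repeated keyword arguments in a single call. The snippet below is only a demonstration of that rule, using the built-in `dict` as a stand-in for `mutate`; it compiles the call without running it.

```python
# A call with a repeated keyword, using dict() as a stand-in for mutate().
source = "dict(a=1, a=2, a=3)"

try:
    # compile() only builds the code object; nothing is executed.
    compile(source, "<example>", "eval")
    error_message = None
except SyntaxError as err:
    error_message = err.msg

print(error_message)
```

Because the error is raised before any code runs, the called function never gets a chance to handle the duplicates itself.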

type:feature
be:pandas
be:sql
time:5
dplyr:parity
api:verb