Michael Chow issues

Results: 189 issues by Michael Chow

consolidate when dispatching DataFrame function for a DataFrameGroupBy

In siuba.dply.verbs, a lot of this is repeated right now...

```
@unite.register(DataFrameGroupBy)
def _unite_gdf(__data, *args, **kwargs):
    # TODO: consolidate these trivial group by dispatched funcs
    groupings = __data.grouper.groupings
    df = __data.obj
    f_unite...
```

time:1

.user-experience

.api-developer
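One possible shape for the consolidation described in this issue is a small helper that registers the grouped variant of a verb in a single call. The sketch below is only illustrative: `Frame` and `GroupedFrame` are stand-ins for a pandas DataFrame and DataFrameGroupBy, and `register_grouped` is a hypothetical name, not part of siuba.

```python
from functools import singledispatch


class Frame:
    """Stand-in for a DataFrame: a dict of column name -> list of values."""
    def __init__(self, columns):
        self.columns = dict(columns)


class GroupedFrame:
    """Stand-in for a DataFrameGroupBy: a Frame plus grouping column names."""
    def __init__(self, obj, by):
        self.obj = obj
        self.by = list(by)


def register_grouped(verb):
    """Hypothetical helper: register a GroupedFrame method for `verb` that
    calls the Frame implementation on the underlying data, then regroups."""
    @verb.register(GroupedFrame)
    def _grouped(__data, *args, **kwargs):
        result = verb(__data.obj, *args, **kwargs)
        return GroupedFrame(result, __data.by)
    return verb


@singledispatch
def unite(__data, col, *cols, sep="_"):
    raise NotImplementedError(f"unite not implemented for {type(__data)}")


@unite.register(Frame)
def _unite_frame(__data, col, *cols, sep="_"):
    # Join the named columns row-wise into a new column, dropping the originals.
    rows = zip(*(__data.columns[c] for c in cols))
    joined = [sep.join(str(v) for v in row) for row in rows]
    kept = {k: v for k, v in __data.columns.items() if k not in cols}
    return Frame({**kept, col: joined})


# One line replaces a hand-written grouped wrapper for this verb.
register_grouped(unite)

gdf = GroupedFrame(Frame({"g": ["a", "b"], "x": [1, 2], "y": [3, 4]}), by=["g"])
out = unite(gdf, "xy", "x", "y")
```

The trade-off is that every verb registered this way gets identical ungroup, apply, regroup behavior, so verbs that need per-group logic would still need their own implementation.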

Moving fast grouped methods out of experimental

Methods like fast_mutate, fast_summarize, and fast_filter offer a simple way for users to perform operations on a grouped DataFrame. Importantly, these work at around the same speed it would take...

.epic

An object should be able to raise an error, if it doesn't have a verb implementation

**Note: if implemented, this change seems impactful enough that an ADR should be written for it** For example, in the example below I create a LazyTbl for mtcars data. However,...

.epic

type:research

core:siu
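As a rough illustration of the behavior this issue asks for, the sketch below uses `functools.singledispatch` so that a verb fails loudly when no implementation is registered for the object's type. `LazyTable`, `head`, and `try_head` are hypothetical names chosen for the example, and this is not how siuba currently dispatches.

```python
from functools import singledispatch


class LazyTable:
    """Hypothetical stand-in for a lazy SQL table; not siuba's LazyTbl."""


@singledispatch
def head(__data, n=5):
    # Default case: no implementation registered for this type, so fail loudly.
    raise NotImplementedError(
        f"head() has no implementation for {type(__data).__name__}"
    )


@head.register(list)
def _head_list(__data, n=5):
    # Implementation for plain lists, used here as a trivial supported type.
    return __data[:n]


def try_head(obj):
    """Return the error message raised by head(obj), or None on success."""
    try:
        head(obj)
    except NotImplementedError as err:
        return str(err)
    return None


message = try_head(LazyTable())
```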

Use column position rather than name in tidyselection

Currently, tidyselection in siuba works by returning a dictionary mapping : . [See code](https://github.com/machow/siuba/blob/main/siuba/dply/verbs.py#L556-L566) However, this creates a number of issues: * Duplicate names make excluding a column impossible:...

.epic

time:5

impact:3

core:siuba
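To make the duplicate-name problem concrete, the sketch below contrasts a name-keyed mapping with a position-based selection. It uses plain lists rather than pandas, and `select_by_position` is a hypothetical helper, not siuba's selection code.

```python
def select_by_position(columns, drop_positions):
    """Return the positions to keep, given positions to exclude.

    Hypothetical helper for illustration; not siuba's selection code.
    """
    drop = set(drop_positions)
    return [i for i in range(len(columns)) if i not in drop]


columns = ["a", "b", "a"]  # duplicate column name "a"

# A name-keyed mapping collapses the duplicates into a single key,
# so the two "a" columns can no longer be told apart.
by_name = {name: name for name in columns}

# Positions stay distinct, so the second "a" can be excluded on its own.
kept = select_by_position(columns, drop_positions=[2])
kept_names = [columns[i] for i in kept]
```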

LazyTbl needs a distinct attribute?

It seems likely that if we want to reduce the size of the SQL syntax generated, then subsequent queries will need to know whether the previous query uses distinct. (Alternatively, we can probably...

.help wanted

be:sql

type:research

allow referring to previously created columns in summarize

e.g.

```
from siuba.data import mtcars
from siuba import *

mtcars >> summarize(avg_mpg = _.mpg.mean(), avg_kpg = _.avg_mpg * 1.6)
```

type:feature

time:5

dplyr:parity

api:verb
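One way to picture the requested summarize behavior is to evaluate the expressions in order and let each one see the results computed before it. The sketch below does this with plain dicts and callables. It is not siuba's implementation, the `summarize` function here is a hypothetical stand-in, and the data values are made up.

```python
def summarize(data, **exprs):
    """Evaluate each expression in order against the data plus earlier results.

    `data` is a dict of column name -> list of values; each expression is a
    callable taking one namespace dict. Hypothetical sketch, not siuba's verb.
    """
    results = {}
    for name, expr in exprs.items():
        # Earlier summaries shadow columns of the same name.
        namespace = {**data, **results}
        results[name] = expr(namespace)
    return results


mtcars_like = {"mpg": [20.0, 30.0, 40.0]}  # made-up values, not the real mtcars

out = summarize(
    mtcars_like,
    avg_mpg=lambda d: sum(d["mpg"]) / len(d["mpg"]),
    avg_kpg=lambda d: d["avg_mpg"] * 1.6,  # refers to the summary defined just above
)
```

This relies on keyword arguments preserving their order, which Python guarantees from version 3.7 onward.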

make forcats.py its own library

This module implements convenience functions for working with pandas factors. As such, all of its behaviors are useful outside of siuba. Its responsibilities look like... **outside siuba**: convert Series/Cats (or...

.epic

core:siu

type:refactor

Support reassignment multiple times in mutate

e.g.

```
df >> mutate(a = 1, a = 2, a = 3)
```

In Python this produces a SyntaxError (meaning this code fails at parse time). To work around,...

type:feature

be:pandas

be:sql

time:5

dplyr:parity

api:verb
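The parse-time failure mentioned in this issue can be checked without siuba or pandas, since Python rejects a repeated keyword argument before any function is called. The sketch below only compiles the call expression and never runs it. `parses` is a helper name introduced for the example.

```python
def parses(source):
    """Return True if `source` compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# `mutate` does not need to exist here: the expressions are compiled, not run.
repeated = "mutate(a = 1, a = 2, a = 3)"
distinct = "mutate(a = 1, b = 2, c = 3)"

repeated_ok = parses(repeated)
distinct_ok = parses(distinct)
```

Because the error is raised by the compiler, no change inside `mutate` itself can make the repeated-keyword form work. Any workaround has to pass the assignments in a different form.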

SQL group ops - clean up assigning to LazyTbl.group_by

be:sql

time:1

type:refactor

sql as_type conversion should accept strings like pandas method can

type:feature

be:sql

time:3