Michael Chow
in siuba.dply.verbs, lots of this repeated right now...
```
@unite.register(DataFrameGroupBy)
def _unite_gdf(__data, *args, **kwargs):
    # TODO: consolidate these trivial group by dispatched funcs
    groupings = __data.grouper.groupings
    df = __data.obj
    f_unite...
```
Methods like fast_mutate, fast_summarize, and fast_filter offer a simple way for users to perform operations on a grouped DataFrame. Importantly, these work at around the same speed it would take...
**Note: if implemented, this change seems impactful enough that an ADR should be written for it** For example, in the example below I create a LazyTbl for mtcars data. However,...
Currently, tidyselection in siuba works by returning a dictionary mapping : . [See code](https://github.com/machow/siuba/blob/main/siuba/dply/verbs.py#L556-L566) However, this creates a number of issues: * Duplicate names make excluding a column impossible:...
Seems likely that if we want to reduce the size of SQL syntax generated, then subsequent queries will need to know if the prev uses distinct. (Alternatively, we can probably...
e.g.
```
from siuba.data import mtcars
from siuba import *

mtcars >> summarize(avg_mpg = _.mpg.mean(), avg_kpg = _.avg_mpg * 1.6)
```
This module implements convenience functions for working with pandas factors. As such, all of its behaviors are useful outside of siuba. Its responsibilities look like... **outside siuba**: convert Series/Cats (or...
e.g.
```
df >> mutate(a = 1, a = 2, a = 3)
```
In Python this produces a SyntaxError (meaning this code fails at parse time). To work around,...
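The parse-time failure can be checked without siuba installed. This is a minimal sketch using only the built-in `compile()`. The helper name `fails_at_parse_time` is hypothetical and exists only for illustration. Since `compile()` never executes the expression, `df` and `mutate` do not need to be defined.

```python
def fails_at_parse_time(source: str) -> bool:
    """Return True if Python rejects `source` before any of it runs.

    Hypothetical helper for illustration; not part of siuba.
    """
    try:
        # compile() only parses and compiles; it does not evaluate the expression.
        compile(source, "<example>", "eval")
    except SyntaxError:
        return True
    return False


# Repeated keyword argument: rejected with a SyntaxError before execution.
print(fails_at_parse_time("df >> mutate(a = 1, a = 2, a = 3)"))  # True

# Distinct keyword arguments: compiles fine, even though df and mutate are undefined.
print(fails_at_parse_time("df >> mutate(a = 1, b = 2, c = 3)"))  # False
```

Because the error is raised before `mutate` is ever called, nothing inside `mutate` itself can catch or handle the repeated name.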