Ian Rose
add to allowlist
Also, cc @mroeschke, who might be interested in the weird performance characteristics around groupby/agg on categorical columns.
Hmm, seems as if this is a breaking change for people who are using `split_out > 1`. Perhaps the correct thing to do here is to raise a deprecation warning...
Okay, I've backed out the top-level API change here and replaced it with a `FutureWarning`. After a couple of releases, I'd say it would be okay to do the following:...
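For anyone following along, the warn-now, change-later pattern looks roughly like this. It is a minimal sketch, not the code in this PR: `aggregate`, its `sort` keyword, and the `_UNSET` sentinel are hypothetical names used only for illustration.

```python
import warnings

# Hypothetical sentinel: lets the function tell "argument not passed"
# apart from an explicitly supplied value.
_UNSET = object()


def aggregate(values, sort=_UNSET):
    """Hypothetical stand-in for an API whose default is scheduled to change."""
    if sort is _UNSET:
        # Warn only callers who rely on the default; explicit callers stay quiet.
        warnings.warn(
            "The default value of `sort` will change in a future release; "
            "pass `sort=True` or `sort=False` explicitly to silence this warning.",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        sort = True  # keep the current behavior until the default flips
    return sorted(values) if sort else list(values)
```

Callers who pass the keyword explicitly never see the warning, so they are unaffected when the default eventually changes.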
Thanks for the review and discussion @rjzamora!
Thanks for the info @mroeschke.

> In the initial DataFrame, the unique categories are (string) ordered, but for groupby("cat_id", sort=False), the resulting categories need to be re-ordered based on how...
I should add, I'm very much in favor of having `from_delayed` take arrays or lists of delayeds.
Great, looking forward to seeing a PR @szwiep!
I feel like `from_array` does *most* of this, where instead of implementing a callable that produces a chunk, you implement a `__getitem__`. The major difference is that `from_map` is a...
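To make the contrast concrete, here is a rough sketch in plain Python of the two interface styles: a source that hands out chunks through `__getitem__`, versus a callable mapped over chunk descriptions. It deliberately does not use Dask, and `SquaresSource`, `squares_chunk`, and `chunk_bounds` are hypothetical names, not part of any library.

```python
class SquaresSource:
    """Hypothetical lazy source: a chunk is requested by slicing."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, key):
        # Only slices are handled in this sketch; values are computed on demand.
        start, stop, step = key.indices(self.n)
        return [i * i for i in range(start, stop, step)]


def squares_chunk(bounds):
    """Hypothetical chunk producer: one call yields one chunk."""
    start, stop = bounds
    return [i * i for i in range(start, stop)]


# The same three chunks, described as (start, stop) pairs.
chunk_bounds = [(0, 3), (3, 6), (6, 8)]

source = SquaresSource(8)
via_getitem = [source[start:stop] for start, stop in chunk_bounds]
via_callable = [squares_chunk(bounds) for bounds in chunk_bounds]
```

Both lists hold the same chunks. What differs is who owns the chunking logic: the slicing protocol in the first case, the list of inputs in the second.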
> It also doesn't benefit from HLGs (unless I'm missing something).

Right, though `from_array` does now use HLGs, cf. #7417. What I'm proposing is (I think) pretty similar to `from_map`,...