chris-b1 comments

Results: 24 comments of chris-b1

Separate pd2.NaT for datetime vs timedelta

I believe the API in arrow/pandas2 is currently pushing in the opposite direction, using a unified NA scalar (e.g. below). However, it will probably be easier than it sounds because...

Separate pd2.NaT for datetime vs timedelta

Yeah, the vision has evolved over time, but my current (possibly incorrect) understanding is:
- `arrow` - base, Python-agnostic C++ layer, core memory layout & algos - https://github.com/apache/arrow/tree/master/cpp
-...

"Predicate pushdown" in group-bys

On the mailing list, you mentioned the idea of an "expression VM". This feels like the kind of thing that would be nicely handled by that? Just making up an...

supported dtypes

Under possible - date with no time type (edit: on second thought, is this anything more than a `Period[D]`?)

Make NA/null a first-class citizen in groupby operations

It's linked in the top issue, but just for visibility, https://github.com/pydata/pandas/pull/12607 is a WIP PR that would add the `dropna` keyword arg to `groupby`.
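As a rough illustration of what such a keyword looks like in use, here is a minimal sketch. It assumes a pandas version where `groupby` accepts `dropna` (1.1 or later, as far as I know). The column names and data are made up for the example and are not taken from the linked PR.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row has a missing group key.
df = pd.DataFrame({"key": ["a", "b", np.nan, "a"], "value": [1, 2, 3, 4]})

# Default behaviour: rows whose key is NA are silently dropped.
dropped = df.groupby("key")["value"].sum()

# Opting in: NA keys form their own group.
kept = df.groupby("key", dropna=False)["value"].sum()

print(dropped)
print(kept)
```

With `dropna=False` the row with the missing key contributes to the result instead of disappearing, so the grouped totals add up to the column total.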

More careful management of hash table allocations

xref https://github.com/pydata/pandas/issues/14273 from @ssanderson - this particular case could still be improved in pandas 1.0, but it is a good example of where hash table size can be problematic.

lazy array attributes

API question - what does it look like to opt-in to one of these checks? As a specific example, I've used this "optimization" a few times to speed up merges...

[windows] slow paste performance

I know very little about FFI in JS, but this could potentially use an approach similar to the (BSD-licensed) `pyperclip`: https://github.com/asweigart/pyperclip/blob/f02774f8ff8c9e5ae7d2e498053fe224d9ba0f74/pyperclip/windows.py#L25

Implement copy

Thanks, my motivating example is storing an `AcceleratedArray` in a DataFrame. For that, you don't necessarily need the underlying mutability, but do need a `copy` to prevent the index for...

MultiIndex with NaNs reported as non sorted

Copying my comment from #19771: there is a fundamental problem of a leaky internal detail - default sort puts missing data in the last position, but the integer coding backing...
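The comment is cut off, so the following is only a sketch of the kind of mismatch it seems to describe. It assumes that missing values are coded with a `-1` sentinel (as `pandas.factorize` does). The helper `is_missing` and the sample data are hypothetical, and the code uses plain Python rather than pandas internals.

```python
import math

values = [2.0, float("nan"), 1.0]

def is_missing(x):
    """Hypothetical helper: treat float NaN as missing."""
    return isinstance(x, float) and math.isnan(x)

# Value-level ordering with missing data placed last.
order_by_value = sorted(
    range(len(values)),
    key=lambda i: (is_missing(values[i]), 0.0 if is_missing(values[i]) else values[i]),
)
sorted_values = [values[i] for i in order_by_value]

# Integer coding: position in the sorted uniques, -1 for missing (assumed sentinel).
uniques = sorted(v for v in values if not is_missing(v))
codes = [-1 if is_missing(v) else uniques.index(v) for v in values]

# Ordering by the codes alone puts the missing entry first, not last.
order_by_code = sorted(range(len(values)), key=lambda i: codes[i])

print(order_by_value)  # positions when sorting the values, missing last
print(order_by_code)   # positions when sorting the integer codes
```

Under these assumptions the two orderings disagree, so a "sortedness" check done on the codes can contradict what a value-level sort produces.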