Wes McKinney comments

Results: 203 comments of Wes McKinney

DEV: C++ exceptions vs. error codes

Yeah, having checked/unchecked functions is probably the easiest thing. You can use C++ exceptions for errors that you want to propagate to the user, but not for routine internal "failures"...

high performance JSON & NDJson parser

We have developed a parallel line-delimited JSON reader in Apache Arrow, see https://github.com/apache/arrow/blob/master/python/pyarrow/_json.pyx

First class array/list type

"First class" here means "not implemented using Python lists". You can interpret any array of type `T` as `Array[T]` by adding an array of offsets that encode size and position....
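The offsets idea in the comment above can be sketched in a few lines. This is only an illustration using the standard-library `array` module, not pandas internals; the names `values`, `offsets`, `list_at`, and `list_length` are hypothetical. A flat buffer of type `T` holds all elements back to back, and an offsets array with one more entry than there are lists marks where each list starts and stops.

```python
from array import array

# Flat, typed buffer holding the elements of every list back to back.
values = array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# offsets[i] and offsets[i + 1] delimit list i inside `values`.
# Three lists here: [1.0, 2.0], [] (empty), and [3.0, 4.0, 5.0, 6.0].
offsets = array("q", [0, 2, 2, 6])


def list_length(offsets, i):
    """Size of list i, read directly from the offsets."""
    return offsets[i + 1] - offsets[i]


def list_at(values, offsets, i):
    """Slice list i out of the flat values buffer."""
    return values[offsets[i]:offsets[i + 1]]


num_lists = len(offsets) - 1
lists = [list_at(values, offsets, i).tolist() for i in range(num_lists)]
lengths = [list_length(offsets, i) for i in range(num_lists)]
```

Because both buffers are contiguous and typed, no per-element Python objects are needed until a value is actually read out.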

"Predicate pushdown" in group-bys

Yeah, the idea behind an "expression VM" is similar to the design of APL interpreters. This is a bigger topic than this issue, but normal pandas operations would be implemented...

require numexpr / numba

I'm -0 on this because these tools are NumPy-centric and do not have good support for non-numeric data.

Aligning Series.index with DataFrame.index in broadcasting operations

I honestly might even go so far as disabling implicit broadcasting in favor of `df.add(series, axis='index')`. Perennial source of problems.
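To make the difference concrete, here is a small sketch with made-up data, assuming a reasonably recent pandas. With the implicit operator, the Series index is matched against the DataFrame's columns; the explicit `df.add(series, axis='index')` form matches it against the row index instead.

```python
import pandas as pd

# Illustrative data only: the row labels of `df` match the index of `series`.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]}, index=["x", "y", "z"])
series = pd.Series([100, 200, 300], index=["x", "y", "z"])

# Implicit broadcasting aligns series.index with df.columns. No labels
# match here, so the result is entirely NaN.
implicit = df + series

# Explicit form: align series.index with df.index, adding row-wise.
explicit = df.add(series, axis="index")
```

In this example `implicit` contains only missing values, while `explicit` adds 100, 200, and 300 to the rows labelled `x`, `y`, and `z`.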

@datnamer either way, pandas needs to have its own metadata implementation (see the logical/physical decoupling discussion in https://pydata.github.io/pandas-design/internal-architecture.html#logical-types-and-physical-storage-decoupling). We do not want to delegate metadata details to a third party...

supported dtypes

Dtype strict mode

Make NA/null a first-class citizen in groupby operations

Aggregation identity on entirely missing data