Joris Van den Bossche

844 comments by Joris Van den Bossche

Just ran into a quite confusing issue debugging one of our tests:

```
In [23]: import pandas._testing as tm
In [32]: index = tm.makeUnicodeIndex(3)
In [33]: index
Out[33]: Index(['3נפ583עדני', 'צ2מןץעס730',...
```

To be fair, it's also from our side that the feedback is limited / unclear. We should maybe update https://github.com/pandas-dev/pandas/issues/46843 to be clearer about what the issue actually proposes the...

> is that the preferred longer term fix and this a targeted PR for backport?

Changing that behaviour is indeed the preferred long-term fix, and this PR restores the...

@Th3nn3ss thanks a lot for that google doc! That's a useful overview, and I left a few comments. In general, I think it's certainly useful to start adding tests for...

> [@rhshadrach] I believe this issue is meant only for methods that take a UDF (e.g. agg, apply, transform), is that correct? For other methods (e.g. sum, mean, fillna) I...

For future reference, the specific example of `sample()` with an empty dataframe has been tackled in https://github.com/pandas-dev/pandas/pull/48484

> For a UDF on the other hand, we do not know the...

Not strictly specific to this PR (it's already existing behaviour), but I noticed this while looking at the changes here: the resulting `out` has the same type as the input `values` (both...

(the same also applies to `prod` https://github.com/pandas-dev/pandas/pull/48027)

Ah, yes, it's indeed a behaviour change that it can now overflow (because we no longer cast to float before running the algorithm). We currently try to cast back, and that can...

Yes, it is certainly true that because of the grouping, you are less likely to run into overflow. Although with sufficiently large data / few large groups, I think in practice...
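
To make the overflow trade-off in the last two comments concrete, here is a minimal sketch using only the Python standard library. It is not the pandas implementation: `wrapping_int64_sum` and `float64_sum` are hypothetical helpers that mimic a fixed-width int64 accumulator and a float64 accumulator respectively, and the group labels are made up for illustration.

```python
import ctypes


def wrapping_int64_sum(values):
    """Sum integers as a fixed-width int64 accumulator would (hypothetical helper)."""
    total = 0
    for v in values:
        # c_int64 keeps only the low 64 bits, so the running total wraps silently.
        total = ctypes.c_int64(total + v).value
    return total


def float64_sum(values):
    """Sum after casting each value to float64 (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += float(v)
    return total


big = [2**62, 2**62]

# Staying in int64: the true sum 2**63 does not fit, so it wraps around.
int_result = wrapping_int64_sum(big)

# Casting to float first: no wrap-around, the result stays positive.
float_result = float64_sum(big)

# Splitting the same values over two (made-up) groups keeps each sum in range.
groups = {"a": [2**62], "b": [2**62]}
per_group = {key: wrapping_int64_sum(vals) for key, vals in groups.items()}

# The cost of the float route: integers above 2**53 are no longer exact.
exact = wrapping_int64_sum([2**53, 1])
rounded = float64_sum([2**53, 1])

print(int_result, float_result, per_group, exact, rounded)
```

The sketch shows both sides of the trade-off: keeping the integer dtype preserves exactness but can wrap around once a single group's total exceeds the int64 range, while casting to float avoids the wrap-around at the price of precision for large integers.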