Niels Bantilan comments

Results: 468 comments by Niels Bantilan

Pandas2 / pyarrow backend support

@aaravind100 the overall approach makes sense! Thanks for taking the initiative on this.

```python
@Engine.register_dtype(equivalents=["int", pd.ArrowDtype(pyarrow.int64())])
```

Let's avoid overloading "int" here since it's already taken by the numpy int...

Pandas2 / pyarrow backend support

@mattharrison you'll be pleased to learn that #1628 has been merged :) The 0.20.0 release will have these changes. I'll probably cut a beta release in the next week or...

Pandera import fails due to DatetimeAccessor issue in Python 3.11.9

Since https://github.com/dask/dask/pull/11035 has been merged, is this still an issue, @CharlesDc9?

feat: add `pandera.io.to_pyarrow_schema`

This will need the merge conflicts resolved and probably a rebase onto the current `main` branch. @the-matt-morris, not sure if you want to pick this up again. I do think...

feat: add `pandera.io.to_pyarrow_schema`

Yeah, I tried this out and I think the approach in this PR (i.e. a dedicated `pandera schema -> pyarrow schema` translation layer) is the way to go. This is because...

feat: add `pandera.io.to_pyarrow_schema`

The mapping approach is faster and simpler (it's O(1) since it's a lookup table). This would probably work for most of the simple types. For things like lists and...
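To make the lookup-table idea concrete, here is a minimal sketch. It is not pandera's implementation: `SIMPLE_TYPE_MAP` and `to_arrow_type_name` are hypothetical names, and the table maps plain dtype-name strings to the names of pyarrow type factories rather than to real pyarrow objects, so it runs without any dependencies.

```python
# Hypothetical lookup table: dtype names on the left, names of pyarrow type
# factories on the right. These are illustrative strings only, not objects
# taken from pandera or pyarrow.
SIMPLE_TYPE_MAP = {
    "int64": "int64",
    "float64": "float64",
    "bool": "bool_",
    "str": "string",
}


def to_arrow_type_name(dtype_name: str) -> str:
    """Translate a simple dtype name with a single dictionary lookup."""
    try:
        # A dict lookup takes constant time on average.
        return SIMPLE_TYPE_MAP[dtype_name]
    except KeyError:
        # Parametrized or nested types do not fit a flat table, so they
        # would need dedicated translation logic instead.
        raise NotImplementedError(
            f"no direct mapping for {dtype_name!r}"
        ) from None
```

Simple names resolve in one step, while anything outside the table fails loudly instead of being guessed, which is where a dedicated translation layer would have to take over.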

How to Avoid Pandera Doc Injection?

Thanks for bringing this up @kernelpernel, would it be possible to provide some screenshots and a minimal reproducible example? I don't really understand what you mean by docs being injected.

How to Avoid Pandera Doc Injection?

It's probably because of the `__new__` method: https://github.com/unionai-oss/pandera/blob/main/pandera/api/dataframe/model.py#L127-L132 Can you try overriding that method and see if it still happens?
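For reference, overriding `__new__` in a subclass could look roughly like the sketch below. `BaseModel` and `MyModel` are hypothetical stand-ins rather than pandera classes, and the sketch assumes the behaviour in question is tied to what the base class's `__new__` carries, such as its docstring.

```python
class BaseModel:
    """Hypothetical stand-in for a base class that defines __new__."""

    def __new__(cls, *args, **kwargs):
        """Docstring defined on the base class."""
        return super().__new__(cls)


class MyModel(BaseModel):
    """Subclass that replaces the inherited __new__ with its own."""

    def __new__(cls, *args, **kwargs):
        """Docstring defined on the subclass."""
        # Delegate to the parent so object construction is unchanged;
        # only the method object (and its docstring) is replaced.
        return super().__new__(cls, *args, **kwargs)
```

With the override in place, tools that inspect `MyModel.__new__` see the subclass's method and docstring instead of the inherited one, which makes it easy to check whether the base method is the source of the behaviour.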

How to Avoid Pandera Doc Injection?

@kernelpernel any updates on this issue?

Support Series generation with serial dependence

Hi @NowanIlfideme, Pandera strategies are currently quite limited, as you've experienced. The limitation mostly comes from the fact that they leverage the hypothesis `data_frames` API: https://hypothesis.readthedocs.io/en/latest/numpy.html#hypothesis.extra.pandas.data_frames. Basically, you...