pandera
Replace instances of pa.PandasDtype enum with python, numpy, pandas types
Documentation problem
The current documentation demonstrates pandera usage with the pa.PandasDtype enum, which can look unfamiliar to newcomers, especially since pandera now supports Python types and numpy scalar types. For example, see:
- documentation example: https://pandera.readthedocs.io/en/stable/#quick-start
- docstrings example: https://github.com/pandera-dev/pandera/blob/master/pandera/schemas.py#L106-L113
Suggested fix for documentation
We should change the documentation and docstrings to reflect ease of use and encourage adoption by replacing instances of the pa.PandasDtype enum, in this order of priority:
- python types, e.g. int, str, float, etc.
- numpy types, e.g. np.int64, etc.
- pandas types, e.g. pd.StringDtype()
Can I take this?
- https://github.com/pandera-dev/pandera/blob/master/pandera/schemas.py#L110-L115
Everything should now be updated in the PR. I did not edit the following pa dtypes, as I wasn't sure what the best replacement would be. Please let me know if these should also be changed.
"pa.DateTime" https://github.com/pandera-dev/pandera/blob/master/pandera/schemas.py#L114 & https://github.com/pandera-dev/pandera/blob/master/docs/source/lazy_validation.rst#L64
"pa.Category" https://github.com/pandera-dev/pandera/blob/master/docs/source/dataframe_schemas.rst#L624
Cool, thanks! We can leave pa.DateTime and pa.Category as-is for now.
I think this is closed by #581 (cf. #496).