Replace instances of pa.PandasDtype enum with python, numpy, pandas types

Open cosmicBboy opened this issue 4 years ago • 4 comments

Documentation problem

The current documentation demonstrates pandera usage with the pa.PandasDtype enum, which can look unfamiliar to newcomers, especially now that pandera supports Python types and NumPy scalar types. For example, see:

  • documentation example: https://pandera.readthedocs.io/en/stable/#quick-start
  • docstrings example: https://github.com/pandera-dev/pandera/blob/master/pandera/schemas.py#L106-L113

Suggested fix for documentation

We should change the documentation and docstrings to reflect ease of use and adoption by replacing instances of the pa.PandasDtype enum, in this order of priority:

  • python types, e.g. int, str, float, etc
  • numpy types, e.g. np.int64, etc
  • pandas types, e.g. pd.StringDtype()

cosmicBboy avatar Oct 28 '20 01:10 cosmicBboy

Can I take this?

benkeesey avatar May 16 '21 17:05 benkeesey

  • https://github.com/pandera-dev/pandera/blob/master/pandera/schemas.py#L110-L115

cosmicBboy avatar May 16 '21 18:05 cosmicBboy

Everything should now be updated in the PR. I did not edit these pa.DataDtypes, as I wasn't sure what the best replacement would be. Please let me know if these should also be changed.

"pa.DateTime" https://github.com/pandera-dev/pandera/blob/master/pandera/schemas.py#L114 & https://github.com/pandera-dev/pandera/blob/master/docs/source/lazy_validation.rst#L64

"pa.Category" https://github.com/pandera-dev/pandera/blob/master/docs/source/dataframe_schemas.rst#L624

benkeesey avatar May 16 '21 19:05 benkeesey

cool, thanks! we can leave pa.DateTime and pa.Category as-is for now.

cosmicBboy avatar May 16 '21 19:05 cosmicBboy

I think this is closed by #581 (cf. #496).

nathanjmcdougall avatar Aug 07 '23 03:08 nathanjmcdougall