[ENH] Incorporate typeguard as a type checker into pyjanitor

Open szuckerman opened this issue 6 years ago • 6 comments

I was always bothered by the fact that even though we add type hints to the function arguments, we still need to use that check function to validate the data.

I created a new package called annotation_validation that automatically validates input and output data based on the type hints.

Best of all, it's just a decorator that gets added to the functions.

It's essentially a fork of this blog post but I've added type checking for return values and Union types.

There's still more to add, but I think we could start applying it to new functions and removing the check function there (having both leaves more room for error).
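
For readers unfamiliar with the pattern, here is a minimal sketch of a decorator that checks arguments and return values against type hints, including Union. It is not the annotation_validation implementation: the names validate_annotations, _matches, and scale are made up for illustration, and the sketch only understands plain classes and Union (anything else, such as List[int], passes unchecked).

```python
import functools
import inspect
from typing import Union, get_args, get_origin, get_type_hints


def _matches(value, hint):
    """Return True if value satisfies hint (plain classes and Union only)."""
    if get_origin(hint) is Union:
        # A Union matches if any of its member types matches.
        return any(_matches(value, member) for member in get_args(hint))
    if isinstance(hint, type):
        return isinstance(value, hint)
    return True  # unsupported hints are not checked in this sketch


def validate_annotations(func):
    """Hypothetical decorator: check inputs and output against type hints."""
    hints = get_type_hints(func)          # resolved once, at decoration time
    signature = inspect.signature(func)
    variadic = {inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            if signature.parameters[name].kind in variadic:
                continue  # *args / **kwargs are skipped in this sketch
            if name in hints and not _matches(value, hints[name]):
                raise TypeError(
                    f'argument "{name}" must be {hints[name]}; '
                    f"got {type(value).__name__}"
                )
        result = func(*args, **kwargs)
        if "return" in hints and not _matches(result, hints["return"]):
            raise TypeError(
                f'return value must be {hints["return"]}; '
                f"got {type(result).__name__}"
            )
        return result

    return wrapper


@validate_annotations
def scale(value: Union[int, float], factor: int = 2) -> float:
    """Multiply a number by an integer factor."""
    return float(value * factor)
```

Calling scale(1.5) passes both checks, while scale("a") raises a TypeError before the function body runs.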

szuckerman avatar Feb 04 '19 19:02 szuckerman

Nice stuff, @szuckerman! That'd be great. Do you have a timeline for release and re-integration into pyjanitor?

ericmjl avatar Feb 05 '19 14:02 ericmjl

I was running some benchmarks after I posted this yesterday, and there appears to be a bit too much overhead with how I'm comparing arguments and their types. I have some ideas for caching that will reduce the load.

In any event, I would like to try to integrate this in the next few weeks.
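
The caching idea is not spelled out above, so the following is only one guess at what it could look like: resolving a function's hints and signature once and reusing them on later calls. The helper names cached_hints and cached_signature are hypothetical, and functools.lru_cache is an assumption rather than what the package actually uses.

```python
import functools
import inspect
from typing import get_type_hints


@functools.lru_cache(maxsize=None)
def cached_hints(func):
    """Resolve type hints once per function; later calls hit the cache."""
    # Note: the returned dict is shared between callers, so treat it as read-only.
    return get_type_hints(func)


@functools.lru_cache(maxsize=None)
def cached_signature(func):
    """Build the inspect.Signature once per function."""
    return inspect.signature(func)


def add(a: int, b: int) -> int:
    return a + b


cached_hints(add)   # first call: computes and stores the hints
cached_hints(add)   # second call: served from the cache
cache_stats = cached_hints.cache_info()
```

Functions are hashable, so they can be used directly as cache keys; cache_stats.hits shows how many lookups were served without recomputing.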

szuckerman avatar Feb 05 '19 15:02 szuckerman

This is fantastic. I love type hinting & in some cases, I'm definitely interested in guaranteeing adherence to them in my other projects, as well!

zbarry avatar Feb 12 '19 18:02 zbarry

So, I was doing some work to fix up my package (issues with Python 3.7) and found that someone already made this package anyway 🤷‍♂

https://github.com/agronholm/typeguard

szuckerman avatar May 24 '19 18:05 szuckerman

Haha, my dad once told me, if we have a good idea, someone else probably has already implemented it.

The comfort I took from that statement is that "it's a good idea!" :smile:

I guess I can change this issue title to "incorporate typeguard to perform type checking"? It's worth a test: the type checking we have here provides informative error messages. I wonder if we can provide those error messages with typeguard as well?

ericmjl avatar May 24 '19 18:05 ericmjl

The error messages from typeguard look like this:

For a function whose argument first should be a list but is passed a tuple: TypeError: type of argument "first" must be list; got tuple instead

For return values: TypeError: type of the return value must be int; got str instead (this won't matter much, since everything returns a DataFrame)
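
A small sketch of how such a message can be triggered with typeguard's typechecked decorator. The function first_item and the helper try_call are made up for illustration. The exact wording and exception class depend on the typeguard version: the messages quoted above are TypeErrors, while newer releases, as far as I know, raise typeguard's own TypeCheckError, so the sketch catches Exception broadly. The ImportError fallback is only there so the snippet still runs where typeguard is not installed.

```python
try:
    from typeguard import typechecked
except ImportError:
    # Assumption: typeguard may be missing; a no-op stand-in keeps the sketch runnable.
    def typechecked(func):
        return func


@typechecked
def first_item(first: list) -> int:
    """Hypothetical function: return the first element of a list of ints."""
    return first[0]


def try_call(value):
    """Call first_item and return the error text, or None if no check fired."""
    try:
        first_item(value)
    except Exception as exc:  # exception class differs between typeguard versions
        return f"{type(exc).__name__}: {exc}"
    return None


message = try_call((1, 2))  # a tuple where a list is expected
```

With typeguard installed, message should hold the type-check error for the argument first; with the no-op fallback it stays None.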

szuckerman avatar May 24 '19 18:05 szuckerman