Optional columns in the dataframe

Open josedaniel-escribano-clarity opened this issue 1 year ago • 4 comments

Hello! I'm testing this library (it looks promising), but I found that validation fails if I want an optional column in the dataframe.

Imagine I define a schema with two columns, A and B. I want to validate one dataset that contains both columns, and another that contains only column A.

Right now, even if I mark the column as optional, pandantic expects it to exist in the dataframe. If the column is missing, it raises an error.

Do you plan to implement something like this in the near future? And what about the opposite: raising an error if the dataframe contains columns that are not defined in the schema?

Thanks!
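For illustration, the two checks asked about above (tolerating a missing optional column, and optionally rejecting undeclared columns) can be sketched in plain Python. This is not pandantic's API: `Schema`, `is_optional`, `check_columns`, and the `allow_extra` flag are hypothetical names, and the sketch only compares column names without validating any values.

```python
from typing import Optional, get_args, get_type_hints


class Schema:
    """Hypothetical schema: A is required, B is optional."""
    A: int
    B: Optional[str]


def is_optional(annotation) -> bool:
    # Optional[X] is Union[X, None], so NoneType appears among its type arguments.
    return type(None) in get_args(annotation)


def check_columns(schema, columns, allow_extra=True):
    """Return (missing required columns, unexpected columns)."""
    hints = get_type_hints(schema)
    present = set(columns)
    # A column is only "missing" if it is declared and not Optional.
    missing = [
        name for name, annotation in hints.items()
        if name not in present and not is_optional(annotation)
    ]
    # Undeclared columns are reported only when extras are not allowed.
    extra = [] if allow_extra else [c for c in columns if c not in hints]
    return missing, extra


# With a dataframe, the column names would come from list(df.columns).
only_a = check_columns(Schema, ["A"])
with_extra = check_columns(Schema, ["A", "C"], allow_extra=False)
```

Here `only_a` reports nothing missing because B is optional, while `with_extra` flags the undeclared column C.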

Hi @josedaniel-escribano-clarity ,

Thank you for your interest. Did you see this in the README? That should work; you only need a different import to specify the Optional type.

Cheers,

Wessel

wesselhuising avatar Nov 30 '23 08:11 wesselhuising

I think I tried with Optional and still had the columns filled with Nones :(

I wrote some working tests around using the pandantic.Optional type in your schema, which you can find here. Do you have an example of the pandantic.Optional type not behaving as expected?

wesselhuising avatar Nov 30 '23 09:11 wesselhuising

I think it only works for numeric types. I modified the check to if isinstance(x, float) and math.isnan(x): return None and got the expected results for my Optional[str] column.

rdlizenby avatar Jul 03 '24 00:07 rdlizenby
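As a standalone illustration of the check quoted in the comment above: pandas stores missing values as float NaN even in string columns, and calling math.isnan on a string raises TypeError, which is why the isinstance guard is needed. The helper name `nan_to_none` and the sample row are assumptions made for this sketch; where the check sits inside pandantic is not shown in this thread.

```python
import math
from typing import Any


def nan_to_none(x: Any) -> Any:
    """Map a float NaN to None and pass every other value through unchanged."""
    # Guard first: math.isnan("abc") would raise TypeError for string cells.
    if isinstance(x, float) and math.isnan(x):
        return None
    return x


# A row as it might look after a missing string cell was read as NaN.
row = {"A": 1, "B": float("nan")}
cleaned = {key: nan_to_none(value) for key, value in row.items()}
```

After this step, `cleaned["B"]` is None, which an Optional[str] field can accept, while ordinary strings and numbers are left untouched.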

@josedaniel-escribano-clarity we are working on a major refactor and set of improvements!

@wesselhuising it seems this has been fixed, so I am going to close it for now. If it is still an issue, feel free to re-open and tackle it in your refactor of the core library.

xaviernogueira avatar Aug 27 '24 16:08 xaviernogueira