PandasSchema
column mismatch wrong exception raised
Lines 59 to 64 in schema.py: the code checks whether the file columns are a subset of self.get_column_names(), but the exception message prints the difference between columns and self.columns, so it always reports that all columns are different.
if set(columns).issubset(self.get_column_names()):
    columns_to_pair = [column for column in self.columns if column.name in columns]
else:
    raise PanSchArgumentError(
        'Columns {} passed in are not part of the schema'.format(set(columns).difference(self.columns))
    )
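The mismatch described above can be reproduced without the library. This is a minimal sketch, not the library's code: the Column class here is a hypothetical stand-in that only carries a name, and it assumes the real column objects do not compare equal to plain strings. Under that assumption, taking the set difference against the column objects removes nothing, while taking it against the column names leaves only the names that are really missing.

```python
class Column:
    """Hypothetical stand-in for a schema column; only the name matters here."""

    def __init__(self, name):
        self.name = name


# Hypothetical schema with three columns, plus the names derived from it
# (the role self.get_column_names() plays in the quoted snippet).
schema_columns = [Column("a"), Column("b"), Column("c")]
schema_names = [column.name for column in schema_columns]

# Column names passed in by the caller; only "x" is not part of the schema.
requested = ["a", "b", "x"]

# Same shape as the quoted message: strings are compared with Column objects,
# which never match, so every requested name is reported.
reported_as_quoted = set(requested).difference(schema_columns)

# Comparing names with names reports only the column that is really missing.
reported_by_name = set(requested).difference(schema_names)

print(sorted(reported_as_quoted))  # ['a', 'b', 'x']
print(sorted(reported_by_name))    # ['x']
```

If the assumption holds, building the message from set(columns).difference(self.get_column_names()) would be one way to make the error list only the offending columns.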
Probably related to the issue above. In my case, schema.validate(test_data) always returns "Invalid number of columns. The schema specifies 21, but the data frame has 22", even though test_data actually has 21 columns.
Hi, thanks for the report. In the interests of time could either of you provide a reproducible example please? Preferably pure Python code.
Probably related to the issue above. In my case, schema.validate(test_data) always returns "Invalid number of columns. The schema specifies 21, but the data frame has 22", even though test_data actually has 21 columns.
Yes, that is exactly the issue. I will write some sample code and post it later today.