
DataFrameSchema-level checks pass when the lambda returns a Series of booleans, or a single boolean with element_wise=True.

Open itaisir opened this issue 2 years ago • 2 comments

When I define a DataFrameSchema-level lambda that returns a Series of booleans, the DataFrame passes validation even if the returned Series is not all True.


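from datetime import datetime

import pandas as pd
import pandera as pa


def parse_date(text):
    # Try each supported format in turn; return None if none matches.
    for fmt in ('%Y-%m-%d', '%d.%m.%Y', '%d/%m/%Y'):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            pass
    return None

def validate_start_before_end(df):
    # Build one boolean per row: True only if End date is present and Start date <= End date.
    result = []
    counter = -1
    for row in df['Start date']:
        counter += 1
        if not pd.isnull(df['End date'][counter]) and parse_date(df['Start date'][counter]) <= parse_date(df['End date'][counter]):
            result.append(True)
        else:
            result.append(False)
    return pd.Series(result)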
schema = pa.DataFrameSchema(
    columns={
        'Start date': pa.Column(str, pa.Check(lambda s: UploadHeadCountListPerCompany.validate_dates(s), error="Wrong date format"), nullable=False),
        'End date': pa.Column(str, pa.Check(lambda s: UploadHeadCountListPerCompany.validate_dates(s), error="Wrong date format"), coerce=True, nullable=True),
    },
    # define checks at the DataFrameSchema-level
    checks=pa.Check(
        lambda df: UploadHeadCountListPerCompany.validate_start_before_end(df),
        error='Start Date should be before End Date',
        element_wise=False,
    ),
)



validated_df = self.schema.validate(reader)  # <-- This line should raise Exception

itaisir, Mar 16 '22 11:03

I set element_wise=True and changed the function, but it is also not working:

def validate_start_before_end(df):
    if not pd.isnull(df['End date']) and UploadHeadCountListPerCompany.parse_date(df['Start date']) <= UploadHeadCountListPerCompany.parse_date(df['End date']):
        return True
    else:
        return False

itaisir, Mar 16 '22 11:03
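
For reference, the row-wise variant above can be exercised outside of pandera to confirm that the function itself produces the expected booleans. The sketch below is only an illustration: it uses plain pandas `DataFrame.apply(..., axis=1)` as a stand-in for a row-wise check (that pandera calls the function the same way is an assumption, not something confirmed in this thread), and `row_start_before_end` and the toy data are hypothetical names and values. It also returns False instead of raising when a date cannot be parsed, since comparing `None` with a `datetime` would raise a `TypeError`.

```python
from datetime import datetime

import pandas as pd

DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y")


def parse_date(text):
    """Try each known format; return None if the value is missing or unparseable."""
    if not isinstance(text, str):
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            pass
    return None


def row_start_before_end(row) -> bool:
    """Row-wise rule: both dates must parse and start must not be after end."""
    start = parse_date(row["Start date"])
    end = parse_date(row["End date"])
    # Missing or unparseable dates count as a failure rather than raising.
    return start is not None and end is not None and start <= end


# Hypothetical toy data: one valid row, one reversed row, one missing end date.
toy = pd.DataFrame(
    {
        "Start date": ["2022-01-01", "01.03.2022", "2022-05-01"],
        "End date": ["2022-02-01", "01/02/2022", None],
    }
)

# Apply the rule to each row, yielding one boolean per row.
row_results = toy.apply(row_start_before_end, axis=1)
print(row_results.tolist())
```

If the booleans look right here but validation still passes, the problem is more likely in how the check is wired into the schema than in the function.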

Can you provide a minimal reproducible (copy-pasteable) example with toy data?

cosmicBboy, Mar 16 '22 13:03