Parameterized field names

Open kylejcaron opened this issue 5 months ago • 0 comments

Question about pandera

Is there a way to parameterize the field names? For example, if I'm making a schema to check whether data is a panel dataset, I'd like the entity_id column name to be parameterizable.

Here's an example of the dataframe model:

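import pandera as pa
from pandera.typing import DateTime, Series


class PanelSchema(pa.DataFrameModel):
    # "entity_col" is the field name I'd like to be able to parameterize
    entity_col: Series[str] = pa.Field(coerce=True, nullable=False)
    date: Series[DateTime] = pa.Field(coerce=True, nullable=False)

    class Config:
        # each (entity, date) pair should appear at most once
        unique = ["entity_col", "date"]
        strict = False
        metadata: dict = {}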

And here's how I'd like to use it, although I'm open to other patterns that accomplish the same thing:

PanelSchema(entity_col = 'customer_id').validate(data)

As an added question, would Pandera be open to a contrib module? I think an inheritable PanelSchema would be helpful for a lot of use cases. For example, multivariate time series, discrete-time survival analysis, and cohort datasets can all be framed as panel datasets.

kylejcaron • Aug 27 '24 21:08