serialization support for Polars and Spark

Open sergun opened this issue 9 months ago • 1 comments

Hi @cosmicBboy and all contributors to this great framework,

I was wondering if there are any plans to implement schema serialization for the Polars and Spark engines.

Thanks!

sergun avatar Feb 21 '25 13:02 sergun

there are now! 😀

just relabeled this as an enhancement.

For anyone in the community who wants to implement this, this has my blessing!

Basically they would need to:

  • implement spark_io.py and polars_io.py modules here
  • rewrite the call site that's currently in the generic DataFrameSchema type: https://github.com/unionai-oss/pandera/blob/main/pandera/api/dataframe/container.py#L1243-L1254
  • implement the to_json, to_yaml, from_json, from_yaml methods in the pandas/polars/pyspark-specific DataFrameSchema classes.
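
To make the shape of those three steps concrete, here is a rough, stdlib-only sketch. It does not use pandera's real classes: `GenericSchema`, `PolarsSchema`, `polars_io_serialize`, and `polars_io_deserialize` are hypothetical names standing in for the generic DataFrameSchema type, an engine-specific subclass, and the contents of a polars_io.py module. The actual implementation would need to handle checks, dtypes, and the rest of the schema.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class GenericSchema:
    """Hypothetical stand-in for the generic DataFrameSchema type."""

    columns: Dict[str, str] = field(default_factory=dict)  # column name -> dtype string
    name: Optional[str] = None

    # Step 2: the generic type no longer calls into a pandas-only io module.
    def to_json(self) -> str:
        raise NotImplementedError("to_json must be implemented per engine")

    @classmethod
    def from_json(cls, source: str) -> "GenericSchema":
        raise NotImplementedError("from_json must be implemented per engine")


# Step 1: these two functions play the role of a polars_io.py module (assumed layout).
def polars_io_serialize(schema: "PolarsSchema") -> Dict[str, Any]:
    """Turn a schema into a plain dict that json can dump."""
    return {
        "schema_type": "polars_dataframe",  # illustrative tag, not pandera's format
        "name": schema.name,
        "columns": dict(schema.columns),
    }


def polars_io_deserialize(data: Dict[str, Any]) -> "PolarsSchema":
    """Rebuild a schema from the dict produced by polars_io_serialize."""
    return PolarsSchema(columns=dict(data["columns"]), name=data.get("name"))


class PolarsSchema(GenericSchema):
    """Step 3: the engine-specific class owns the (de)serialization methods."""

    def to_json(self) -> str:
        return json.dumps(polars_io_serialize(self), sort_keys=True)

    @classmethod
    def from_json(cls, source: str) -> "PolarsSchema":
        return polars_io_deserialize(json.loads(source))


schema = PolarsSchema(columns={"id": "Int64", "price": "Float64"}, name="orders")
restored = PolarsSchema.from_json(schema.to_json())
```

The YAML methods would follow the same pattern, with each engine's io module deciding how its own dtypes and checks map to plain data.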

cosmicBboy avatar Feb 21 '25 14:02 cosmicBboy