
How to load a schema from a PySpark struct or Avro format from a schema registry?

Open pthalasta opened this issue 10 months ago • 2 comments

Question about pandera

How do I create a DataFrameSchema from an Avro schema? What are our options? When I try this, the resulting DataFrameSchema object has an empty columns field. Could this be added as a feature that pulls the schema from the most widely used registries?

pthalasta, Apr 24 '24 20:04

Hi @pthalasta, looking at the Avro schema docs, it looks like we'll need to write a translation layer from Avro to pandera, similar to the frictionless integration: https://pandera.readthedocs.io/en/stable/frictionless.html?highlight=frictionless#frictionless-data-schema
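For illustration only, here is a minimal sketch of what such a translation layer might look like. Nothing here is an existing pandera API for Avro: the helper names (`avro_fields_to_specs`, `to_pandera_schema`), the type mapping, and the example schema are all hypothetical. It assumes a flat Avro record with primitive field types, treats a union with `"null"` as a nullable column, and rejects everything else (nested records, arrays, maps, logical types).

```python
import json

# Assumed mapping from Avro primitive types to pandas dtype aliases.
# This is one plausible choice, not something pandera defines.
AVRO_TO_DTYPE = {
    "boolean": "bool",
    "int": "int32",
    "long": "int64",
    "float": "float32",
    "double": "float64",
    "string": "str",
}


def avro_fields_to_specs(avro_schema):
    """Map a flat Avro record schema to {column_name: (dtype_alias, nullable)}."""
    if avro_schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")
    specs = {}
    for field in avro_schema["fields"]:
        avro_type = field["type"]
        nullable = False
        if isinstance(avro_type, list):
            # An Avro union such as ["null", "string"] marks an optional field.
            non_null = [t for t in avro_type if t != "null"]
            nullable = len(non_null) < len(avro_type)
            if len(non_null) != 1:
                raise NotImplementedError(f"unsupported union in {field['name']!r}")
            avro_type = non_null[0]
        if not isinstance(avro_type, str) or avro_type not in AVRO_TO_DTYPE:
            raise NotImplementedError(f"unsupported Avro type in {field['name']!r}")
        specs[field["name"]] = (AVRO_TO_DTYPE[avro_type], nullable)
    return specs


def to_pandera_schema(avro_schema):
    """Build a pandera DataFrameSchema from the specs (requires pandera installed)."""
    import pandera as pa  # imported lazily so the mapping step has no dependency

    return pa.DataFrameSchema(
        {
            name: pa.Column(dtype, nullable=nullable)
            for name, (dtype, nullable) in avro_fields_to_specs(avro_schema).items()
        }
    )


# Hypothetical schema, as it might be returned as JSON by a schema registry.
EXAMPLE_AVRO = json.loads(
    """
    {
      "type": "record",
      "name": "User",
      "fields": [
        {"name": "id", "type": "long"},
        {"name": "email", "type": ["null", "string"]}
      ]
    }
    """
)

specs = avro_fields_to_specs(EXAMPLE_AVRO)
```

Calling `to_pandera_schema(EXAMPLE_AVRO)` would then give a schema with one `Column` per Avro field. A real integration would also need to decide how to represent nullable integer columns in pandas and how to fetch the schema from a registry, neither of which this sketch attempts.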

Feel free to change the label of this issue to enhancement and re-write the title as a feature request.

Happy to review a PR contribution from you or someone in the community!

cosmicBboy, May 05 '24 19:05

@cosmicBboy I'm not sure I can edit the label of the issue, but I can certainly change the description. Please let me know if that helps.

pthalasta, May 21 '24 17:05