pandera
How to load a schema from a PySpark struct or Avro format from a schema registry?
Question about pandera
How do I create the DataFrameSchema using the Avro schema? What are our options? When I use it, the DataFrameSchema object has an empty column field. Can this be added as a feature that pulls the schema from the most widely used registries?
Hi @pthalasta, looking at the Avro schema docs, it looks like we'll need to write a translation layer from Avro to pandera, similar to the frictionless integration: https://pandera.readthedocs.io/en/stable/frictionless.html?highlight=frictionless#frictionless-data-schema
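As a rough illustration of what such a translation layer might look like, here is a minimal sketch. It is not an existing pandera API: the names AVRO_TO_PANDAS_DTYPE, avro_fields_to_column_specs, to_dataframe_schema and EXAMPLE_AVRO are all hypothetical, the dtype mapping is an assumption, and only flat record schemas with primitive fields and simple ["null", T] unions are handled.

```python
import json

# Assumed mapping from Avro primitive types to pandas dtype strings.
AVRO_TO_PANDAS_DTYPE = {
    "boolean": "bool",
    "int": "int32",
    "long": "int64",
    "float": "float32",
    "double": "float64",
    "string": "str",
    "bytes": "object",
}


def avro_fields_to_column_specs(avro_schema):
    """Translate an Avro record schema (as a dict) into {name: (dtype, nullable)}."""
    if avro_schema.get("type") != "record":
        raise ValueError("only Avro record schemas are handled in this sketch")
    specs = {}
    for field in avro_schema["fields"]:
        avro_type = field["type"]
        nullable = False
        # A union such as ["null", "string"] marks an optional field.
        if isinstance(avro_type, list):
            non_null = [t for t in avro_type if t != "null"]
            nullable = len(non_null) < len(avro_type)
            if len(non_null) != 1:
                raise ValueError(f"unsupported union for field {field['name']!r}")
            avro_type = non_null[0]
        # Nested records, arrays, maps and logical types are out of scope here.
        if not isinstance(avro_type, str) or avro_type not in AVRO_TO_PANDAS_DTYPE:
            raise ValueError(f"unsupported Avro type for field {field['name']!r}")
        specs[field["name"]] = (AVRO_TO_PANDAS_DTYPE[avro_type], nullable)
    return specs


def to_dataframe_schema(avro_schema):
    """Build a pandera DataFrameSchema from the translated column specs."""
    import pandera as pa  # imported lazily so the translation step has no dependency

    specs = avro_fields_to_column_specs(avro_schema)
    return pa.DataFrameSchema(
        {name: pa.Column(dtype, nullable=nullable)
         for name, (dtype, nullable) in specs.items()}
    )


# Hypothetical Avro schema, e.g. the JSON string returned by a schema registry.
EXAMPLE_AVRO = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": ["null", "string"]},
    {"name": "score", "type": "double"}
  ]
}
""")
```

Calling to_dataframe_schema(EXAMPLE_AVRO) would then give a schema with one column per Avro field. One caveat with this mapping: pandas cannot hold missing values in plain integer columns, so nullable int or long fields would likely need a different dtype choice.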
Feel free to change the label of this issue to enhancement and rewrite the title as a feature request.
Happy to review a PR contribution from you or someone in the community!
@cosmicBboy I'm not sure I can edit the label of the issue, but I can certainly change the description. Please let me know if that helps.