parquet-go
Dynamically defined parquet schema
With the recent PR merge enabling the use of map[string]interface{}, the remaining challenge is the schema. Does anyone have recommendations on how to create a parquet schema with nested fields dynamically? I've seen implementations like dataframe-go that use github.com/ompluscator/dynamic-struct to dynamically create a struct, and I've also looked into dynamically building a JSON schema. Both seem rather cumbersome if there are more than a handful of data types and many or complex nested fields to handle. I was hoping for something simpler, like what's alluded to by stephane-moreau in
I am chiming in here. I am looking for a way to define a schema without using arrow. I ended up writing a dynamic CSV schema. It works until one hits nested types.
@sdressler I've written a package that allows you to generate an Arrow schema from an arbitrary map[string]interface{}, maybe it's helpful. github.com/loicalleyne/arrow_schemagen