parquet4s

Support for custom metadata?

Open vreuter opened this issue 2 years ago • 3 comments

Hello,

Is there any effort underway on (or interest in) allowing users to write additional file-level metadata, e.g. for provenance of how the contained data were generated? Something like what the library does for its own declaration? https://github.com/mjakubowski84/parquet4s/blob/35d2c11335a1af42e8b139918f5438ff0d167718/core/src/main/scala/com/github/mjakubowski84/parquet4s/ParquetWriter.scala#L49

Sorry if I missed this in docs or looking around the code, but I didn't see it available.

Thank you!

vreuter, Jan 26 '22 16:01

No, there's no effort planned. Feel free to propose a change; PRs are welcome.

FYI, besides that "signature", the metadata contains the schema name, which is either the class name or "parquet4s_schema" if the class name cannot be resolved: https://github.com/mjakubowski84/parquet4s/blob/master/core/src/main/scala/com/github/mjakubowski84/parquet4s/Schema.scala#L22
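
To make the request concrete, here is a rough sketch of how user-supplied entries could sit alongside the library's own "signature" entry. None of this is parquet4s API: `MetadataSketch`, `signature`, and `withCustomMetadata` are hypothetical names, and the signature key and value are only illustrative stand-ins. It shows just the merge step, assuming the resulting map would then be handed to the writer as the file's key-value metadata.

```scala
// Hypothetical sketch; these names are NOT part of parquet4s.
object MetadataSketch {

  // Stand-in for the library's own "signature" entry (key and value are illustrative).
  val signature: Map[String, String] =
    Map("MadeBy" -> "https://github.com/mjakubowski84/parquet4s")

  // Merge user-supplied entries with the signature.
  // Returns Left with a message if the user tries to overwrite a signature key,
  // otherwise Right with the combined map.
  def withCustomMetadata(custom: Map[String, String]): Either[String, Map[String, String]] = {
    val clashes = custom.keySet.intersect(signature.keySet)
    if (clashes.nonEmpty)
      Left(s"Reserved metadata keys: ${clashes.toSeq.sorted.mkString(", ")}")
    else
      Right(signature ++ custom)
  }
}
```

For example, `MetadataSketch.withCustomMetadata(Map("generated.by" -> "my-pipeline"))` yields a `Right` holding both the signature entry and the provenance entry, while passing a `"MadeBy"` key yields a `Left`. Rejecting clashes rather than silently overwriting is just one possible design choice.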

mjakubowski84, Jan 26 '22 16:01

@mjakubowski84 I haven't yet made time to look at what this would need, but I am interested. Will let you know if I start to work on it.

vreuter, Feb 07 '22 05:02

Great! Looking forward to your contribution!

mjakubowski84, Feb 07 '22 12:02