
feat: Analyse different strategies and add validation for missing fields in parquet data compared to protobuf schema

Open Meghajit opened this issue 3 years ago • 1 comment

Extra fields that are present in the parquet data but not in the protobuf schema will be ignored. However, it is possible that:

  • some fields in the protobuf schema are missing from the parquet data
  • a field name is the same but the data type is different

We need to answer, and then solve for, the following:

  1. Should the Parquet Data Source set default values for fields that are present in the schema but not found in the parquet file? If yes, what should the default value be?
  2. If no defaults are to be set, should the Dagger job fail?
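
To make the two options concrete, here is a minimal sketch of what such a validation could look like. It is not Dagger code: schemas are modelled as plain `dict`s mapping field name to a type-name string, and `MissingFieldPolicy`, `compare_schemas`, `resolve_missing` and `PROTO3_DEFAULTS` are hypothetical names introduced only for illustration. Using the proto3 zero values (0, empty string, false) as defaults is one possible answer to question 1, not a decision.

```python
from dataclasses import dataclass, field
from enum import Enum


class MissingFieldPolicy(Enum):
    """Hypothetical switch between the two strategies discussed above."""
    USE_DEFAULT = "use_default"
    FAIL = "fail"


# Assumption: fall back to proto3 zero values when a field is missing.
PROTO3_DEFAULTS = {"int32": 0, "int64": 0, "double": 0.0, "string": "", "bool": False}


@dataclass
class ValidationResult:
    missing: list = field(default_factory=list)          # in proto, not in parquet
    type_mismatches: list = field(default_factory=list)  # (name, proto type, parquet type)
    ignored_extras: list = field(default_factory=list)   # in parquet, not in proto


def compare_schemas(proto_fields, parquet_fields):
    """Compare two {field name: type name} mappings and report the differences."""
    result = ValidationResult()
    for name, proto_type in proto_fields.items():
        if name not in parquet_fields:
            result.missing.append(name)
        elif parquet_fields[name] != proto_type:
            result.type_mismatches.append((name, proto_type, parquet_fields[name]))
    result.ignored_extras = [n for n in parquet_fields if n not in proto_fields]
    return result


def resolve_missing(result, proto_fields, policy):
    """Return defaults for missing fields, or raise if the policy is FAIL."""
    if result.missing and policy is MissingFieldPolicy.FAIL:
        raise ValueError(f"fields missing in parquet data: {result.missing}")
    return {name: PROTO3_DEFAULTS.get(proto_fields[name]) for name in result.missing}


proto = {"id": "int64", "name": "string", "active": "bool"}
parquet = {"id": "int32", "name": "string", "extra": "double"}

report = compare_schemas(proto, parquet)
defaults = resolve_missing(report, proto, MissingFieldPolicy.USE_DEFAULT)
```

In this sketch a type mismatch is only reported, not resolved, since the issue leaves open how that case should be handled.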

Meghajit, Jan 20 '22 09:01

Removing this from the "Support for Parquet Files as a Source" milestone, as it is a nice-to-have for the first milestone. cc: @prakharmathur82

Meghajit, Jun 10 '22 05:06