waimak
waimak copied to clipboard
Auto-detect schema change a recreate tables for HadoopConnectors
Expected Behavior
Auto-detect if a schema has changed for a table and recreate table accordingly, rather than requiring forceRecreateTables
flag.
Actual Behavior
The forceRecreateTables
must be set to true to recreate the tables, which sometime it is not and causes schema validation issues.
Some initial work here: https://github.com/CoxAutomotiveDataSolutions/waimak/tree/feature/auto-detect-schema-changes
I think the approach is wrong, maybe we should store schema information in a metadata directory in the output path. Something like $outputDir/.table_schemas/$tableName
.
That way we won't have to infer schema information.