metadata-qa-api

Ignore fields from the schema that are not in the data

Open mielvds opened this issue 5 years ago • 3 comments

At the moment, the code fails if a CSV contains a column that is not in the schema. Better behaviour would be to ignore such columns and log a warning. If none of the columns match the schema, the result should be empty, and no exception should be thrown.

mielvds avatar Nov 02 '20 13:11 mielvds

Dear @mielvds, could you please write an example for this error? I am not able to reproduce it. I've created a new method which reads column names from the CSV header. See details in #58.

pkiraly avatar Nov 18 '20 15:11 pkiraly

TBH I opened this issue a bit too quickly. I'll see if I can reproduce it... but I think #58 does indeed solve it.

mielvds avatar Nov 18 '20 16:11 mielvds

@pkiraly I was able to reproduce this. The issue was that I set the CsvReader with the schema header like this:

this.calculator.setCsvReader(
    new CsvReader()
        .setHeader(((CsvAwareSchema) schema).getHeader()));

Imagine you have a schema that configures the fields A, B, C, but your CSV contains the columns A, B, C, D. Because the header is taken from the schema (three names) while each row has four cells, you'd get a java.lang.IllegalArgumentException: The size of columns are different than the size of headers when running calculator.measureAsList(strings).

mielvds avatar Dec 04 '20 17:12 mielvds
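
For illustration, the behaviour requested in the opening post could look roughly like the sketch below. It is not taken from metadata-qa-api: the class SchemaColumnFilter and its project method are hypothetical names, and the sketch uses only the Java standard library. It assumes the CSV's own header line is available, keeps the cells whose column name appears in the schema, and records a warning for every other column instead of throwing.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of metadata-qa-api.
final class SchemaColumnFilter {

    private SchemaColumnFilter() {
    }

    /**
     * Projects one CSV row onto the fields known to the schema.
     *
     * @param schemaFields field names configured in the schema (assumed input)
     * @param csvHeader    column names read from the CSV file itself
     * @param row          the cells of one record, in csvHeader order
     * @param warnings     receives one message per column that is skipped
     * @return the schema-known columns and their values, in CSV order;
     *         empty if no column matches the schema
     */
    static Map<String, String> project(List<String> schemaFields,
                                       List<String> csvHeader,
                                       List<String> row,
                                       List<String> warnings) {
        Map<String, String> result = new LinkedHashMap<>();
        // Stop at the shorter of header and row so a ragged row cannot overflow.
        int limit = Math.min(csvHeader.size(), row.size());
        for (int i = 0; i < limit; i++) {
            String column = csvHeader.get(i);
            if (schemaFields.contains(column)) {
                result.put(column, row.get(i));
            } else {
                warnings.add("Ignoring column not in schema: " + column);
            }
        }
        return result;
    }
}
```

With schema fields A, B, C and a CSV header A, B, C, D, the returned map holds A, B and C and the warnings list gets one entry for D. If no column matches, the map is simply empty.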