jackson-dataformats-text
Support parsing CSV with header regardless of unknown columns
When reading the following CSV with jackson-dataformat-csv 2.11.4
name,weight,age
Roger,69,27
Chris,89,53
using the following snippet:
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import java.util.List;

CsvMapper csvMapper = new CsvMapper();
CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true)
        .addColumn("name").addColumn("age").build();
List<Person> persons = csvMapper
        .readerFor(Person.class)
        .with(csvSchema)
        .<Person>readValues(csv)
        .readAll();
...
class Person {
    public String name;
    public int age;
}
a CsvMappingException is thrown (Too many entries: expected at most 2) because the column weight is not known to the CsvSchema.
csvMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
still leads to the same CsvMappingException.
Thus, please introduce a new CsvParser feature, e.g. IGNORE_UNKNOWN_COLUMNS (disabled by default), that allows reading CSV regardless of unknown columns.
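To make the request concrete, here is a rough sketch of the intended usage; IGNORE_UNKNOWN_COLUMNS is the proposed feature and does not exist yet, so the enable call below is purely hypothetical:

import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvParser;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import java.util.List;

CsvMapper csvMapper = new CsvMapper();
// Hypothetical feature from this request; it does not exist in 2.11.4.
csvMapper.enable(CsvParser.Feature.IGNORE_UNKNOWN_COLUMNS);

CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true)
        .addColumn("name").addColumn("age").build();

// Expected outcome: the extra "weight" header column is skipped instead of
// causing "Too many entries: expected at most 2".
List<Person> persons = csvMapper
        .readerFor(Person.class)
        .with(csvSchema)
        .<Person>readValues(csv)
        .readAll();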
Reorder the columns:
CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).setReorderColumns(true)
        .addColumn("name").addColumn("age").build();
or skip adding columns explicitly when using setUseHeader(true):
CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).build();
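A minimal sketch of how the second suggestion plays out with the sample CSV above, assuming FAIL_ON_UNKNOWN_PROPERTIES is disabled so that binding skips the extra weight column:

import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import java.util.List;

CsvMapper csvMapper = new CsvMapper();
// Without this, the unmapped "weight" column taken from the header would
// fail during binding to Person instead of during schema validation.
csvMapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);

// Header-only schema: column names and order are taken from the first line.
CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).build();

List<Person> persons = csvMapper
        .readerFor(Person.class)
        .with(csvSchema)
        .<Person>readValues(csv)
        .readAll();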
> Reorder the columns:
> CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).setReorderColumns(true).addColumn("name").addColumn("age").build();
But the use case expects the columns name and age in the given order and should fail otherwise. At the moment, explicitly declaring header columns and the column reordering feature are mutually exclusive due to this: https://github.com/FasterXML/jackson-dataformats-text/blob/810772312735f1fb89d6fa37dd70e150e9cc783b/csv/src/main/java/com/fasterxml/jackson/dataformat/csv/CsvParser.java#L787 which can be considered a bug.
> or skip adding columns explicitly when using setUseHeader(true):
> CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).build();
But then the FAIL_ON_MISSING_COLUMNS feature can no longer be used, and name and age are no longer required columns.
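For reference, a sketch of what is lost (the wiring below is my assumption of the intended strict setup): FAIL_ON_MISSING_COLUMNS checks data rows against the columns declared in the schema, and with a header-only schema those columns are whatever the file's header contains, so name and age in particular are no longer enforced.

import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvParser;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

CsvMapper csvMapper = new CsvMapper();
// Fails when a data row has fewer values than the schema declares...
csvMapper.enable(CsvParser.Feature.FAIL_ON_MISSING_COLUMNS);

// ...but with a header-only schema the declared columns are whatever the
// file's header line contains, so "name" and "age" are not required.
CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).build();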
The same issue was encountered with jackson-dataformat-csv 2.13.4 while trying to parse a CSV file (>100 columns) into a Java entity (10 attributes). I tried
ObjectReader csvReader = csvMapper
        .disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES)
        .readerFor(BlackList.class)
        .with(csvSchema);
But I found that the values in the unknown columns were parsed into the next column, which messed up the data in the DB. As @bjmi mentioned, IGNORE_UNKNOWN_COLUMNS will likely solve my problem.
> Thus, please introduce a new CsvParser feature, e.g. IGNORE_UNKNOWN_COLUMNS (disabled by default), that allows reading CSV regardless of unknown columns.
I can get it to work if, when reading, I use a schema with .withHeader() and .withColumnReordering().
FAIL_ON_UNKNOWN_PROPERTIES is disabled for me, but I didn't test whether it's necessary.
So in the end I am using two different schemas: one for writing, without column reordering, and one for reading, with column reordering.
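For anyone landing here later, a sketch of that two-schema setup as I understand it (the exact schema construction and the variable names are my assumptions; withHeader() and withColumnReordering() are the calls mentioned above):

import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import java.util.List;

CsvMapper csvMapper = new CsvMapper();
csvMapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);

// Writing: fixed column order derived from the POJO, no reordering.
CsvSchema writeSchema = csvMapper.schemaFor(Person.class).withHeader();

// Reading: header-driven, columns may appear in any order in the input.
CsvSchema readSchema = csvMapper.schemaFor(Person.class)
        .withHeader()
        .withColumnReordering(true);

String csv = csvMapper.writer(writeSchema).writeValueAsString(persons);
List<Person> readBack = csvMapper
        .readerFor(Person.class)
        .with(readSchema)
        .<Person>readValues(csv)
        .readAll();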