# Eliminate iterator allocations when writing CsvRecord
## Is your feature request related to a problem? Please describe.
I'm using FastCSV to split massive CSV files (~1 billion rows) into multiple files by grouping rows by key columns. When reading with `CsvRecord.getFields()` and writing with `CsvWriter.writeRecord(Iterable<String>)`, I observe ~1 TiB of iterator allocations (via JFR profiling).
The issue (sketched in code after this list):

- `CsvRecord.getFields()` wraps the internal array: `Collections.unmodifiableList(Arrays.asList(fields))`
- `CsvWriter.writeRecord(Iterable<String>)` creates an iterator for each row
- For 1 billion rows, this creates ~1 billion short-lived iterator objects
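For context, a minimal sketch of the hot path, assuming the FastCSV 3.x builder API; file names and the actual grouping logic are placeholders:

```java
import java.io.IOException;
import java.nio.file.Path;

import de.siegmar.fastcsv.reader.CsvReader;
import de.siegmar.fastcsv.reader.CsvRecord;
import de.siegmar.fastcsv.writer.CsvWriter;

public final class CopyRows {

    public static void main(final String[] args) throws IOException {
        try (CsvReader<CsvRecord> reader = CsvReader.builder().ofCsvRecord(Path.of("in.csv"));
             CsvWriter writer = CsvWriter.builder().build(Path.of("out.csv"))) {
            for (final CsvRecord record : reader) {
                // getFields() wraps the internal String[] in two wrapper lists,
                // and writeRecord(Iterable) pulls a fresh Iterator from the
                // result -- several short-lived objects on every row.
                writer.writeRecord(record.getFields());
            }
        }
    }
}
```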
## Describe the solution you'd like
Add a zero-allocation path by providing any one of these:

- `CsvWriter.writeRecord(CsvRecord record)` - directly access the internal fields array. With this we can also avoid the allocations that `Collections.unmodifiableList(Arrays.asList(fields))` creates, namely `Collections$UnmodifiableRandomAccessList` and `Arrays$ArrayList`
- `CsvRecord.getFieldsArray()` - expose the internal array for use with `writeRecord(String... values)`
- Smart iterator check - if the `Iterable` is a `List`, use indexed access instead of an iterator (see the sketch after this list)
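To illustrate the third option, here is a rough sketch of what the check inside `CsvWriter.writeRecord(Iterable<String>)` could look like; `writeField` and `endRecord` are hypothetical stand-ins for whatever internals FastCSV actually uses:

```java
// Hypothetical replacement body for CsvWriter.writeRecord(Iterable<String>);
// needs java.util.List and java.util.RandomAccess.
public CsvWriter writeRecord(final Iterable<String> values) throws IOException {
    if (values instanceof List && values instanceof RandomAccess) {
        // Indexed access: no Iterator is allocated for this row.
        final List<String> list = (List<String>) values;
        for (int i = 0, size = list.size(); i < size; i++) {
            writeField(list.get(i)); // hypothetical internal helper
        }
    } else {
        // General fallback: one short-lived Iterator per call, as today.
        for (final String value : values) {
            writeField(value);
        }
    }
    endRecord(); // hypothetical internal helper
    return this;
}
```

Since `Arrays.asList(...)` and its unmodifiable wrapper both implement `RandomAccess`, this branch would apply to the list returned by `getFields()`; it removes the per-row iterator, though not the wrapper lists themselves.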
## Describe alternatives you've considered
Current workarounds:

- `record.toArray(new String[0])` - reduces allocation by ~50% but still allocates arrays (see the sketch below)
- Indexed field-by-field writing with `CsvWriterRecord` - creates `CsvWriterRecord` objects instead
- Reflection to access `fields` - works but fragile and ugly
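For reference, the first workaround in code, assuming the varargs `writeRecord(String...)` overload and the reader/writer from the first sketch:

```java
for (final CsvRecord record : reader) {
    // Copying to a String[] avoids writeRecord(Iterable)'s per-row Iterator,
    // but still allocates a fresh array (plus getFields()'s wrappers) per row.
    writer.writeRecord(record.getFields().toArray(new String[0]));
}
```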