[C++] Stream API: Add support for repeated fields

Open asfimport opened this issue 6 years ago • 4 comments

The parquet::StreamReader and parquet::StreamWriter classes currently only support required fields.

Support must be added to this API in order for it to be usable when the schema has repeated fields.
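
To illustrate what repeated-field support involves, here is a minimal, self-contained sketch of how a row of list values could be flattened into Parquet's Dremel-style definition and repetition levels. It is not part of `parquet::StreamWriter`: the names `ShreddedColumn` and `Shred` are hypothetical, and the sketch assumes a single top-level column declared as `repeated int32 values`, for which both the maximum definition level and the maximum repetition level are 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed schema (Parquet schema notation):
//   message example {
//     repeated int32 values;
//   }

// Hypothetical container, not a parquet-cpp type: the flattened form of the
// single repeated column above.
struct ShreddedColumn {
  std::vector<int32_t> values;      // only the values that are present
  std::vector<int16_t> def_levels;  // 0 = empty list, 1 = value present
  std::vector<int16_t> rep_levels;  // 0 = starts a new row, 1 = continues it
};

// Flatten row-oriented lists into values plus definition/repetition levels.
ShreddedColumn Shred(const std::vector<std::vector<int32_t>>& rows) {
  ShreddedColumn col;
  for (const auto& row : rows) {
    if (row.empty()) {
      // An empty list still occupies one level slot but stores no value.
      col.def_levels.push_back(0);
      col.rep_levels.push_back(0);
      continue;
    }
    for (std::size_t i = 0; i < row.size(); ++i) {
      col.values.push_back(row[i]);
      col.def_levels.push_back(1);
      // The first element opens a new row; later ones repeat within it.
      col.rep_levels.push_back(i == 0 ? 0 : 1);
    }
  }
  return col;
}
```

For the rows `[1, 2]`, `[]`, `[3]`, this produces the values `1 2 3`, definition levels `1 1 0 1`, and repetition levels `0 1 0 0`. A stream API limited to required fields never has to produce these levels, which is why repeated fields need dedicated support.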

Reporter: Gawain BOLTON / @gawain-bolton

Assignee: Gawain BOLTON / @gawain-bolton

Note: This issue was originally created as PARQUET-1700. Please see the migration documentation for further details.

asfimport, Nov 23 '19 21:11

Wes McKinney / @wesm: It might make sense to think about the API for groups (structs) while also thinking about repeated fields.

asfimport, Nov 24 '19 00:11

Gawain BOLTON / @gawain-bolton: I thought Parquet only supported basic data types?

I will change this ticket to be for repeated fields and create a separate Jira ticket for optional fields as I will have something ready soon to handle these.

asfimport, Nov 24 '19 15:11

Wes McKinney / @wesm: Parquet indeed supports both structs and arrays/lists.

asfimport, Nov 24 '19 20:11

Micah Kornfield / @emkornfield: It might be nice to have something reusable between Arrow and Parquet for structs (I think at some point we want to add a similar row-level iterator API for Arrow Tables/RowBatches).

asfimport, Nov 26 '19 05:11