Support Int8 and Int16 as basic type

Open asfimport opened this issue 3 years ago • 2 comments

Int8 and Int16 are not supported as basic types in previous versions. Using 4 bytes to store an int8 value does not seem like a good idea: it requires more storage and makes reads and writes slower. Besides, it does not map well onto common compute formats such as Velox, Arrow, vector, and so on.

With Int8 and Int16 supported, we can use less storage and get better read and write performance. As for forward compatibility, we can use the version in FileMetaData to choose how to read Parquet data.
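A rough sketch of the idea, using only the Python standard library. It compares the PLAIN-encoded size of 8-bit values stored as 4-byte INT32 against a 1-byte layout, then picks a decoder from a file version number. The native INT8 layout, the `NATIVE_INT8_MIN_VERSION` threshold, and all function names are assumptions made for illustration; none of them are part of the Parquet specification. Real files also apply other encodings and compression, which would likely narrow the size gap shown here.

```python
import struct

# Hypothetical: first file version that would write a native 1-byte INT8.
NATIVE_INT8_MIN_VERSION = 3


def plain_encode_as_int32(values):
    """Encode each value as a 4-byte little-endian integer (INT32 PLAIN layout)."""
    return struct.pack(f"<{len(values)}i", *values)


def plain_encode_as_int8(values):
    """Encode each value as 1 signed byte (hypothetical native INT8 layout)."""
    return struct.pack(f"<{len(values)}b", *values)


def decode_int8_column(data, file_version):
    """Choose the decoder from the file version, as the issue suggests."""
    if file_version >= NATIVE_INT8_MIN_VERSION:
        return list(struct.unpack(f"<{len(data)}b", data))
    return list(struct.unpack(f"<{len(data) // 4}i", data))


values = [-128, -1, 0, 1, 127]
as_int32 = plain_encode_as_int32(values)
as_int8 = plain_encode_as_int8(values)

print(len(as_int32), len(as_int8))
old_roundtrip = decode_int8_column(as_int32, file_version=2)
new_roundtrip = decode_int8_column(as_int8, file_version=3)
```

Both decoders return the original values, so a reader that checks the version first could handle old and new files with the same call.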

Reporter: Jackey Lee / @jackylee-ch

Note: This issue was originally created as PARQUET-2133. Please see the migration documentation for further details.

asfimport, Mar 01 '22 04:03

Timothy Miller / @theosib-amazon: Have you started working on implementing this? What is your progress? I'd be happy to work with you on it.

asfimport, Apr 08 '22 16:04

Micah Kornfield / @emkornfield: Before we start working on it, it should probably be discussed on the dev@ mailing list to make sure people are OK with the specification change.

asfimport, Apr 08 '22 17:04