orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
https://github.com/microsoft/vcpkg/blob/master/versions/o-/orc.json
Unrelated to this PR: we have many weird `_xxx` variable names appearing in function signatures. I'm thinking of doing something like apache/arrow, which only uses `xxx_` as the name...
### What changes were proposed in this pull request? ### Why are the changes needed? ### How was this patch tested? ### Was this patch authored or co-authored using generative...
### What changes were proposed in this pull request? If the file in HDFS is in a completed state, avoid calling the HDFS getFileInfo RPC. Provide `orc.file.length.fast` configuration to enable...
Currently we only support `FAST`, `DEFAULT`, and `FASTEST`. We could also support a maximum-compression level such as `BEST_COMPRESSION`.
Looking through the ORC code we seem to have some members that are not used from within the ORC project. It is likely that some of these are being used...