
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads

Results 98 orc issues

### What changes were proposed in this pull request?
Add copyright messaging to BpackingAvx512.hh

### Why are the changes needed?
The vector unpacking functions in this PR (https://github.com/apache/orc/pull/1375) are derived...

CPP

### What changes were proposed in this pull request?
`ParserUtils` removes empty check

### Why are the changes needed?
https://github.com/apache/spark/pull/35253#issuecomment-1321866542

```java
java.lang.IllegalArgumentException: Empty quoted field name at 'struct'
    at org.apache.orc.impl.ParserUtils.parseName(ParserUtils.java:114)
    ...
```

JAVA

**Background info:** In a Spark project, we use the ORC C++ library as an acceleration library to access HDFS files, in place of the original Spark table scan implemented in Java/Scala. We found some...

When writing a file with a string column and multiple row groups, the resulting file has incorrect row index streams. The string column is encoded using direct encoding. The file...

The comments in ColumnVector indicate that we should only use the `isNull` array to determine null values when `noNulls` is set to false. This helps us reuse `ColumnVector`, for example,...

In the Java and C++ readers, we cannot read ORC files whose statistics exceed 2GB. We should find a new approach or design to support reading these files. ``` com.google.protobuf.InvalidProtocolBufferException:...

### What changes were proposed in this pull request?
This PR aims to write Parquet decimal type data in Benchmark using the `FIXED_LEN_BYTE_ARRAY` type.

### Why are the changes needed?
Because...

JAVA