snappy-java
Snappy compressor/decompressor for Java
There should be an assertion that the length does not exceed the int size limit.
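A minimal sketch of the kind of guard the report asks for. The class and method names below (`LengthGuard`, `checkLength`) are illustrative, not part of the snappy-java API:

```java
// Hypothetical guard: reject lengths that cannot be represented as a Java int.
// `LengthGuard` and `checkLength` are illustrative names, not snappy-java API.
public final class LengthGuard {
    static void checkLength(long length) {
        if (length < 0 || length > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                "length must fit in an int: " + length);
        }
    }

    public static void main(String[] args) {
        checkLength(1024L);  // fine
        try {
            checkLength((long) Integer.MAX_VALUE + 1);
            throw new AssertionError("expected IllegalArgumentException");
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected oversized length");
        }
    }
}
```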
https://github.com/kiyo-masui/bitshuffle 0.5.1 is the latest version, but due to API incompatibilities and various build failures when using cross compilers, it has been difficult to upgrade the bundled bitshuffle version. The...
The underlying BitShuffle library works internally with an algorithm that was designed for the little-endian format. For this reason, the input data must always be passed in little-endian format. However,...
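Since the report is truncated, here is a self-contained sketch of the endianness handling it describes: before the bytes reach the native BitShuffle routine they must be laid out little-endian, and a `ByteBuffer` with an explicit byte order makes that cheap (the actual `BitShuffle` call is elided, since it needs the native library):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public final class LittleEndianInput {
    // Serialize ints in little-endian byte order, as the native BitShuffle
    // algorithm expects; Java's default (big-endian) order would corrupt input.
    static byte[] toLittleEndian(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] bytes = toLittleEndian(new int[] {1});
        // 1 encoded little-endian is 01 00 00 00
        System.out.println(bytes[0] + " " + bytes[3]);  // prints "1 0"
    }
}
```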
For data encoding, integers with intermediate sizes (e.g. of 3 or 5 bytes) are sometimes used (e.g. to reduce bandwidth). These arrays currently cannot be passed to the external BitShuffle...
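To make the intermediate-width idea concrete, here is a sketch of packing each int into 3 little-endian bytes and back. Everything below is illustrative; none of it is snappy-java API:

```java
public final class ThreeBytePacking {
    // Pack each value into 3 little-endian bytes (values must fit in 24 bits).
    static byte[] pack3(int[] values) {
        byte[] out = new byte[values.length * 3];
        for (int i = 0; i < values.length; i++) {
            int v = values[i];
            out[i * 3]     = (byte) (v & 0xFF);
            out[i * 3 + 1] = (byte) ((v >>> 8) & 0xFF);
            out[i * 3 + 2] = (byte) ((v >>> 16) & 0xFF);
        }
        return out;
    }

    // Reverse: reassemble each 3-byte little-endian group into an int.
    static int[] unpack3(byte[] bytes) {
        int[] out = new int[bytes.length / 3];
        for (int i = 0; i < out.length; i++) {
            out[i] = (bytes[i * 3] & 0xFF)
                   | (bytes[i * 3 + 1] & 0xFF) << 8
                   | (bytes[i * 3 + 2] & 0xFF) << 16;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] roundTrip = unpack3(pack3(new int[] {0x010203, 42}));
        System.out.println(roundTrip[0] == 0x010203 && roundTrip[1] == 42);  // true
    }
}
```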
Full details here: https://stackoverflow.com/questions/51069767/maven-unknown-packaging-bundle-error-from-a-dependency-packaging-as-bundle
Currently (v1.1.8.4) snappy-java fails to use non-direct ByteBuffers, throwing the following exception:

```
org.xerial.snappy.SnappyError: [NOT_A_DIRECT_BUFFER] input is not a direct buffer
	at org.xerial.snappy.Snappy.compress(Snappy.java:141)
```
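A hedged sketch of the usual workaround: copy the heap (non-direct) buffer into a direct one before calling the ByteBuffer-based Snappy APIs. The Snappy call itself is left as a comment, since it needs the snappy-java jar and native library:

```java
import java.nio.ByteBuffer;

public final class DirectBufferCopy {
    // Workaround sketch: copy a heap (non-direct) buffer into a direct one
    // before calling the ByteBuffer-based Snappy APIs, which in v1.1.8.4
    // accept only direct buffers.
    static ByteBuffer toDirect(ByteBuffer heap) {
        if (heap.isDirect()) {
            return heap;
        }
        ByteBuffer direct = ByteBuffer.allocateDirect(heap.remaining());
        direct.put(heap.duplicate());  // duplicate() leaves the caller's position untouched
        direct.flip();
        return direct;
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.wrap("hello".getBytes());
        ByteBuffer direct = toDirect(heap);
        // The direct buffer can now be passed to e.g.
        // Snappy.compress(direct, compressedDirectBuffer)
        // without triggering NOT_A_DIRECT_BUFFER (call elided here).
        System.out.println(direct.isDirect() + " " + direct.remaining());  // prints "true 5"
    }
}
```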
When testing an upgrade to Spark 3.1.1 I've noticed that compression of repeated INT64 columns got worse. https://stackoverflow.com/questions/67413589/parquet-compression-degradation-when-upgrading-spark/67455721#67455721 Reading [this file](https://drive.google.com/file/d/1FZx_qAmoX1HDpAVplvFnl2iC83siOCTE/view?usp=sharing) saved with snappy 1.1.2.6, and writing it with higher...
I had this error while I tried to decompress a large file. The size of sample.snappy is 10 GB. Is there a way around it? Code source: [Source](https://partners-intl.aliyun.com/help/doc-detail/108942.htm)

```
String...
```
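A likely cause of failure on a 10 GB file is materializing the whole payload in one array (Java arrays are capped near 2 GB). A sketch of the streaming alternative: copy through a small fixed buffer. In real use the source would be something like `new SnappyInputStream(new FileInputStream("sample.snappy"))`; in-memory streams are used here only so the sketch is self-contained:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public final class ChunkedCopy {
    // Stream data through a small fixed buffer instead of materializing the
    // whole (possibly >2 GB) payload in a single byte[].
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Real use: InputStream in = new SnappyInputStream(new FileInputStream("sample.snappy"));
        InputStream in = new ByteArrayInputStream(new byte[100_000]);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(copy(in, out));  // prints "100000"
    }
}
```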
SnappyOutputStream uses a buffer management system to reduce memory pressure and GC overhead. The same mechanism would be helpful in SnappyInputStream as well. We want to use Snappy for protocol...
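A minimal sketch of the kind of buffer recycling the request describes: reuse fixed-size `byte[]` instances instead of allocating per operation. This is illustrative only; snappy-java's actual allocator interface may differ:

```java
import java.util.ArrayDeque;

public final class RecyclingAllocator {
    // Minimal buffer pool: reuse byte[] instances of one fixed size to cut
    // allocation rate and GC overhead, in the spirit of SnappyOutputStream's
    // buffer management. Illustrative sketch, not the snappy-java API.
    private final int bufferSize;
    private final ArrayDeque<byte[]> pool = new ArrayDeque<>();

    RecyclingAllocator(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    synchronized byte[] allocate() {
        byte[] b = pool.pollFirst();
        return b != null ? b : new byte[bufferSize];
    }

    synchronized void release(byte[] buffer) {
        if (buffer.length == bufferSize) {  // only pool exact-size buffers
            pool.addFirst(buffer);
        }
    }

    public static void main(String[] args) {
        RecyclingAllocator alloc = new RecyclingAllocator(4096);
        byte[] first = alloc.allocate();
        alloc.release(first);
        byte[] second = alloc.allocate();
        System.out.println(first == second);  // prints "true": the buffer was reused
    }
}
```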