Add support for Zstd coders
Adds:

- `saveAsZstdDictionary` to train a Zstd dictionary on some arbitrary `SCollection[T]`. Estimates the average size of elements `T`, collects `n` elements based on a target training set size, then trains and saves the Zstd dictionary.
- A Scala `ZstdCoder` object with transform Coders for the simple `T` or for each side of a `(K, V)`.
- A command-line argument to map from a type to a dictionary, causing instances of `MyClass` to get Zstd compression automagically. Probably fails if the type is parameterized. Example: `--zstdDictionary=com.spotify.scio.MyClass:gs://bucket/path/dict.bin`
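
For reviewers, here is a rough, self-contained sketch of two pieces of logic described above: deriving the element count `n` from an estimated average element size and a target training set size, and splitting a `--zstdDictionary` value into its class name and dictionary path. The object and method names (`ZstdDictionarySketch`, `sampleCount`, `parseMapping`) are hypothetical and exist only for illustration; the code in this PR may round, validate, or parse differently.

```scala
// Illustrative only: hypothetical helpers, not the implementation in this PR.
object ZstdDictionarySketch {

  /** Number of elements to collect so that the training set is roughly
    * `targetBytes` in total, given an estimated average element size.
    * Assumption: round up and always collect at least one element.
    */
  def sampleCount(avgElementBytes: Double, targetBytes: Long): Long = {
    require(avgElementBytes > 0, "average element size must be positive")
    require(targetBytes > 0, "target training set size must be positive")
    math.max(1L, math.ceil(targetBytes / avgElementBytes).toLong)
  }

  /** Splits "<fully.qualified.Class>:<dictionary path>" at the first colon.
    * Splitting at the first colon keeps URIs such as gs://... intact,
    * because a JVM class name cannot contain a colon.
    */
  def parseMapping(value: String): (String, String) = {
    val idx = value.indexOf(':')
    require(
      idx > 0 && idx < value.length - 1,
      s"expected <class>:<path>, got '$value'"
    )
    (value.substring(0, idx), value.substring(idx + 1))
  }
}
```

With these assumptions, `ZstdDictionarySketch.sampleCount(512.0, 1024L * 1024L)` asks for 2048 elements, and parsing the example argument value above yields the pair `("com.spotify.scio.MyClass", "gs://bucket/path/dict.bin")`.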