alibabacloud-jindodata
Request: support for snappy-compressed files shipped to OSS by Alibaba Cloud Log Service (SLS)
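hadoop-snappy files are currently handled correctly. However, the snappy-compressed files that SLS ships to OSS do not appear to be in hadoop-snappy format: a binary comparison shows that their header is a few bytes shorter than that of a normal hadoop-snappy file, and the snzip tool can only decompress them when given the option -t raw. Since this is all within Alibaba Cloud's own ecosystem, adding support for this format seems reasonable.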
For reference, the list of formats supported by snzip:
snzip 1.0.4
Usage: snzip [option ...] [file ...]
general options:
-c output to standard output, keep original files unchanged
-d decompress
-k keep (don't delete) input files
-t name file format name. see below. The default format is framing2.
-h give this help
raw_format option:
-s size size of input data when compressing.
The default value is the file size if available.
tuning options:
-b num internal block size in bytes
-B num internal block size. 'num'-th power of two.
-R num size of read buffer in bytes
-W num size of write buffer in bytes
-T trace for debug
supported formats:
NAME SUFFIX URL
---- ------ ---
framing2 sz https://github.com/google/snappy/blob/master/framing_format.txt
hadoop-snappy snappy https://code.google.com/p/hadoop-snappy/
raw raw https://github.com/google/snappy/blob/master/format_description.txt
iwa iwa https://github.com/obriensp/iWorkFileFormat/blob/master/Docs/index.md#snappy-compression
framing sz https://github.com/google/snappy/blob/0755c815197dacc77d8971ae917c86d7aa96bf8e/framing_format.txt
snzip snz https://github.com/kubo/snzip
snappy-java snappy https://github.com/xerial/snappy-java
snappy-in-java snappy https://github.com/dain/snappy
comment-43 snappy http://code.google.com/p/snappy/issues/detail?id=34#c43
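To make the header difference described above easier to check, here is a rough sketch of a format sniffer. It assumes the commonly documented layouts: a hadoop-snappy file starts with two 4-byte big-endian integers (uncompressed block length, then compressed chunk length) followed by raw snappy data, while a raw snappy stream starts directly with the uncompressed length as a little-endian varint. The function names are made up for illustration, and the raw check is only a weak heuristic, since almost any byte sequence begins with a valid varint.

```python
import struct

# Stream identifier chunk that opens a framing2 (.sz) file.
FRAMING2_MAGIC = b"\xff\x06\x00\x00sNaPpY"


def read_varint(buf):
    """Decode a little-endian base-128 varint (at most 5 bytes).

    Returns (value, bytes_used), or None if no terminating byte is found.
    """
    value = 0
    for i, byte in enumerate(buf[:5]):
        value |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            return value, i + 1
    return None


def guess_snappy_container(data):
    """Heuristically guess which snzip format name fits the given bytes."""
    if data.startswith(FRAMING2_MAGIC):
        return "framing2"
    if len(data) >= 8:
        # Assumed hadoop-snappy header: block length, then chunk length.
        uncompressed_len, chunk_len = struct.unpack(">II", data[:8])
        if uncompressed_len > 0 and 0 < chunk_len <= len(data) - 8:
            inner = read_varint(data[8:])
            # The first chunk cannot expand to more than the whole block.
            if inner is not None and inner[0] <= uncompressed_len:
                return "hadoop-snappy"
    if read_varint(data) is not None:
        return "raw"  # weak signal: only the varint preamble was checked
    return "unknown"


if __name__ == "__main__":
    # Hand-built raw snappy stream: length 11, one literal of 11 bytes.
    raw = b"\x0b\x28hello world"
    hadoop = struct.pack(">II", 11, len(raw)) + raw
    print(guess_snappy_container(raw))
    print(guess_snappy_container(hadoop))
```

Running this against the first few hundred bytes of an SLS-shipped object and of a known-good hadoop-snappy file should show whether the missing bytes are exactly the 8-byte block header.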
This has little to do with JindoFS; using emr-hadoop should solve your problem.
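> Since this is all within Alibaba Cloud's own ecosystem, adding support for this format seems reasonable.
Could you clarify this? For example, what kind of support should the JindoFS SDK provide for data in this format on OSS?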