hadoop-lzo

Combined LZO files

Open joey opened this issue 13 years ago • 3 comments

Would it be possible to add support for combined LZO files? For example, if I compress two files and then concatenate the compressed versions, it'd be nice to be able to decompress the combined file and get the contents of both files back out. The lzop program supports this.
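To illustrate the request, here is a minimal sketch of the difference between a reader that stops after the first compressed member and one that keeps going. It uses gzip from the Python standard library as a stand-in, because gzip has the same concatenation property and needs no extra packages; it is not LZO and not hadoop-lzo code. The function names `read_first_member` and `read_all_members` are made up for this example.

```python
import gzip
import zlib

# wbits value that tells zlib to expect a gzip header and trailer.
GZIP_WBITS = zlib.MAX_WBITS | 16


def read_first_member(data: bytes) -> bytes:
    """Decode only the first compressed member and ignore whatever follows."""
    d = zlib.decompressobj(GZIP_WBITS)
    return d.decompress(data) + d.flush()


def read_all_members(data: bytes) -> bytes:
    """Decode every concatenated member, restarting at each new header."""
    chunks = []
    while data:
        d = zlib.decompressobj(GZIP_WBITS)
        chunks.append(d.decompress(data))
        chunks.append(d.flush())
        # Bytes after the end of this member belong to the next one.
        data = d.unused_data
    return b"".join(chunks)


# Two independently compressed "files", concatenated byte for byte.
combined = gzip.compress(b"first file\n") + gzip.compress(b"second file\n")

first_only = read_first_member(combined)
everything = read_all_members(combined)
```

Here `first_only` holds just the first file's contents, while `everything` holds both. The request amounts to wanting the second behavior for `.lzo` files.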

joey, Jun 16 '11 15:06

Out of curiosity, what is the current hadoop-lzo behavior?

aripollak, Oct 27 '11 21:10

The current behavior is that you only read back the contents of the first file.

joey, Oct 27 '11 21:10

Interesting. If lzop supports it, we should definitely look at it.

One complication is supporting splitting. Currently we always expect the header at the front of the file, and we read the index file to be able to start reading at an arbitrary offset. If we have multiple concatenated files, we need to know where the LZO header lies for any given offset.
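A rough sketch of that lookup, assuming the start offset of each concatenated member were recorded somewhere (for example in an extended index file; this is hypothetical and not something the current index format is known to contain). Given a sorted list of member start offsets, a binary search finds the header that governs any split offset. The name `header_for_offset` and the sample offsets are invented for illustration.

```python
from bisect import bisect_right
from typing import Sequence


def header_for_offset(member_starts: Sequence[int], offset: int) -> int:
    """Return the start offset of the member (and so its header) covering `offset`.

    `member_starts` is assumed to be sorted in ascending order, with one entry
    per concatenated file, each pointing at that file's header.
    """
    if not member_starts or offset < member_starts[0]:
        raise ValueError("offset lies before the first header")
    # bisect_right gives the count of starts <= offset; step back one to get
    # the last header that begins at or before the requested offset.
    index = bisect_right(member_starts, offset) - 1
    return member_starts[index]


# Hypothetical layout: three files concatenated, headers at these byte offsets.
starts = [0, 1000, 2500]

split_header = header_for_offset(starts, 1700)
```

With this layout, a split beginning at byte 1700 would need to read the header at byte 1000 before decoding any blocks. Where the member offsets would actually be stored is left open here.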

rangadi, Oct 28 '11 22:10