hadoop-lzo
Combined LZO files
Would it be possible to add support for combined LZO files? For example, if I compress two files and then concatenate the compressed versions, it'd be nice to be able to decompress the combined file and get the contents of both files back out. The lzop program supports this.
Out of curiosity, what is the current hadoop-lzo behavior?
The current behavior is you only read back the contents of the first file.
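To make the difference concrete, here is a small sketch using gzip from the Python standard library as a stand-in, since gzip also allows concatenated streams. This is an analogy only: it does not use LZO or any hadoop-lzo code, and the variable names are made up for illustration. It contrasts a reader that stops after the first stream with one that walks every stream.

```python
import gzip
import zlib

# Compress two "files" separately, then concatenate the compressed bytes,
# much like `cat a.gz b.gz > combined.gz`.
part_a = gzip.compress(b"first file\n")
part_b = gzip.compress(b"second file\n")
combined = part_a + part_b

# Single-stream reader: decompression stops at the end of the first stream.
single = zlib.decompressobj(wbits=31)  # wbits=31 selects the gzip container
first_only = single.decompress(combined)
leftover = single.unused_data  # the second stream, still compressed

# Multi-stream reader: gzip.decompress processes each stream in turn.
everything = gzip.decompress(combined)

print(first_only)
print(everything)
```

The first reader mirrors the behavior described above (only the first file comes back), while the second mirrors what lzop does with a combined file.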
Interesting. If lzop supports it, we should definitely look at it.
One complication is supporting splitting. Currently we always expect the header at the front of the file and read the index file to be able to start reading at an arbitrary offset. If we have multiple concatenated files, we need to know where the LZO header lies for any given offset.
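A rough sketch of that lookup, in Python for brevity rather than the project's Java. It assumes the header offsets of each concatenated member are already known, for instance recorded when the index is built. The function names `find_header_offsets` and `header_for_offset` are hypothetical and not part of hadoop-lzo. Scanning for the lzop magic bytes is only a naive way to get candidate offsets, because the same byte sequence could in principle appear inside compressed data.

```python
from bisect import bisect_right

# lzop files begin with this 9-byte magic sequence.
LZOP_MAGIC = b"\x89LZO\x00\r\n\x1a\n"


def find_header_offsets(data):
    """Return every offset where the lzop magic appears (candidate headers).

    Naive scan: a real implementation would record header positions while
    indexing instead of trusting a byte search.
    """
    offsets = []
    pos = data.find(LZOP_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(LZOP_MAGIC, pos + len(LZOP_MAGIC))
    return offsets


def header_for_offset(header_offsets, offset):
    """Return the start of the last header at or before `offset`.

    `header_offsets` must be sorted in ascending order.
    """
    i = bisect_right(header_offsets, offset) - 1
    if i < 0:
        raise ValueError("offset precedes the first header")
    return header_offsets[i]


# Two fake "members": magic followed by placeholder payload bytes.
fake = LZOP_MAGIC + b"a" * 20 + LZOP_MAGIC + b"b" * 10
offsets = find_header_offsets(fake)
print(offsets, header_for_offset(offsets, 35))
```

With a sorted list of header offsets, a split starting at an arbitrary position can find its governing header with a binary search, so the per-split cost stays small even for files built from many members.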