unable to decompress multiple frames

Open amalone-scwx opened this issue 3 years ago • 4 comments

Below is a code sample that demonstrates the issue, which occurs when the data contains multiple LZ4 frames. The multiBlockB64 string contains 186000 bytes of highly redundant data, compressed with LZ4 into 64KB independent blocks, each in its own frame. I generated it with the C++ library. If I base64-decode it and decompress it with lz4c, lz4c sees the entire 186000 bytes. However, the Go sample using this library only sees the first block, and no error is reported. If I change the block size used for compression to 256KB, the Go sample sees up to 256KB.

% lz4c -v -t /tmp/multiblock.lz4   
*** LZ4 command line interface 64-bits v1.9.3, by Yann Collet ***
/tmp/multiblock.lz4  : decoded 186000 bytes

package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"io/ioutil"

	"github.com/pierrec/lz4/v4"
)

const multiBlockB64 string = "BCJNGGBAgkkBAAD/L2FiY2RlZmdoaWprbG1ub3BxcnN0dXZ3eHl6MDEyMzQ1Njc4OUFCQ0RFRkdISUpLTE1OT1BRUlNUVVZXWFlaPgD/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////qFBWV1hZWgAAAAAEIk0YYECCSQEAAP8vYWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXowMTIzNDU2Nzg5QUJDREVGR0hJSktMTU5PUFFSU1RVVldYWVo+AP////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////+oUFZXWFlaAAAAAAQiTRhgQIIgAQAA/y9hYmNkZWZnaGlqa2xtbm9wcXJzdHV2d3h5ejAxMjM0NTY3ODlBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWj4A//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////8VUFZXWFlaAAAAAA=="

func lz4Decode(payload []byte) ([]byte, error) {
	r, w := io.Pipe()

	defer r.Close()
	defer w.Close()

	go func() {
		if _, err := w.Write(payload); err != nil {
			fmt.Println("ERROR: unable to write to lz4 stream", err)
			return
		}
		w.Close()
	}()

	zr := lz4.NewReader(r)
	body, err := ioutil.ReadAll(zr)
	if err != nil {
		return nil, err
	}
	return body, nil
}

func main() {
	body, err := base64.StdEncoding.DecodeString(multiBlockB64)
	if err != nil {
		fmt.Println("unable to base64 decode")
		return
	}

	dec, err := lz4Decode(body)
	if err != nil {
		fmt.Println("decode failed", err)
		return
	}

	fmt.Println("decoded size", len(dec))
}
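
To check how many frames a payload really contains, independently of any decoder, the frame layout can be walked by hand. The following is a minimal sketch using only the standard library, assuming the standard LZ4 frame layout (magic number, FLG/BD descriptor, size-prefixed blocks, a zero EndMark, optional checksums). The names splitFrames and rawFrame are hypothetical helpers, not part of pierrec/lz4; skippable and legacy frames are not handled, and checksums are not verified.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frameMagic is the LZ4 frame magic number, stored little-endian on the wire.
const frameMagic = 0x184D2204

var errTruncated = errors.New("truncated LZ4 frame")

// splitFrames (hypothetical helper) returns one sub-slice per LZ4 frame in data.
// It only walks the frame layout; it does not decompress or verify checksums.
func splitFrames(data []byte) ([][]byte, error) {
	var frames [][]byte
	for len(data) > 0 {
		// Smallest header: magic (4) + FLG + BD + header checksum.
		if len(data) < 7 {
			return frames, errTruncated
		}
		if binary.LittleEndian.Uint32(data) != frameMagic {
			return frames, fmt.Errorf("frame %d: unexpected magic number", len(frames))
		}
		flg := data[4]
		pos := 6 // magic + FLG + BD
		if flg&0x08 != 0 {
			pos += 8 // optional content size
		}
		if flg&0x01 != 0 {
			pos += 4 // optional dictionary ID
		}
		pos++ // header checksum byte

		for {
			if len(data) < pos+4 {
				return frames, errTruncated
			}
			size := binary.LittleEndian.Uint32(data[pos:])
			pos += 4
			if size == 0 {
				break // EndMark closes the frame
			}
			n := int(size & 0x7FFFFFFF) // high bit flags an uncompressed block
			if flg&0x10 != 0 {
				n += 4 // per-block checksum
			}
			if len(data)-pos < n {
				return frames, errTruncated
			}
			pos += n
		}
		if flg&0x04 != 0 {
			pos += 4 // content checksum
		}
		if len(data) < pos {
			return frames, errTruncated
		}
		frames = append(frames, data[:pos])
		data = data[pos:]
	}
	return frames, nil
}

// rawFrame (hypothetical helper) builds a frame holding one uncompressed,
// non-empty block, using the same FLG/BD/HC bytes (0x60 0x40 0x82) as the
// sample payload above.
func rawFrame(payload string) []byte {
	f := []byte{0x04, 0x22, 0x4D, 0x18, 0x60, 0x40, 0x82}
	var size [4]byte
	binary.LittleEndian.PutUint32(size[:], uint32(len(payload))|0x80000000)
	f = append(f, size[:]...)
	f = append(f, payload...)
	return append(f, 0, 0, 0, 0) // EndMark
}

func main() {
	data := append(rawFrame("hello"), rawFrame("world")...)
	frames, err := splitFrames(data)
	fmt.Println(len(frames), err)
}
```

Running the same walk over the base64-decoded multiBlockB64 bytes should show whether the input holds more than one frame. As a possible stopgap (an untested assumption, not a confirmed fix), each returned slice could be handed to its own lz4.NewReader and the outputs concatenated.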

amalone-scwx, Sep 13 '21 16:09

Note that with the old import path "github.com/pierrec/lz4" (without the /v4 suffix), the sample code handles multiple blocks and sees the entire contents.

amalone-scwx, Sep 13 '21 16:09

I ran into the same issue with the v4 branch. It doesn't happen with the v2 branch of the library, as @amalone-scwx mentioned. @pierrec, is this the expected behavior from now on?

phacops, Sep 18 '21 21:09

@phacops nope, it is a bug. I haven't had time to look into it yet.

pierrec, Sep 19 '21 06:09

No problem, thanks for the answer. I'll try to investigate myself when I have time.

phacops, Sep 20 '21 16:09