go-lz4 icon indicating copy to clipboard operation
go-lz4 copied to clipboard

Use capacity to determine whether dst can be reused for encoding and decoding

Open manishrjain opened this issue 7 years ago • 5 comments

Also, modify the benchmarks to reuse the allocated slice. This improves the ns/op for both encoding and decoding by 0.4s/op (2.4s -> 2.0s for Decode, and similar for encode).

manishrjain avatar May 09 '17 03:05 manishrjain

Hi, is this repository being maintained actively? I have this PR and seeing more performance issues which need to be fixed.

manishrjain avatar May 17 '17 08:05 manishrjain

I'm not sure using cap here actually makes sense from an API point of view, although I agree it does from a performance perspective. The space between len() and cap() exists in this weird ownership space. As a caller, I might pass a sub-slice to a function saying "This is what you're allowed to use; if you need more, get your own." If the function checks cap(), we can't do. this.

For reference, note that the Go snappy port checks only len(): https://github.com/golang/snappy/blob/master/decode.go#L60

dgryski avatar May 17 '17 09:05 dgryski

Besides, this is really only a thing here due to how the benchmarks are written. A caller can get the same behavior by doing dst, err = Encode(dst[:cap(dst)], testfile) anyway, no?

calmh avatar May 17 '17 09:05 calmh

@dgryski : I see your point. My perspective it this:

A caller of this API would only need to pass a dst slice, if they want to avoid memory allocations; which in Go can be very expensive (considering GC etc.). The typical usage then would be: dst = Decompress(dst, src). When using len, dst's size would change all the time, causing Decompress to allocate slices way more frequently; which was not the intent of the caller.

I can't think of a scenario where you'd pass something in dst, not want the function to exceed the length of it and yet be OK when it returns you a whole new slice.

In fact, when I first realized what it was actually doing, it was a surprise, because I allocated a big enough slice, and just expected the function to keep reusing it.

That's my opinion. Though, it might not be idiosyncratic Go; and I defer to you to make that decision. Happy to change the way we call the API. But, at the very least, it would be good to update the comments to specify what @calmh has suggested.

manishrjain avatar May 17 '17 10:05 manishrjain

Let me know if I should retract this PR.

manishrjain avatar May 17 '17 21:05 manishrjain