bzip2: merge into Go standard library
It hasn't been updated for a while; is it stable yet? If it is, I hope it can be merged upstream to fix https://github.com/golang/go/issues/4828
Hi, the version of bzip2.Writer in the development branch is very stable and has gone through many months of fuzzing, comparing its implementation with that of the C version. I have not merged the development branch into master yet, but plan to do so in the near future.
With regard to golang/go#4828, the main roadblock is finding a suitable reviewer for the implementation. I'm not aware of anyone else who has a deep knowledge of how bzip2 works. @mdempsky and @nigeltao have reviewed a number of my compression-related CLs (showing knowledge of LZ77 and Huffman encoding), but I don't know if they're knowledgeable about BWT. It also depends on the availability of free time on my part and on that of the reviewers.
Happy to learn about BWT, but I'm really low on free time for the foreseeable future. Sorry.
Hey @dsnet, any progress on merging the bzip2 compressor into the standard library? What help do you need to make it happen, or what are the blockers?
I am not currently working on merging it into the standard library. I would love to see that happen in the near future, but I am still blocked on finding appropriate reviewers.
The master branch of this repo has a stable implementation that has gone through CPU-years of fuzz testing.
As an outside observer (i.e. someone who would like to use the library), does this really need a reviewer with deep knowledge of compressors? As a user of the library, all I really care about is that the outputs are compatible with other standard bzip2 implementations out there, and that it doesn't do anything crazy. Wouldn't a test suite cover that?
Surely it's more important that there is a decent implementation available in the standard library now, rather than a potential future "best" implementation?
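To make the kind of check meant here concrete, below is a minimal sketch of a round-trip compatibility test. It assumes the encoder under test can be wrapped in a function of type `compressFunc`; that type and `checkRoundTrip` are hypothetical names introduced only for this illustration, not part of any package. Decoding uses the standard library's existing `compress/bzip2` reader, so it checks compatibility with that one decoder only, not with the C implementation.

```go
package main

import (
	"bytes"
	"compress/bzip2"
	"fmt"
	"io"
)

// compressFunc is a hypothetical adapter around whichever bzip2 encoder
// is under test. It returns the complete compressed stream for data.
type compressFunc func(data []byte) ([]byte, error)

// checkRoundTrip compresses data with the encoder under test, decodes the
// result with the standard library's bzip2 reader, and returns an error if
// either step fails or the decoded bytes differ from the input.
func checkRoundTrip(compress compressFunc, data []byte) error {
	packed, err := compress(data)
	if err != nil {
		return fmt.Errorf("compress: %w", err)
	}
	got, err := io.ReadAll(bzip2.NewReader(bytes.NewReader(packed)))
	if err != nil {
		return fmt.Errorf("decompress: %w", err)
	}
	if !bytes.Equal(got, data) {
		return fmt.Errorf("round trip mismatch: got %d bytes, want %d", len(got), len(data))
	}
	return nil
}
```

A check like this would typically be run over many inputs (empty, highly repetitive, random, larger than one block). It only shows that a given input survives the trip, not that the encoder is correct in general.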
outputs are compatible with other standard bzip2 implementations out there
Review by another person doesn't guarantee this, but history has shown that programmers are fallible and that extra eyeballs tend to decrease the occurrence of bugs.
Wouldn't a test suite cover that?
History has also demonstrated that test suites are often sub-par at detecting correctness bugs.
Surely it's more important that there is a decent implementation available in the standard library now, rather than a potential future "best" implementation?
If this package has the needed functionality, why is its name or location important? If API stability is important, it can/should be vendored.
i.e. someone who would like to use the library
Is there something that's preventing the package from being used at its current location?
@shurcooL Yes, some people only want things that are in the base Go distribution. The issue is about including/merging bzip2 in the standard library - see the original issue description.
@dsnet Sure, I get that and am all in favor of reviews, but there's an argument to be made that something that seems to be working quite well is better than nothing, right?
@shurcooL
If this package has the needed functionality, why is its name or location important? If API stability is important, it can/should be vendored.
It seems that you're a big fan of the amazing vendoring feature of Go, so can you please remove net/http, encoding, compress, image, etc. from the standard library? If somebody needs them, JUST vendor them, so simple and easy! I don't need them, yet they keep the Go tool distribution bloated on my disk, which is pretty annoying:
du -sch go1.10.1
357M total
Oh yes, there's the Go 1 compatibility promise. Can you PLEASE do this in Go 2.0?
This is not the right place to discuss vendoring or whether the standard library should be split apart and versioned separately or even distributed. All of that is being discussed with the vgo proposal.
@Darren I see your point.
I expect/hope this issue will eventually be resolved, when the conditions for it are right.
I put off work that needed a bzip2 compressor and only recently stumbled across this by chance, so for me a strong argument for merging into the stdlib is discoverability.
Has the path towards merging this into the standard library been walked any further along?