gap-4.11.1: Is gzip compression for Input/OutputTextfile active?

Open kiryph opened this issue 4 years ago • 3 comments

As noted in CHANGES.md for gap-4.11.1:

Other fixed bugs

  • #3963 Provide automatic compression/decompression of filenames ending .gz (as is claimed for example in the documentation of InputTextFile)

Therefore, I assume that I can write gzipped files.

Documentation

  • InputTextFile(filename). [...] If filename ends in .gz and the file is a valid gzipped file, then the file will be transparently uncompressed.
  • OutputTextFile(filename, append) If filename ends in .gz, then the file will be written with gzip compression.

https://www.gap-system.org/Manuals/doc/ref/chap10.html#X80B5F2E4856D8980

Observed behaviour

$ gap
gap> dir := Directory(".");;
gap> fname := Filename(dir, "test.txt.gz");;
gap> stream := OutputTextFile( fname, false );;
gap> PrintTo( stream, "a very long line that GAP is going to wrap at 80 chars by default if we don't do anything about it\n");
gap> CloseStream(stream);
gap> quit;
$ file test.txt.gz
test.txt.gz: ASCII text
$ gzip --force test.txt.gz
$ file test.txt.gz.gz # gzip adds another .gz extension
test.txt.gz.gz: gzip compressed data, was "test.txt.gz", last modified: Fri May  7 13:02:38 2021, from Unix, original size modulo 2^32 99

Expected behaviour

$ gap
gap> dir := Directory(".");;
gap> fname := Filename(dir, "test.txt.gz");;
gap> stream := OutputTextFile( fname, false );;
gap> PrintTo( stream, "a very long line that GAP is going to wrap at 80 chars by default if we don't do anything about it\n");
gap> CloseStream(stream);
gap> quit;
$ file test.txt.gz
test.txt.gz: gzip compressed data
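
The `file` check used in both transcripts can also be scripted. The following is a minimal sketch, not part of GAP: the helper name `is_gzip` and the demo file names are hypothetical, and it assumes a POSIX shell with `od`, `tr`, `mktemp`, and `gzip` available. It tests only for the two gzip magic bytes (`1f 8b`) at the start of the file, so it says nothing about whether the rest of the stream is intact.

```shell
# is_gzip FILE: succeed if FILE starts with the gzip magic bytes 1f 8b.
# (Hypothetical helper; checks the header only, not stream integrity.)
is_gzip() {
    magic=$(od -An -tx1 -N2 < "$1" | tr -d '[:space:]')
    [ "$magic" = "1f8b" ]
}

# Demo on two throwaway files with hypothetical names.
tmpdir=$(mktemp -d)
# Plain text under a .gz name, mimicking the observed behaviour.
printf 'plain text\n' > "$tmpdir/plain.txt.gz"
# Genuinely compressed data, mimicking the expected behaviour.
printf 'plain text\n' | gzip -c > "$tmpdir/real.txt.gz"

for f in "$tmpdir/plain.txt.gz" "$tmpdir/real.txt.gz"; do
    if is_gzip "$f"; then
        echo "$(basename "$f"): gzip compressed"
    else
        echo "$(basename "$f"): not gzip compressed"
    fi
done

rm -rf "$tmpdir"
```

For a stricter check, `gzip -t FILE` additionally verifies the integrity of the compressed stream.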

Copy and paste GAP banner

 ┌───────┐   GAP 4.11.1 of 2021-03-02
 │  GAP  │   https://www.gap-system.org
 └───────┘   Architecture: x86_64-apple-darwin19.6.0-default64-kv7
 Configuration:  gmp 6.2.1, GASMAN, readline
 Loading the library and packages ...
 Packages:   AClib 1.3.2, Alnuth 3.1.2, AtlasRep 2.1.0, AutoDoc 2020.08.11, AutPGrp 1.10.2, Browse 1.8.11, CaratInterface 2.3.3,
             CRISP 1.4.5, Cryst 4.1.23, CrystCat 1.1.9, CTblLib 1.3.1, FactInt 1.6.3, FGA 1.4.0, Forms 1.2.5, GAPDoc 1.6.4, genss 1.6.6,
             IO 4.7.0, IRREDSOL 1.4.1, LAGUNA 3.9.3, orb 4.8.3, Polenta 1.3.9, Polycyclic 2.16, PrimGrp 3.4.1, RadiRoot 2.8, recog 1.3.2,
             ResClasses 4.7.2, SmallGrp 1.4.2, Sophus 1.24, SpinSym 1.5.2, TomLib 1.2.9, TransGrp 3.0, utils 0.69
 Try '??help' for help. See also '?copyright', '?cite' and '?authors'

GAP was installed with https://github.com/gap-system/homebrew-gap

The comment in the test file https://github.com/gap-system/gap/blob/stable-4.11/tst/testinstall/compressed.tst#L20

Check file really is compressed (FIXME: disabled in stable-4.11)

indicates that, in contrast to CHANGES.md, compression is currently disabled.

Can someone confirm this and clarify whether compression can be used?

kiryph, May 07 '21 13:05

IIRC automatic compression was disabled because it led to issues with code that did not expect it, and which e.g. tried to write already compressed data.

In the upcoming 4.12 there will be an OutputGzipFile.

fingolfin, May 09 '21 10:05

@ChrisJefferson it seems we forgot to update the documentation for OutputTextFile, at least in stable-4.11?

fingolfin, May 09 '21 10:05

Thanks for the clarification. Update: the commit https://github.com/gap-system/gap/commit/069a6497424113dc5e5ffe133be9da5bfe6acb24 disabled compression for OutputTextFile on that branch.

Updating the documentation for OutputTextFile would be great. The CHANGES.md entry for 4.11.1 could also be clearer:

#3963 Provide automatic ~~compression/~~ decompression of filenames ending .gz (as is claimed ~~for example~~ in the documentation of InputTextFile).

kiryph, May 09 '21 10:05

In PR #4989 the CHANGES entry is fixed as suggested by @kiryph. The documentation for OutputTextFile has been adjusted, at least in master / GAP 4.12 (to be released this week). It now says:

OutputTextFile( filename, append ) returns an output stream in the category IsOutputTextFile that writes received characters to the file filename. If append is false, then the file is emptied first, otherwise received characters are added at the end of the file. OutputGzipFile acts identically to OutputTextFile, except it compresses the output with gzip.

So I think once PR #4989 is merged we can close this.

fingolfin, Aug 16 '22 09:08