Space-efficient storage: compress LOG and *.sst for text data

Open YodaEmbedding opened this issue 3 years ago • 1 comments

🚀 Feature

More space-efficient storage by compressing LOG and storing compressed text data in *.sst files.

Motivation

The LOG file contains text data for aim debugging, which the average user doesn't really need to view. Perhaps it should be stored and flushed into some compressed container by default.

I use catalyst, which uses tqdm to display training progress. Unfortunately, this produces a large amount of log text, and I assume that is what occupies 60MB of space in one of the *.sst files. Otherwise, I am only logging a few metrics per epoch, so they should take a few KB at most.

 58M 011941.sst
265K 011943.sst
265K 011945.sst
262K 011947.sst
260K 011949.sst
262K 011951.sst
261K 011953.sst
267K 011955.sst
269K 011957.sst
   0 011959.log
 31K 011960.sst
  16 CURRENT
  36 IDENTITY
   0 LOCK
 35M LOG
1.6M MANIFEST-000004
6.4K OPTIONS-000007

Pitch

Compress text data at some point using a standard lossless compressor (e.g. fast-and-furious zstd or regular zlib).
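
To give a rough sense of the potential savings, here is a minimal sketch using Python's standard `zlib` module. It does not touch aim's storage or any aim API; `simulated_progress_log` is a hypothetical helper that fabricates tqdm-like progress lines as a stand-in for captured terminal output, so the ratio it prints is only illustrative.

```python
import zlib


def simulated_progress_log(steps: int = 2000) -> bytes:
    """Build tqdm-like progress lines (hypothetical stand-in for captured output)."""
    lines = [
        f"train: {100 * i // steps:3d}%|{'#' * (20 * i // steps):<20}| "
        f"{i}/{steps} [loss={1.0 / (i + 1):.4f}]\r"
        for i in range(1, steps + 1)
    ]
    return "".join(lines).encode("utf-8")


raw = simulated_progress_log()

# Lossless round trip with zlib; level 6 is zlib's usual speed/size trade-off.
packed = zlib.compress(raw, level=6)
restored = zlib.decompress(packed)

assert restored == raw  # nothing is lost
print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes, "
      f"ratio: {len(raw) / len(packed):.1f}x")
```

Progress-bar output is highly repetitive, so a general-purpose compressor should shrink it considerably. zstd would be used the same way through a third-party binding; zlib is used here only because it ships with Python.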

Alternatives

N/A

Additional context

N/A

Sep 21 '22 04:09 YodaEmbedding

@mahnerak could you take a look at this?

Sep 21 '22 10:09 gorarakelyan