DeepZip
NN-based lossless compression
Update: Please check out our newer work, DZip, presented at DCC 2021.
Description
Data compression using neural networks
DeepZip: Lossless Data Compression using Recurrent Neural Networks
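As general background (this README does not spell out the mechanism): a predictive lossless compressor pairs a model that assigns a probability to each next symbol with an entropy coder such as an arithmetic coder, so the compressed size comes out close to the sum of -log2 p over the symbols actually observed. The sketch below computes that ideal size for a toy predictor. The names `ideal_compressed_bits` and `uniform_model` are hypothetical and are not taken from this repository; a trained RNN would take the place of the stand-in model.

```python
import math


def ideal_compressed_bits(symbols, predict):
    """Return the sum of -log2 p(symbol | history) over the sequence.

    `predict` is any callable mapping the history seen so far to a
    dict {symbol: probability}. It is a placeholder for a learned model.
    """
    total = 0.0
    for i, symbol in enumerate(symbols):
        probs = predict(symbols[:i])
        total += -math.log2(probs[symbol])
    return total


def uniform_model(history):
    # Hypothetical stand-in predictor: ignores the history and assigns
    # equal probability to four symbols, i.e. 2 bits per symbol.
    return {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


print(ideal_compressed_bits("ACGTAC", uniform_model))  # 12.0
```

A model that predicts the data better assigns higher probability to the symbols that occur, which lowers this total.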
Requirements
- GPU, nvidia-docker (or try alternative installation)
- python 2/3
- numpy
- sklearn
- keras 2.2.2
- tensorflow (cpu/gpu) 1.8
 
nvidia-docker is currently required to run the code. A simple way to install and run it is to use the provided Docker files:
cd docker
make bash BACKEND=tensorflow GPU=0 DATA=/path/to/data/
Alternative Installation
cd DeepZip
python3 -m venv tf
source tf/bin/activate
bash install.sh
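After either installation route, it can help to confirm that the packages from the Requirements list are importable and at the listed versions. The following is a minimal sketch, not part of this repository: `check_requirements` and `EXPECTED` are hypothetical names, and the version strings are copied from the Requirements list (numpy and sklearn have no stated version there).

```python
import importlib

# Versions copied from the Requirements list; None means no version is stated.
EXPECTED = {"numpy": None, "sklearn": None, "keras": "2.2.2", "tensorflow": "1.8"}


def check_requirements(expected=EXPECTED):
    """Return {package: status}; status is 'missing', 'ok (...)' or a mismatch note."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        found = getattr(module, "__version__", "unknown")
        # Accept an exact match or a longer version with the same prefix,
        # e.g. "1.8.0" when "1.8" is wanted.
        if wanted is None or found == wanted or found.startswith(wanted + "."):
            report[name] = "ok ({})".format(found)
        else:
            report[name] = "found {}, expected {}".format(found, wanted)
    return report
```

Calling `check_requirements()` inside the activated environment and printing the result gives one status line per package.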
Code
To run a compression experiment:
Data Preparation
- Place all the data to be compressed in data/files_to_be_compressed
- Run the parser
 
cd data
./run_parser.sh
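The first step above, placing the inputs in data/files_to_be_compressed, can be scripted when the files live elsewhere. This is a minimal sketch under that assumption; `stage_files` is a hypothetical helper, not something the repository provides, and it should be run from the repository root before the parser.

```python
import os
import shutil


def stage_files(sources, dest_dir=os.path.join("data", "files_to_be_compressed")):
    """Copy each source file into dest_dir and return the new paths."""
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    staged = []
    for src in sources:
        target = os.path.join(dest_dir, os.path.basename(src))
        shutil.copy(src, target)
        staged.append(target)
    return staged


# Example (paths are placeholders):
# stage_files(["/path/to/data/sample.txt"])
```

Files are copied rather than moved, so the originals remain available for checking the decompressed output.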
Running models
- All the models are listed in models.py
- Pick a model and run the compression experiment on all the data files in the data/files_to_be_compressed directory:
 
cd src
./run_experiments.sh biLSTM GPUID
Note: GPUID can be set to 0 by default. The corresponding command is then ./run_experiments.sh biLSTM 0
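To try several models from models.py in sequence, the command above can be wrapped in a small driver. The sketch below is an assumption about how one might do this, not a script shipped with the repository: `build_command` and `run_models` are hypothetical names, and only `biLSTM` is a model name confirmed by this README. It defaults to a dry run that just returns the commands.

```python
import subprocess


def build_command(model, gpu_id=0):
    """Build the argument list for one run_experiments.sh invocation."""
    return ["./run_experiments.sh", model, str(gpu_id)]


def run_models(models, gpu_id=0, cwd="src", dry_run=True):
    """Return the commands for each model; execute them unless dry_run is True."""
    commands = [build_command(model, gpu_id) for model in models]
    if not dry_run:
        for cmd in commands:
            # check_call raises CalledProcessError if the script exits non-zero.
            subprocess.check_call(cmd, cwd=cwd)
    return commands


print(run_models(["biLSTM"]))  # [['./run_experiments.sh', 'biLSTM', '0']]
```

Passing `dry_run=False` from the repository root runs each command inside the src directory, matching the `cd src` step.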
Please cite the following paper if you use the code in this repository.
@inproceedings{7fcb664b03ac4d6497048954d756b91f,
title = "DeepZip: Lossless Data Compression Using Recurrent Neural Networks",
author = "Mohit Goyal and Kedar Tatwawadi and Shubham Chandak and Idoia Ochoa",
year = "2019",
month = "5",
day = "10",
doi = "10.1109/DCC.2019.00087",
language = "English (US)",
series = "Data Compression Conference Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
editor = "Ali Bilgin and Storer, {James A.} and Marcellin, {Michael W.} and Joan Serra-Sagrista",
booktitle = "Proceedings - DCC 2019",
address = "United States",
}