cakeml
cakeml copied to clipboard
Formalise DEFLATE and add implementation to examples
Deflate is a lossless data compression file format that zip and gzip are based on.
The deflate algorithm would be a nice formalisation and verification exercise in HOL. The result of the formalisation effort could be used to construct a verified gzip-like executable using CakeML.
The same compression and decompression algorithm might come in handy inside the CakeML compiler, once the compiler stores bulky constant data structures as read-only data. Such bulky constants could, for example, be the AST for the basis library or the bignum library.
This project is likely to fit the scope of an internship or MSc thesis.
Related work:
https://hal.archives-ouvertes.fr/hal-02149909/document
On Mon, Nov 16, 2020 at 8:07 AM myreen [email protected] wrote:
Deflate https://en.wikipedia.org/wiki/DEFLATE is a lossless data compression file format that zip https://en.wikipedia.org/wiki/ZIP_(file_format) and gzip https://en.wikipedia.org/wiki/Gzip are based on.
The deflate algorithm https://tools.ietf.org/html/rfc1951 would be a nice formalisation and verification exercise in HOL. The result of the formalisation effort could be used to construct a verified gzip-like executable using CakeML.
The same compression and decompression algorithm might come in handy inside the CakeML compiler, once the compiler stores bulky constant data structures as read-only data. Such bulky constants could, for example, be the AST for the basis library or the bignum library.
This project is likely to fit the scope of an internship or MSc thesis.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/CakeML/cakeml/issues/806, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIOAD7JSIL7TJM64H4F4YTSQEWZLANCNFSM4TXGSYBA .
Work on this could advance in stages:
- Implement the compression and decompression algorithms as functions in HOL
- Make it easy to prove
∀s. decompress (compress s) = s
by makingcompress
test whetherdecompress
works for its output: if not, then it can return its input tagged as uncompressed. - Extensively test that
compress
won't produce uncompressed data. - Prove that the
compress
function always produces properly compressed data.
Something very similar to the calculation of a Huffman tree given the weights for the alphabet is done in HOL’s examples/computability/kolmog/kraft_ineqScript.sml
. (Weights are specified by giving the desired length of the code for each “symbol”.). It should probably be possible to refactor so as to extract the code for the tree-building etc.
I'm going to give this a go for my Honours project.
slowly progressing here