Is custom version of Data.Binary still needed?
This is not urgent by any means, but I'm bringing this up so that it's not buried in old commit messages.
The GF codebase includes a modified version of Haskell's Data.Binary from 2008. According to Grégoire's commit from 2010, this modification was necessary, because the binary representation of doubles in Data.Binary was (is?) not compliant with IEEE 754.
In the commit history, you can see a comment from 2013:
The standard binary package has improved efficiency and error handling [1], so in the long run we should consider switching to it.
Is this still the case? The code from 2008 contains some unsafe IO operations (inlinePerformIO is deprecated and has a newer, more accurate name accursedUnutterablePerformIO). Later versions of Data.Binary don't contain accursedUnutterable…, at least not in those functions I checked.
Maybe more interesting, the implementation of the Get monad has changed, see current version of Binary vs. old version used in GF.
Ways to proceed
From least work to most work:
- No real changes: Keep the modified Data.Binary from 2008 with all its curses,
accursedUnutterablePerformIOis actually necessary in our code (for performance reasons?). - Keep the modified Data.Binary from 2008, but replace accursed unutterable functions with something safer, like
unsafePerformIO. - If Data.Binary has fixed its representation of doubles and any other issues for which the custom module was used: Switch to the latest version of Data.Binary library. (+Write some proper property-based tests to make sure that the change doesn't break GF.)
According to Data.Binary's changelog, the IEEE-754 compliance was fixed in 0.8.4.0. (PR #119). So maybe it's worth testing option 3.
IEEE-754 compliance is not the only problem. The standard Binary package produces very bloated binary files. Our implementation uses variable length integers which makes the files a lot shorter.
We can switch to the standard Binary package but only if we don't use their type classes. We can use for instance custom functions like putString and getString which store string lengths in variable length integers instead of the standard overloaded put/get. In this way we at least reuse backend machinery from the standard package.
On Mon, 9 Nov 2020 at 12:58, John J. Camilleri [email protected] wrote:
According to Data.Binary's changelog https://hackage.haskell.org/package/binary-0.8.4.0/changelog, the IEEE-754 compliance was fixed in 0.8.4.0. (PR #119 https://github.com/kolmodin/binary/pull/119). So maybe it's worth testing option 3.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/82#issuecomment-723968893, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZGZ4TRR4LL4Y4PWOG3SO7KPXANCNFSM4TPGOOHQ .