.enamlc files are nondeterministic

Open bmwiedemann opened this issue 5 years ago • 3 comments

While working on reproducible builds for openSUSE, I found that our python-enaml 0.10.4 package varies between builds because of nondeterministic bits in .enamlc files.

/usr/lib64/python3.7/site-packages/enaml/workbench/ui/__enamlcache__/workbench_window.enaml-py37-cv26.enamlc differs at offset '655' 
-00000280  20 da 0b 6d 61 6b 65 5f  6f 62 6a 65 63 74 63 01  | ..make_objectc.|
+00000280  20 da 0b 6d 61 6b 65 5f  6f 62 6a 65 63 74 e3 01  | ..make_object..|
 00000290  00 00 00 00 00 00 00 03  00 00 00 0f 00 00 00 43  |...............C|
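One possible reading of the differing byte, sketched in Python. It assumes the .enamlc payload is a CPython marshal stream, in which `c` (0x63) is the type code for a code object and the high bit (0x80, `FLAG_REF` in CPython's `Python/marshal.c`) marks an object as a back-reference candidate. The helper name `describe` is made up for this illustration.

```python
# Assumption: values taken from CPython's Python/marshal.c.
FLAG_REF = 0x80          # high bit set when the object is recorded for back-references
TYPE_CODE = ord("c")     # marshal type code for a code object (0x63)

first_build = 0x63       # byte at 0x28e in the first build  ("c")
second_build = 0xE3      # byte at 0x28e in the second build (".")

def describe(byte):
    """Split a marshal type byte into (type character, FLAG_REF set?)."""
    return chr(byte & 0x7F), bool(byte & FLAG_REF)

# The two builds differ in exactly one bit, and that bit is FLAG_REF.
print(hex(first_build ^ second_build))
print(describe(first_build))
print(describe(second_build))
```

Under that assumption, both builds serialize the same code object and differ only in whether it is flagged for reference tracking, which would match the single bit of entropy.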

The entropy in there seems to be only 1 bit, so two builds end up identical by chance in about 50% of cases. However, it should be easy to force-compile the relevant source 10 times like this:

for i in $(seq 1 10) ; do
   $FORCECOMPILE
   md5sum $ENAMLCFILE
done | sort | uniq -c

If everything were deterministic, there would be just 1 line, showing a count of 10 for a single md5.
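The same counting idea could be sketched in Python. This is only an illustration: `digest_counts` and `build` are hypothetical names, and `build` compiles a small stand-in source with the standard `compile` and `marshal.dumps` rather than going through enaml's own compiler. Repeating the build inside one process is also not guaranteed to show the variation seen between separate package builds.

```python
import hashlib
import marshal
from collections import Counter

def digest_counts(build, runs=10):
    """Call build() `runs` times and count the distinct MD5 digests of its output."""
    return Counter(hashlib.md5(build()).hexdigest() for _ in range(runs))

# Stand-in source; a real check would force-compile the .enaml file instead.
SOURCE = "def make_object():\n    return object()\n"

def build():
    """Hypothetical build step: compile SOURCE and serialize the code object."""
    return marshal.dumps(compile(SOURCE, "<example>", "exec"))

# One output line per distinct digest, like `sort | uniq -c`.
for digest, count in digest_counts(build).items():
    print(count, digest)
```

A single output line with a count of 10 corresponds to the deterministic case; more than one line means the serialized bytes varied between runs.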

See also https://reproducible-builds.org/ for why deterministic program behaviour is good.

bmwiedemann, Feb 07 '20 19:02

I am surprised, because an offset that far into the file would mean that marshal itself is not deterministic. The code generating the cache is rather straightforward (see https://github.com/nucleic/enaml/blob/master/enaml/core/import_hooks.py#L297). Alternatively, the compiled code itself could be unstable, but that would be odd too.

MatthieuDartiailh, Feb 07 '20 20:02

Ah, indeed, I found problems with Python's marshal earlier: https://bugs.python.org/issue34033

bmwiedemann, Feb 08 '20 03:02

I guess that as long as .pyc files are not reproducible, .enamlc files won't be either, since the two formats are extremely close. I will keep this open in the meantime.

MatthieuDartiailh, Feb 08 '20 15:02