gftools icon indicating copy to clipboard operation
gftools copied to clipboard

Make UFO merge more deterministic

Open Hoolean opened this issue 3 years ago • 0 comments

Currently, the output of the merge process is non-deterministic: if the tool is run twice with the same input UFOs, the outputs will differ. This is undesirable, as it produces lengthy and unnecessary UFO diffs, and makes the overall process the merge is part of harder to test.

Cause

This originates when values are represented as sets: the existing order of properties is discarded, and the Python implementation is at liberty to choose whichever order it deems most convenient from run to run.

Fix

While using sets is desirable – they are the most appropriate data structure – we wish to limit the number of orders they can take on when being converted back into lists. This commit converts the sets back into lists through sorted() instead of list(), which makes the order at runtime deterministic.

Caveats

These changes stop short of normalisation: while often desirable, it is outside the scope of this tool, and is best left to ufonormalizer and friends. In contrast, we cannot do these changes upstream in ufonormalizer, as it preserves the order of certain fields that legacy font editors rely on (but which UFO merge already implicitly discards).

In addition, there are further uses of sets left unsorted; it is unclear whether their order at runtime affects output.

Testing

This PR appears to function as expected following limited local testing but would benefit from automated test coverage or exposure to further UFOs : )

Hoolean avatar Aug 18 '22 14:08 Hoolean