ANF Minimization
Hi!
I recently implemented the boolean function minimization algorithm described in the Gröbner Bases for Boolean Function Minimization paper and I recalled I previously stumbled upon Bosphorus, which, to my understanding, can simplify systems of polynomials in GF(2) (basically the same ANF expressions that are being fed to the Gröbner Bases algorithm).
Therefore I generated some random truth-tables, computed the ANF expressions for the positive output rows, generated the ideals being fed to the Gröbner Bases algorithm and tried to minimize them via Bosphorus instead. Although I obtained confusing results (due to my lack of understanding of the project, which I'm treating as a black box at the moment).
If I feed this ANF file as input (./bosphorus --anfread test.anf --anfwrite out.anf):
1 + x0 + x1 + x0 * x1 + x2 + x0 * x2 + x1 * x2 + x0 * x1 * x2 + x3 + x0 * x3 + x1 * x3 + x0 * x1 * x3 + x2 * x3 + x0 * x2 * x3 + x1 * x2 * x3 + x0 * x1 * x2 * x3
x1 + x0 * x1 + x1 * x2 + x0 * x1 * x2 + x1 * x3 + x0 * x1 * x3 + x1 * x2 * x3 + x0 * x1 * x2 * x3
x0 * x1 + x0 * x1 * x2 + x0 * x1 * x3 + x0 * x1 * x2 * x3
x0 * x2 + x0 * x1 * x2 + x0 * x2 * x3 + x0 * x1 * x2 * x3
x1 * x2 + x0 * x1 * x2 + x1 * x2 * x3 + x0 * x1 * x2 * x3
x0 * x3 + x0 * x1 * x3 + x0 * x2 * x3 + x0 * x1 * x2 * x3
x1 * x3 + x0 * x1 * x3 + x1 * x2 * x3 + x0 * x1 * x2 * x3
x2 * x3 + x0 * x2 * x3 + x1 * x2 * x3 + x0 * x1 * x2 * x3
x1 * x2 * x3 + x0 * x1 * x2 * x3
x0 * x1 * x2 * x3
I obtain this simplified output:
c Executed arguments: --anfread test.anf --anfwrite out.anf
x(0)*x(1)*x(2)*x(3) + x(0)*x(1)*x(2) + x(0)*x(1)*x(3) + x(0)*x(1) + x(0)*x(2)*x(3) + x(0)*x(2) + x(0)*x(3) + x(0) + x(1)*x(2)*x(3) + x(1)*x(2) + x(1)*x(3) + x(1) + x(2)*x(3) + x(2) + x(3) + 1
x(0)*x(1)*x(2)*x(3) + x(0)*x(1)*x(2) + x(0)*x(1)*x(3) + x(0)*x(1) + x(1)*x(2)*x(3) + x(1)*x(2) + x(1)*x(3) + x(1)
x(0)*x(1)*x(2)*x(3) + x(0)*x(1)*x(2) + x(0)*x(1)*x(3) + x(0)*x(1)
x(0)*x(1)*x(2)*x(3) + x(0)*x(1)*x(2) + x(0)*x(2)*x(3) + x(0)*x(2)
x(0)*x(1)*x(2)*x(3) + x(0)*x(1)*x(2) + x(1)*x(2)*x(3) + x(1)*x(2)
x(0)*x(1)*x(2)*x(3) + x(0)*x(1)*x(3) + x(0)*x(2)*x(3) + x(0)*x(3)
x(0)*x(1)*x(2)*x(3) + x(0)*x(1)*x(3) + x(1)*x(2)*x(3) + x(1)*x(3)
x(0)*x(1)*x(2)*x(3) + x(0)*x(2)*x(3) + x(1)*x(2)*x(3) + x(2)*x(3)
x(0)*x(1)*x(2)*x(3) + x(1)*x(2)*x(3)
x(0)*x(1)*x(2)*x(3)
x(0) + x(1) + x(2) + x(3) + 1
c -------------
c Fixed values
c -------------
c -------------
c Equivalences
c -------------
c UNSAT : false
If I look at the last polynomial (x(0) + x(1) + x(2) + x(3) + 1) I can see it matches with one of the polynomials part of the Gröbner Bases computed by the algorithm I'm using, although the system is not minimized. To be precise using the DegRevLex ordering I obtain the following minimized system:
x(0) + x(1) + x(2) + x(3) + 1
x(0)*x(1) + x(1)
At this point I think my misunderstanding comes from the fact that "simplification" in Bosphorus in meant as making the input ANF system easier to process for a SAT solver and it's not actually usable (out of the box at least) for a "minimization" of the input boolean expression (derived from the truth table).
My question is: Can Bosphorus be used to achieve the same "minimization" result I'm looking for?
Perhaps, my Boolean simplifier for MiniZinc might be of interest to you: https://github.com/hakank/hakank/issues/20
Greetings,
Axel Kemper
Hi Both,
Ah, nice! Bosphorus is doing only a heuristic "minimization". The goal is not really for the SAT transformation to be smaller, but it does of course make it smaller, too. I am actually really liking Alex's answer here, looks really cool!
Honestly, your example seems very cool and highlights that we could do better... But computing Grobner Basis would be super-expensive, we are only doing heuristic stuff. However, it seems like we are missing some things here... I wonder if we could translate some rewrite rules from SAT and SMT into ANF rewrite rules, thereby improving Bosphorus.... we are probably missing a bunch. If you have ideas for rewrite rules, or some transformation(s) that you think you could write as an addition for Bosphorus, I'd love to review, merge, or even collaborate on it! Computing a Grobner Basis is a bit too heavy-weight, usually, though. But I have a feeling we could go quite far with a number of (smart) rewrite rules/passes...
Let me know what you think,
Mate
PS: I am actually in Saarland right now, at Schloss Dagstuhl :)
PS, now that I think a bit more, I had this brain fart that computing GB is a bit like computing a (set of) minimal skolem function(s) in functional synthesis but lifted to multivariate polynomials over a finite field....
(Intro to functional synthesis: https://people.csail.mit.edu/asolar/SynthesisCourse/Lecture10.htm )
Thanks @a1880 for sharing your boolean simplifier, even though I've never used MiniZinc I'm actually going to allocate some time to understand it and give it a try!
Thanks @msoos for your answer and thinking about a collaboration.
My goal, at the moment at least, is precisely to use the fewer amount of (smart) rules or rewrite passes as possible, hence why my current approach involves an almost generic set of steps:
- Computing the Gröbner Bases (optionally simplifying the input polynomials via Espresso or QMC to reduce the computational cost).
- Greedily factoring the GF(2) polynomials (collapsing the
Xor(And(..), .., And(..))chains). - Applying a set-cover algorithm to remove redundant terms.
- Greedily factoring the full bases (collapsing the
Or(And(..), .., And(..))chain). - Optionally simplifying the output via Espresso or QMC.
Empirically what I observed (with another person working on the same problem) is that:
- The ordering during the Gröbner Bases computation can heavily affect the execution time (with DegRevLex being the fastest so far) and result quality (no surprise here).
- The Gröbner Bases approach can sometimes lead to suboptimal results if compared to just converting the input expression's truth-table to ANF and applying the greedy factorization steps.
- Ad-hoc smart rules/passes would likely get us far (especially at a fraction of the execution time), but it's not something I intend to pursue for my current research topic (hence why I'm testing and extending "generic" existing approaches that are reasonably fast for my needs).
Examples similar to the one I shared are fairly easy to generate and I found some that are entirely unaffected by Bosphorus' heuristics at the moment.
P.S. Since functional synthesis was mentioned, the Drill&Join approach is on topic for the synthesis of boolean functions in ANF form. Although I don't have the knowledge to comment on the Skolem function(s) intuition in the last comment :)
Quick question: Have you tried running Bosphorus before you compute the Grobner basis? It should make GB generation a lot faster I think... GB algorithms tend to use a single algorithm that's super-optimized, but don't preprocess the input ANF. I actually kinda wrote this tool to make ANF tools run faster, by running Bosphorus prior to running the ANF tools. It would be really interesting to me if you could check how Bosphorus affects runtime... my hope/guess is that it should significantly improve it.
Sorry for the delay, I actually spent some time testing this, but it feels like the majority of the systems I feed to Bosphorus are not simplified at all, so this must have something to do with the actual input shape I guess.
As a full example, this is the kind of system I'm feeding to Bosphorus after computing it as explained in the paper (individual ANF positive row contributions) and that I then feed to the Gröbner Bases algorithm to obtain the minimized boolean expression:
Input variables: 3
Input truth-table:
|x0|x1|x1|o|Individual ANF row contribution
|0 |0 |0 |0|
|1 |0 |0 |0|
|0 |1 |0 |0|
|1 |1 |0 |1|(x0 & x1 & x2) ^ (x0 & x1) => x0 * x1 * x2 + x0 * x1
|0 |0 |1 |1|(x0 & x1 & x2) ^ (x0 & x2) ^ (x1 & x2) ^ x2 => x0 * x1 * x2 + x0 * x2 + x1 * x2 + x2
|1 |0 |1 |0|
|0 |1 |1 |0|
|1 |1 |1 |0|
I tested this with increasingly bigger truth-tables and I didn't notice any improvement, the systems just stay the same. On the contrary preprocessing the truth-table via ESPRESSO, as mentioned in the original paper, leads to huge simplifications of the input system due to the introduction of don't care input bits and reduction in the amount of polynomials part of the input system (e.g. with a 15 input variables truth table with half the output bits randomly set to 1 I got a reduction from ~16000 input polynomials to ~4000 input polynomials).
Ah, interesting. Hmmm, I'd have to think about that... I wonder if e.g. https://github.com/berkeley-abc/abc would do waaaay better than espresso here. abc actually integrates espresso, but it does even more. What you are kind of doing is functional synthesis, and so abc would be a good choice. Have you tried?
I know about ABC, but haven't tried to compare it to ESPRESSO yet (my only fear is how complex it might be to integrate ABC's dependencies in my project, although I'll give it a shot as a standalone binary).
If in the future Bosphorus will be updated to handle similar cases, I'll gladly give it a try again. And if not, maybe this conversation will be useful to people looking into similar steps.
Thanks again for the suggestions in the conversation!
Sure, thanks for the interesting insights. It's very true that we are not doing much to do subsumption, i.e. removing redundant polynomials. I wonder if there could be an easy way to do it... For example:
x1+x2+x3*x4+x5*x6+x7 = 0
x1+x2+x3*x4 = 0
....
We could clearly simply to
x5*x6 + x7 = 0
x1+x2+x3*x4 = 0
...
Where the 1st equation allows us to replace x7 with x5*x6 everywhere, thereby eliminating x7 from the equation system altogether, and effectively removing that equation altogether -- we just need to remember that when we have found a solution, x7 needs to be computed from x5 and x6. This is a form of subsumption + bounded variable elimination (from the SAT world) lifted to the polynomials.... Maybe it's worth implementing. Essentially, we are looking for a (combination of) polynomial(s) "knocking out" parts of another polynomial altogether.