Prove-It icon indicating copy to clipboard operation
Prove-It copied to clipboard

Canonical Labeling of Expressions

Open wdcraft01 opened this issue 9 months ago • 1 comments
trafficstars

Related to Issue #327 (Finite Function Order Check), it would be helpful to either modify the Expression.canonically_labeled() method or create a similar special-purpose function to perform canonical relabeling of an expression to facilitate cyclicality testing.

For example, suppose we have the Expression expr defined by:

expr = f( (f, g) ⟼ f(g), (f, g)⟼ g(f) )

consisting of a Function f with two arguments, each argument consisting of a Lambda map. Currently, calling the canonically_labeled() method on expr produces the expression:

expr = f( (_a, _b) ⟼ _a(_b), (_a, _b)⟼ _b(_a) )

where the two sub-expression Lambda map arguments are canonicalized independently (i.e., dummy variables substituted for parameters but without the substitution process for the second Lambda map acknowledging the substitution process using the same dummy variables for the first Lambda map). Such an output then makes cyclicality testing of expr challenging, with the parameters appearing in the Lambda map bodies appearing to operate on each other in an endless cycle.

Instead, at least for the purposes of cyclicality testing, it would be helpful to have the canonicalization process the expression “as a whole,” producing something more like this:

expr = f( (_a, _b) ⟼ _a(_b), (_c, _d)⟼ _d(_c) )

where the canonicalization of the second Lambda map does not reuse the dummy variables used in the canonicalization of the first Lambda map. The resulting expression then easily passes a cyclicality test.

Lambda maps are the only object type to return a potentially different form when canonicalized --- the canonicalization process replaces given Lambda parameters with dummy variables. So it's nice to know that modifications to the canonically_labeled() method can be fairly “localized”. The current method, however, achieves the canonicalization by recursively canonicalizing any sub-expressions, which then creates the “independent” result each time for each sub-expression. If we modify the process to eliminate such independence, does this pose any problems for the system? For example, suppose we separately canonicalize the following Lambda maps:

(x, y) ⟼ x + y and (v, w) ⟼ v + w

We would obtain (_a, _b) ⟼ _a + _b for each one, which also then allows us to see that the two maps are equivalent. If we combine the two maps into a single expression, say as elements of an ExprTuple such as:

( (x, y) ⟼ x + y , (v, w) ⟼ v + w )

and then canonicalize the ExprTuple (with a modified version of the canonicalization method), we would obtain ( (_a, _b) ⟼ _a + _b , (_c, _d) ⟼ _c + _d ), which is what we might want for cyclicality testing, but now can we still recognize that the two Lambda maps are actually equivalent? Does it matter? Is it possible that a separate, more limited canonicalization function might be useful just for cyclicality testing?

wdcraft01 avatar Feb 03 '25 21:02 wdcraft01

As Deepak has pointed out, what I've referred to as a canonical relabeling process is commonly referred to as alpha-conversion or alpha-renaming in the Lambda calculus. It might be wise to make the distinction between (1) canonicalization vs. (2) alpha-conversion, where: (1) By canonicalization, we mean the construction of canonical or standard forms of Lambda maps so that Lambda maps that are alpha equivalent are recognized as such; we use a deterministic form of alpha-conversion to achieve canonicalization; (2) By alpha-conversion or alpha-renaming, we mean the process of renaming bound variables, often with the goal of avoiding variable name conflicts, but also as a frequent goal the identification of Lambda maps as being alpha equivalent.

wdcraft01 avatar Feb 21 '25 17:02 wdcraft01