mathlib icon indicating copy to clipboard operation
mathlib copied to clipboard

computer algebra system style tactics?

Open kim-em opened this issue 5 years ago • 9 comments

@urkud suggested on Zulip that we may want to introduce some CAS style tactics for manipulating algebraic expressions. I think these could be really nice! They would provide an intermediate level between manually rewriting and simplifying expressions and using decision procedures / normal forms like ring.

Some suggestions (in approximate order of difficulty of implementation?)

  • [ ] expand (=simp only [mul_add, add_mul, mul_sub, sub_mul, smul_*]?),
  • [ ] group by powers of x (what Mathematica calls Collect)
  • [ ] cancel a fraction;
  • [ ] together (i.e. put terms over a common denominator) (probably field_simp does this job?);
  • [ ] a version of field_simp that adds goals a ≠ 0 instead of failing to apply a lemma.
  • [ ] apart, the opposite of together, e.g. trying to write 1/(x*(x+1)) as 1/x - 1/(x+1)
  • [ ] factorize (polynomials, multivariables polynomials, ...?) (this requires some new maths, and is rather open ended),

This issue should probably at most serve to argue over the value of these and track interest in implementing particular tactics: serious work on one might deserve is own issue.

kim-em avatar May 18 '20 11:05 kim-em

Pre-empting the criticism, perhaps we should not attempt to re-implement a computer algebra system in Lean! Rob Lewis has already done some great work calling Mathematica directly from Lean, cf https://robertylewis.com/leanmm/.

Rather than implementing any of these tactics, it might be better to think about how to make that work easily usable from mathlib.

kim-em avatar May 18 '20 11:05 kim-em

Do we want something like Gauss elimination for matrices?

jcommelin avatar May 18 '20 11:05 jcommelin

What is factorize? An inverse to expand?

gebner avatar May 18 '20 12:05 gebner

Personally, I think that a mathematica/sage interface is only useful for hard computations that have intelligible certificates. E.g. computing antiderivatives is probably a good example.

For cancel the best certificate is probably the tactic itself. I would probably want to use it as a small step in a tactic script. If I had to include a certificate, I'd rather opt to golf a proof term.

One point that is often raised on Zulip is that if we have a mathematica/sage-interface tactic producing certificates, then all maintainers will have to install the correspond external software, just in order to fix proofs when they break. For complicated calculations, this is IMHO worth it. But for easy proofs that will break more easily (like cancel, etc.), this seems cumbersome.

Another advantage of mathlib tactics is that they work on all structures we have in mathlib: I assume it would be hard to use an off-the-shelf CAS to solve problems in groups with zeros.

gebner avatar May 18 '20 12:05 gebner

What is factorize? An inverse to expand?

I updated the top issue to clarify this is intended for factorising expressions as polynomials, or as multivariable polynomials in all the atoms (relative to some set of operations...).

kim-em avatar May 18 '20 12:05 kim-em

For many of these tactics, e.g. the current PR #2536 for factoring integers, factorize, or the Gaussian elimination tactic @jcommelin mentions above, most of the work actually goes into the non-meta code that describes an algorithm "doing the work". We then just need to agree on a common framework for tactics which introduce the new facts asserted by such an algorithm.

But at least for myself, I don't know what that common framework should look like.

kim-em avatar May 18 '20 13:05 kim-em

I think we absolutely should implement a ("verifying") computer algebra system in Lean. I'm not sure whether it is better to do it now, or wait for Lean 4. I'm also not sure whether this issue is really about something I would think of as a CAS, or just a grab bag of tactics that do various algebraic manipulations. A CAS should have components which can work together in a sensible way. Johan's example of Gaussian elimination is a good one. Imagine you want to use Gaussian elimination to calculate the rank of a matrix defined over a function field k(X). The Gaussian elimination routine needs to know how to how to add and multiply rational functions and how to determine whether a rational function is zero, which in turn needs an equality test for the field k. So we need a way to choose this equality test, and ways for these components to communicate. We also need a way to represent the data involved, and there are multiple choices (e.g., sparse or dense matrices, and sparse or dense polynomials). I would assume that Lean's expr is probably not the correct choice of data representation. However, I've never tried to implement a CAS.

My sense though is that this is not really what this issue is about, which is fine, we just should not confuse these kinds of tactics for a proper system.

One issue with these sort of tactics if they are nonterminal is that they may either produce very non-simplified output, or output which is hard to predict the exact form of. So maybe it's better to use a "suggester" style, where you can write

  have : 1/(x*(x+1)) = _, by apart,

and then the tactic tells you how to fill in the _; and then you should probably replace the tactic by a different one (field_simp here).

rwbarton avatar May 19 '20 04:05 rwbarton

In sympy cancel works with a ratio of multivariate polynomials. Of course, the equality holds only if we cancel a non-zero factor.

urkud avatar May 19 '20 22:05 urkud

Based on this discussion, it looks like Lean 4 may not be an ideal scratch pad for an AI trying to learn math. While some of the features in CAS would help, it may not be certified. If CAS-like simplifications are present in Lean (more like a "verified computer algebra system"), then it can help AI systems trying to learn math.

amit9oct avatar Feb 21 '24 22:02 amit9oct