mathjs icon indicating copy to clipboard operation
mathjs copied to clipboard

refactor: Handle percent parsing by making it a Unit

Open gwhitney opened this issue 2 months ago • 0 comments

This PR represents an alternative to #3573 for solving the persistent issues with parsing % in the mathjs expression language. I think both should be evaluated before either is merged, and of course at most one or the other should be merged, not both.

My take on the characteristics of the two PRs:

#3573

  • minimal alteration to existing system and parsing to hopefully achieve the desired parses
  • several small changes to parser in multiple places
  • includes nearly all cases from #3563 and discussion #2861, although a couple of parses may be questionable but could probably be tweaked further to conform

this PR

  • More significant change; introduces new dimensionless unit 'percent' === '%'
  • Unifies all '%' and 'in' handling in one single place, at parseImplicitMultiplication, resolving #3562 (bad parse for '6 kg in^2') simultaneously; eliminates one level of parsing (parseUnaryPercentage)
  • Eliminates the isPercentage field of OperatorNode altogether, moving it closer to FunctionNode as desired.
  • Includes more (hopefully all) cases from #3563 and #2861, with some preferable parses, I believe, for
    • 10% + 20 % 3 (namely 10 mod +20 mod 3)
    • 20% 3 + 10 % (namely 20 mod 3 plus 10 percent)
    • 50 + (20% + 10%) (namely 50 plus 30 percent = 65)
    • x + (20% + 10%) (namely x plus 30 percent = 1.3x)
  • Brings the JavaScript behavior and the mathjs expression language behavior closer together, in that you can now also execute math.add(10, math.unit(17, 'percent')) to get the same effect as 10 + 17%. Or in other words, the percent symbol (when not used as modulo) is treated as semantically significant, rather than as syntactic sugar. Hence parse trees are more closely aligned with the text input.
  • Allows either the word 'percent' or the symbol '%' in expressions
  • May be viewed as a breaking change, because math.evaluate('10%') returns math.unit(10, 'percent') rather than 0.1, although math.number(math.evaluate('10%')) remains the same. In other words, if you coerce the results of evaluation (that might come out to a percentage) to a number, you get the same value. And the new Unit return value might be considered a feature, in that for a calculator application it might be more natural to see the result of 30% + 25% as 55% rather than as 0.55. This last point seems like a natural way to leverage the power of Unit and the possibility it allows of dimensionless units.
  • Parsing does unfortunately sometimes require a full trial look-ahead parse of an implicit multiplication, that ends up being thrown away. (It's used to reliably determine if you are adding a percentage to a percentage, and both the current code and the alternative solution use a similar trial parse.)
  • Mysteriously modifies the behavior of cube(math.unit(5, 'cm')) by some mechanism I honestly could not track down, but the modified behavior appears to be the correct one, in that the Unit class result of u.pow(3) is explicitly marked as allowing automatic simplification, and the new behavior is precisely the result of automatic simplification (e.g 0 cm^3 simplifying to 0 liter instead of staying 0 cm^3 as it did before, for reasons I can't fathom).

To sum up, this PR is definitely more radical and choosing it would depend on whether some of the above characteristics are seen as positives and/or simplifications that justify the bigger change.

@josdejong will have to determine which direction to pursue. Happy to adjust this PR as desired if it happens to represent the chosen strategy.

Resolves #2861. Resolves #3562. Resolves #3563.

gwhitney avatar Nov 06 '25 03:11 gwhitney