fend Rethink order of operations (multiplication/division)

trafficstars

Consider the three operators:

multiplication *
division /
function application/implicit multiplication/implicit addition (used in sin pi, 3 kg, (2)(4) or 1 1/2).

Current these are all parsed mostly left-to-right, with mostly the same precedence (and a bunch of special cases).

These calculations are correct and should continue working as they do now:

1 3/8 inches
1/2 kg
3pi/2 radians to degrees
-sin (pi/2)
5 feet 2 inches

These examples don't do what the user might expect:

1 meter / 5 seconds (parsed as (1 meter / 5) * seconds)
sin pi/2 (parsed as (sin pi)/2)

This calculation can't be parsed at all:

3sin 5 (should be equivalent to 3 * (sin 5))

It would be nice to come up with a consistent solution for all these expressions, preferably not involving whitespace.

May 20 '21 09:05 printfn

Also note #62

May 20 '21 09:05 printfn

Allowing sin pi/2 and sin pi / 2 feels too ambiguous imo

Sep 20 '23 23:09 probablykasper

Number followed by the unit name should have the highest precedence, followed by *, /, and then +, - operations.

Feb 25 '24 01:02 yurivict

@yurivict That would not work with 3/8 inches; it would cause it to be parsed as 3/(8 inches). I think, though, that if we see 3 mm / 4 seconds, this should cause the behavior you described, i.e., check for a unit immediately before the binary operator, and if it's present, then group units more tightly than the binary operator for whatever comes after the binary operator.

Apr 20 '24 06:04 mattfbacon

The previous approach mostly works well, except it breaks for 3i/2 m, giving m^-1 instead of m because it is grouped as 3i/(2 m) by the rule checking for an Apply before the binop.

I don't think there's any way to resolve this elegantly within the parser, because if we replace i with an actual unit like inches (i.e., 3inches/2 m), then the parser grouping as (3 inches) / (2 m) is actually correct.

I don't know if this means we need to special case certain identifiers like i. Regardless we will need to modify the result formatter to parenthesize the amount in these cases.

I'm glad that the unit tests check the output of the evaluation again, otherwise I wouldn't have caught this!

Apr 20 '24 07:04 mattfbacon

Looking into this, I think there are (at least) two reasonable approaches to address this:

Make parsing/lexing whitespace sensitive. With this approach, the user can express intent using the query itself:
- 3i/2 m would be interpreted as (3 i / 2) m.
- 3inches/2 m would be interpreted as (3 inches / 2) m (length^2)
- 3/2 m would be interpreted as (3 / 2) m.
- 3 / 2m would be interpreted as 3 / (2 m).
- 3 / 2 m, 3 inches / 2 m and 3 i / 2 m are still ambiguous. Most people would assume that they parse to (3 / 2) m, (3 inches) / (2 m) and (3 i / 2) m respectively, but there's no way of differentiating this during parsing.
Make parsing ident sensitive. That is, differentiate whether idents are functions, unitless constants, units, etc. before parsing tokens into an AST. This is similar to @yurivict's proposal.
- 3 i / 2 m would be interpreted as (3 i / 2) m.
- 3 inches / 2 m would be interpreted as (3 inches) / (2 m) (unitless).
- 3 / 2 m would be interpreted as (3 / 2) m.
- 1 m 3 sin 5 cm could be parsed as (1 m) + (3 sin 5) cm.

I think the latter would make the most sense as it addresses the ambiguous cases, but it would slightly complicate parsing with an extra step.

Jun 11 '24 12:06 mlcui-corp

Yeah I think that's a good idea and shouldn't be that hard since the environment should be available at that stage.

Jun 11 '24 19:06 mattfbacon

fend fend copied to clipboard

Rethink order of operations (multiplication/division)

fend
fend copied to clipboard