Cannot always parse a unit deserialized from JSON via the expression parser
From discussion https://github.com/josdejong/mathjs/discussions/3031#discussioncomment-14692393
A unit that was revived from JSON data cannot always be formatted with .toString() and then parsed again by the expression parser.
Example:
const unit1 = math.evaluate('t=9cm * 7in^2')
console.log('unit1', unit1.toString())
// "24.803149606299215 in^3"
const data = JSON.stringify(unit1, math.replacer)
console.log('data', data)
// {"mathjs":"Unit","value":63,"unit":"cm in^2","fixPrefix":false}
const unit2 = JSON.parse(data, math.reviver)
console.log('unit2', unit2.toString())
// "63 cm in^2"
console.log(math.parse(unit2.toString()));
// SyntaxError: Value expected (char 9)
The outcome 63 cm in^2 is technically correct, but the parser cannot parse it: in is interpreted as the unit conversion operator (a in b) instead of the unit inch.
It looks like serializing and reviving loses some of the original information, since after deserialization the .toString() method gives a different outcome.
Maybe we should normalize the unit before serialization, or add additional state ensuring that the unit .toString() gives the same output as the original.
It turns out that the serialized unit loses the property this.skipAutomaticSimplification.
Still, we can address this issue in two ways:
- serialize the property skipAutomaticSimplification
- normalize the unit before serialization
Or we can do both 😄
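To make the first option concrete, here is a toy sketch of the round trip. It does not use mathjs: ToyUnit, serialize, revive and the precomputed simplified string are all hypothetical stand-ins, and the assumption that a revived unit defaults to skipAutomaticSimplification = true is only inferred from the output shown in the example above.

```javascript
// Toy model only: ToyUnit is a hypothetical stand-in, not the mathjs Unit class.
class ToyUnit {
  constructor (value, unit, simplified, skipAutomaticSimplification = true) {
    this.value = value // value expressed in the raw unit
    this.unit = unit // raw unit string, e.g. 'cm in^2'
    this.simplified = simplified // precomputed simplified form (toy shortcut)
    // Assumption: a freshly constructed or revived unit skips simplification
    this.skipAutomaticSimplification = skipAutomaticSimplification
  }

  toString () {
    return this.skipAutomaticSimplification
      ? `${this.value} ${this.unit}`
      : this.simplified
  }

  // includeFlag = false mimics the lossy serialization described above
  serialize ({ includeFlag }) {
    const json = { value: this.value, unit: this.unit, simplified: this.simplified }
    if (includeFlag) {
      json.skipAutomaticSimplification = this.skipAutomaticSimplification
    }
    return JSON.stringify(json)
  }

  static revive (text) {
    const json = JSON.parse(text)
    // A missing flag is undefined here, so the constructor default applies
    return new ToyUnit(json.value, json.unit, json.simplified, json.skipAutomaticSimplification)
  }
}

const original = new ToyUnit(63, 'cm in^2', '24.8 in^3', false)
const lossy = ToyUnit.revive(original.serialize({ includeFlag: false }))
const faithful = ToyUnit.revive(original.serialize({ includeFlag: true }))

console.log(original.toString()) // "24.8 in^3"
console.log(lossy.toString()) // "63 cm in^2"
console.log(faithful.toString()) // "24.8 in^3"
```

In this toy model, carrying the flag along is enough to make .toString() agree before and after the round trip.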
Then, separately we have the issue of a stringified unit not always being parsable. In the case above, the unit can be simplified such that we do not encounter the problem. But the following example is not parsable with the expression parser anyway:
const unit = math.evaluate('2 kg * 3 in^2')
console.log(unit.toString()) // "6 kg in^2"
console.log(math.parse(unit.toString()))
// Uncaught SyntaxError: Value expected (char 8)
What we maybe can do is: when the unit contains in (conflicting with the conversion operator), do not use implicit multiplication but stringify the unit like "6 kg*in^2".
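A minimal sketch of that stringification idea, written as plain string handling rather than as a change to mathjs itself. The helpers joinUnitParts and formatUnit are hypothetical names, and the sketch assumes the unit is already available as a list of parts such as 'kg' and 'in^2' (numerator only, no division).

```javascript
// Hypothetical helpers, not part of mathjs.
// Join unit parts with '*' when one of them is the unit 'in' (optionally with
// an exponent), and with a space (implicit multiplication) otherwise.
function joinUnitParts (parts) {
  const hasInch = parts.some(part => /^in(\^|$)/.test(part))
  return parts.join(hasInch ? '*' : ' ')
}

function formatUnit (value, parts) {
  return `${value} ${joinUnitParts(parts)}`
}

console.log(formatUnit(6, ['kg', 'in^2'])) // "6 kg*in^2"
console.log(formatUnit(6, ['kg', 'm^2'])) // "6 kg m^2"
console.log(formatUnit(2, ['min'])) // "2 min"
```

The regular expression only matches the exact unit name in, so units such as min or inch keep the usual implicit multiplication.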
It turns out that the serialized unit loses the property this.skipAutomaticSimplification. Still, we can address this issue in two ways:
1. serialize the property `skipAutomaticSimplification`
2. normalize the unit before serialization
Yes, I thought something like that was going on. I think therefore at least (1) must be done, and I think that makes (2) moot, doesn't it?
But the following example is not parsable with the expression parser anyway:
const unit = math.parse('6 kg in^2')
Yes this is clearly a bug. Shouldn't it just be that in followed by any operator is never the in operator but instead the in unit? I.e., fix the parser bug rather than work around it by changing the stringification, since one might conceivably directly write something like 6 lb in^2?
Shouldn't it just be that in followed by any operator is never the in operator but instead the in unit?
Yes indeed, I think that would work, that is a good idea. I think there are no other special edge cases with in and units, since units only have a specific notation (a list of units, optionally with an exponent, which are implicitly multiplied).
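The proposed rule could be sketched roughly as follows. This is a standalone toy classifier, not the mathjs parser: tokenize, classifyIn and the OPERATORS set are hypothetical, and treating an in at the very end of the input as the unit is an extra assumption of this sketch.

```javascript
// Hypothetical sketch, not the mathjs parser.
// Tokens that count as "an operator" directly after `in` in this toy model.
const OPERATORS = new Set(['^', '*', '/', '+', '-', ')', ','])

// Very small tokenizer: names, numbers, and single non-space characters.
function tokenize (expr) {
  return expr.match(/[A-Za-z_]\w*|\d+(?:\.\d+)?|\S/g) || []
}

// Label every `in` token as the unit (inch) or as the conversion operator.
function classifyIn (expr) {
  const tokens = tokenize(expr)
  return tokens.map((token, i) => {
    if (token !== 'in') return token
    const next = tokens[i + 1]
    // Assumption: `in` at the end of the input is also the unit
    const isUnit = next === undefined || OPERATORS.has(next)
    return isUnit ? 'in:unit' : 'in:operator'
  })
}

console.log(classifyIn('6 kg in^2'))
// [ '6', 'kg', 'in:unit', '^', '2' ]
console.log(classifyIn('5 cm in inch'))
// [ '5', 'cm', 'in:operator', 'inch' ]
```

Note that this rule alone says nothing useful about several consecutive in tokens; in the sketch, every in that is followed by another in is simply labeled as the operator.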
The serialization issue is addressed via #3572.
I keep this issue open to also look into solving the parsing issue with math.parse('6 kg in^2').
Just going back to my hacky solution, could we not make in the operator configurable? I noticed there's some legacy options that can be explicitly turned on? My understanding is in is kept for legacy reasons right?
I understand not wanting to complicate the parser though...
Apart from solving the in issue in the implementation, I think having in as both a unit and an operator is not ideal, as it makes such expressions not well-defined: in some cases it is not possible to tell which one was intended, e.g.:
1 in in in
is that 1 inch^3 or 1 inch converted to inch?
There is a similar clash with min being both a unit and a function, but that seems fine because it is not an operator, and it should be followed by brackets if the function was intended.
1 in in in
is that 1 inch^3 or 1 inch converted to inch?
Yeah that is a nice demonstration of the ambiguity that the operator in causes right now 😄. In case of doubt: use parentheses.
Just going back to my hacky solution, could we not make in the operator configurable? I noticed there's some legacy options that can be explicitly turned on? My understanding is in is kept for legacy reasons right?
Good idea. We could deprecate the operator in over time (start with a warning that explains to use to instead of in), or put it behind a feature flag and turn it off by default in the future. Of course, you can already choose not to use the operator in yourself to prevent ambiguous situations.
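As a rough illustration of the warning and feature flag ideas, here is a sketch of what such a check might look like. Everything in it is hypothetical: the function isConversionOperator and the option name allowInOperator are made up for this example, and nothing in this thread indicates that mathjs has such an option today.

```javascript
// Hypothetical sketch: `allowInOperator` is an invented option name,
// not an existing mathjs configuration setting.
function isConversionOperator (name, { allowInOperator = true, warn = console.warn } = {}) {
  if (name === 'to') return true
  if (name === 'in' && allowInOperator) {
    // Deprecation path: still accepted, but the user is pointed to `to`
    warn('The conversion operator "in" is deprecated; use "to" instead.')
    return true
  }
  // With the flag turned off, `in` is never an operator (so it can be a unit)
  return false
}

console.log(isConversionOperator('to')) // true
console.log(isConversionOperator('in')) // true, after printing a warning
console.log(isConversionOperator('in', { allowInOperator: false })) // false
```

Flipping the default of the flag later would turn the deprecation into the new behavior without another API change.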
The original serialization issue is fixed now in [email protected] via #3572.
I'll keep this issue open to think through fixes for the parser on handling in.