ModelicaSpecification
Applying Hindley-Milner to unit-checking
This is a fairly complete proposal for unit-checking based on the Hindley-Milner inference algorithm. It corresponds to the implementation in Wolfram System Modeler.
I will have to study it in more detail - and in particular I would want to first consider just the definition of the unit-checking, and then consider the variants for adding units.
For the unit-checking I noticed that several functions were not considered (but could trivially be added - Dymola already supported these checks, except cat and some part of array-handling):
- cross - treated as multiplication
- interval - returns "s"
- hold, sample, subSample, superSample, previous, fill, skew, noEvent, abs, pre, promote - where the unit of the result corresponds to the unit of the first argument (and other arguments are ignored for unit-checking)
- {}, [], array (and sort of scalar, vector, matrix) (assuming we don't scalarize first) - where all arguments must have the same unit and the result then also has this unit; similarly for cat, ignoring its first argument
- smooth - where the unit corresponds to the unit of the second argument (and other arguments are ignored for unit-checking)
- sqrt - should be treated as power with rational exponent 1/2; I also find the general restrictions on the power operator somewhat lacking

And something is needed for min, max, transpose, diagonal, sum, product.
Dymola also treats the different transcendental functions differently regarding "rad" vs. "1"; even though those differences are later ignored.
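The per-function rules listed above are mechanical enough to tabulate. As a sketch (my own illustration, not part of the MCP text), the result-unit rules could look like this, with units as plain strings:

```python
# Hypothetical table of result-unit rules for the builtins discussed above.
# Function names follow the list in the comment; the dispatch logic is mine.
SAME_AS_FIRST_ARG = {"hold", "sample", "subSample", "superSample",
                     "previous", "fill", "skew", "noEvent", "abs", "pre", "promote"}

def result_unit(func, arg_units):
    """Return the result unit for a builtin call, or raise on a unit error."""
    if func in SAME_AS_FIRST_ARG:
        return arg_units[0]              # other arguments ignored for unit-checking
    if func == "interval":
        return "s"                       # interval() returns seconds
    if func in {"array", "{}", "[]"}:
        if len(set(arg_units)) != 1:     # all elements must share one unit
            raise ValueError("array elements must have the same unit")
        return arg_units[0]
    if func == "smooth":
        return arg_units[1]              # unit taken from the second argument
    if func == "cat":                    # first argument (the dimension) ignored
        if len(set(arg_units[1:])) != 1:
            raise ValueError("cat arguments must have the same unit")
        return arg_units[1]
    raise NotImplementedError(func)
```

For example, `result_unit("cat", ["1", "m", "m"])` yields `"m"`, while mixing `"m"` and `"s"` elements raises an error.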
It was a conscious choice not to include an exhaustive list of builtin functions since, as you said, they can be trivially added later if we agree on the base mechanism. Similarly for arrays: previous attempts at standardizing unit-checking have focused on scalar expressions, but as mentioned in the document this can be generalized to element-wise operations on arrays. We are confident that it will work with matrix operations too, but we don't have a proof of concept for that last part.
Three comments on the proposal.
- A number of examples would help explain what it does - which cases are valid and which are not.
- I would like to see any such scheme applied to some real-world libraries (not only MSL), and what the impact would be.
- What does this proposal do that existing tools cannot do?
In general, I think this is the kind of extension that is best formalized through a layered standard, as there is no change for correct models.
As far as I understand there are three parts to this (now that we added fractional powers in the unit-grammar):
1. The unit-constraints (such that a, b, c must have the same unit in c=a+b;, and the more complicated rule for multiplication). They are important, and although fairly clear there are some minor issues to discuss regarding the details.
2. How to handle unknown units and literals.
3. That we solve for the unit of variables one at a time (as I understand it, Hindley-Milner is just a smart way of doing that).
Regarding (1): there are some minor details of which unit-constraints are active that are important (only active equations?; and clearly allowing different units for different instances of the class, corresponding to "let-polymorphism"). Note that these details are important if we are considering storing the deduced units in the model (only possible for the current model, and even then it can be problematic). Having some way of de-activating unit-checking per equation (or model, or package) also seems needed in practice, but that is just a minor detail.
Regarding (2): the handling of literals is the major new change. Originally Dymola's unit-check had literals as total wild-cards, which wasn't good and triggered this MCP, whereas requiring units for every literal would be too much work for users for too small a benefit.
Regarding (3): one alternative to solving for units one variable at a time is to view the unit-constraints as a large (over- and under-constrained) system of equations in rational numbers; another is to have more ad-hoc rules (so that we have one set of rules for checking units and another for reducing - we would then need to ensure consistency etc.). Having some restriction on when to deduce (like only for (global) constants) can still be added to the current idea.
To illustrate the alternative of system of equations consider two problematic equations:
k*der(x)=-x;
cos(y*z)+cos(y/z)=f(time)
Assuming no units are given, the approach of solving for variable-units one at a time cannot find any units.
With unit-constraints as equations we would, for the first equation, have: unit.k + unit.x - [s] = unit.x, which gives unit.k = [s] and no constraint on unit.x.
And for the second equation: unit.y + unit.z = [1] and unit.y - unit.z = [1], which gives unit.y = [1], unit.z = [1].
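Treating units as unknown exponent vectors, these constraints become a linear system over the rationals that decouples per base unit. The following sketch (my own, not part of the proposal) solves the "s"-exponents of k, x, y, z for the two example equations by Gaussian elimination over `Fraction`:

```python
from fractions import Fraction

def solve(rows, n):
    """Gaussian elimination on an augmented matrix [A | b] over Fraction.
    Returns {column: value} only for columns pinned down uniquely;
    under-determined columns are simply absent from the result."""
    rows = [list(map(Fraction, r)) for r in rows]
    pivots, r = {}, 0
    for c in range(n):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue                      # free column, no pivot
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots[c] = r
        r += 1
    return {c: rows[i][n] for c, i in pivots.items()
            if all(rows[i][j] == 0 for j in range(n) if j != c)}

# Unknowns: "s"-exponents of k, x, y, z (columns 0..3).
# k*der(x) = -x            =>  unit.k + (unit.x - 1) = unit.x  =>  k = 1
# cos(y*z)+cos(y/z)=f(time) =>  y + z = 0  and  y - z = 0
solution = solve([[1, 0, 0,  0, 1],
                  [0, 0, 1,  1, 0],
                  [0, 0, 1, -1, 0]], 4)
# k is pinned to exponent 1 (unit "s"); x stays free; y and z get 0 (unit "1").
```

This reproduces the result above: unit.k = [s] with unit.x unconstrained, and unit.y = unit.z = [1].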
Obviously this alternative would be more powerful, but:
- I'm not sure exactly how to efficiently implement the equation solving (well, just use a sparse solver...), especially considering the next item:
- Diagnostics will be harder to understand when it works; instead of saying that we got the unit of x from expression x...y where y had unit ... we now get units for multiple variables from multiple expressions, making it harder to understand
- Diagnostics when it fails will be harder still (will be discussed later)
- I don't know if there are realistic use-cases where it will give a significant gain. The goal isn't that people write large models with 100+ variables and equations and then add the unit for one variable and everything is magically handled, but more that perhaps 80% of variables have unit and the algorithm checks for mistakes and fills in the rest.
Regarding diagnostics when it fails, I would rather say it is a matter of finding the possibly problematic equations, and in my experience analyzing that even for a handful of equations takes care - and domain knowledge; whereas this other method seems designed for a large number of equations.
For the power operator I would just split it based on the type of the exponent:
- Integer: Integer-power even if not evaluated
- Real: Treat the same way as exp(ln(x)*y); so both must have unit="1".
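The split just described can be sketched as follows, with units as dicts mapping a base unit to a rational exponent (an assumed representation; the rules are my reading of the comment, not normative text):

```python
from fractions import Fraction

DIMENSIONLESS = {}   # unit "1": empty exponent vector

def unit_of_power(base_unit, exponent, exponent_is_integer):
    """base_unit: dict base-unit -> Fraction, e.g. {"m": Fraction(1)} for "m"."""
    if exponent_is_integer:
        # integer power: scale all exponents of the base's unit
        return {b: e * exponent for b, e in base_unit.items()}
    # Real exponent: treated like exp(ln(x)*y), so the base must be unit "1"
    if base_unit != DIMENSIONLESS:
        raise ValueError('Real exponent requires base with unit "1"')
    return DIMENSIONLESS

def unit_of_sqrt(base_unit):
    # sqrt treated as power with rational exponent 1/2
    return {b: e * Fraction(1, 2) for b, e in base_unit.items()}
```

For instance, `unit_of_sqrt({"m": Fraction(2)})` gives `{"m": Fraction(1)}`, i.e. sqrt of "m2" is "m", which is exactly why fractional powers are needed in the unit-grammar.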
What would be the unit of y in the following models then?
model Test
  Real x(unit = "m");
  Integer i;
  Real y = x^i;
initial equation
  i = 2;
equation
  when (time > 0.4) then
    i = 3;
  end when;
end Test;

model Test2
  Real x(unit = "m");
  parameter Integer i = 1;
  Real y = x^i;
end Test2;
For context: in the current state of the proposal we would consider both models invalid, because x doesn't have unit "1".
@qlambert-pro I would consider your example an ill-conditioned model without physical meaning and rewrite it:
model Test
  Real x(unit = "m");
  constant Real x0(unit = "m") = 1;
  Integer i;
  Real y = (x/x0)^i;
initial equation
  i = 2;
equation
  when (time > 0.4) then
    i = 3;
  end when;
end Test;
I can see this - and it would be similar to figuring out the unit for Real z=if time>=0.4 then x else x*x;
This all probably makes little sense physically, but for cybernetic model parts I wonder how to easily look up a magnitude with implicit unit = "1"? It looks cumbersome to have to use some constant with identical units to make expressions dimensionless quickly, if you know what you are doing.
@gwr69 I'm an engineer; to me it's not cumbersome. I prepare my equations from a textbook or from my own research before I start modeling. As long as equations do not match in units they're bad and not fit for modeling. In some cases - even from a physical point of view - you have to use dimensionless variables, i.e. x/x_ref. In many cases x_ref has a meaningful value different from 1.
@AHaumer I always envy the engineers and how they use 7 base units for everything. Coming from economics and business modeling, I would certainly like to apply the rigorous procedures and unit checking that you describe - and with which I wholeheartedly agree - to more abstract fields than engineering. Unfortunately, there are no monetary units in Modelica, and units of time typically end with "day", ignoring that even in scientific realms years are frequently used, even though they are typically variable units of time.
Of course, modeling in those abstract and "soft" realms will appear simple if you only allow yourself the use of 1 for counting. But how to build models for predator-prey dynamics, where the nonlinear terms call for dimensionless rates per predator or per prey, and give them proper units different from 1, as there may quickly be equation errors if you neglect those differences? This is what I meant by "cumbersome"; sometimes one may need to strike a careful balance.
I wonder how to easily look up a magnitude with implicit unit = "1"
This proposal helpfully provides an operator just for that purpose: withoutUnit(expr, "unit"). It outputs the magnitude of expr after conversion from its unit to "unit". This still provides you with unit-safety, but allows you to slip into a unit-less world when you need it.
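To illustrate the intent of withoutUnit (a toy sketch with a made-up conversion table; the actual operator semantics are defined by the proposal, not here):

```python
# Hypothetical conversion factors to a base unit; for illustration only.
TO_BASE = {"m": 1.0, "km": 1000.0, "s": 1.0, "h": 3600.0}

def without_unit(value, value_unit, target_unit):
    """Return the magnitude of value after conversion from value_unit to
    target_unit; the result is a plain number carrying unit "1"."""
    if value_unit not in TO_BASE or target_unit not in TO_BASE:
        raise ValueError("unknown unit")
    return value * TO_BASE[value_unit] / TO_BASE[target_unit]

# e.g. a length of 1.5 km viewed as a magnitude in metres:
mag = without_unit(1.5, "km", "m")   # 1500.0
```

The point is that the conversion step keeps the operation unit-safe: you state which unit you expect, instead of silently discarding the unit.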
I can see this - and it would be similar to figuring out the unit for
Real z = if time >= 0.4 then x else x*x;
This proposal treats it in a similar way: it requires both sides of an if whose condition hasn't been evaluated to have the same unit.
I find it sub-optimal if a particular algorithm ("Hindley-Milner") is required in the specification: to my knowledge, a concrete algorithm was never required in the Modelica Specification, simply because there might be different algorithms with pros and cons, and the tool vendor should decide which algorithm to use.
Please, add examples to the proposed specification, as already mentioned.
If better unit-checking is introduced, it must be introduced in a way that existing models can still be compiled and simulated; otherwise people will be very upset. From a tool perspective, there should be an option to either ignore unit checking, give a warning, or give an error (with a warning as the default). From my experience with Julia this is really essential, because very strict unit checking (as in Julia) can be a complete nightmare for a modeler who just wants to quickly produce a result and does not have time to search for a unit issue. The question is how to define this in the specification, so that unit checking is optional.
I think the answer is simple from your analysis: use only the units for one variable at a time, because of the simpler and better diagnostics. If the user is not satisfied with unit propagation, they just have to add more units in some places. It's much more critical if it is hard to figure out why a tool has selected a unit based on the analysis of many equations.
I find it sub-optimal if a particular algorithm ("Hindley-Milner") is required in the specification: to my knowledge, a concrete algorithm was never required in the Modelica Specification, simply because there might be different algorithms with pros and cons, and the tool vendor should decide which algorithm to use.
Making the Hindley-Milner algorithm a requirement of the specification was never our intention, but it relates to a state-of-the-art type system which shares a lot of properties with the way we want units to work in Modelica. We felt the need to spell it out in this document for two reasons:
- To demystify the way it works, and to allow us to talk about constraints while being clear about how these constraints lead to unit inference and checking
- Because the type system in its usual form effectively has to be impaired to allow models without units (e.g. HelloWorld) to still work, and this comes with notable differences.
But consider the following alternative (not efficient):
- Do:
  - Go through all unit-constraints in some order (using the same rules to deduce units from individual equations/expressions). If we can uniquely deduce the unit of one variable (or one expression) based on one equation/sub-expression, do that.
- While (some progress)

Will this give the same result? (Except for error diagnostics when it fails.)
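The do/while loop above can be sketched in a few lines (my own reading, with a deliberately minimal constraint form - only "these two things have the same unit"):

```python
def propagate(units, constraints):
    """units: dict var -> unit string, or None if unknown.
    constraints: list of ("eq", a, b) meaning unit(a) == unit(b)."""
    progress = True
    while progress:                        # While (some progress)
        progress = False
        for kind, a, b in constraints:     # go through all unit-constraints
            if kind != "eq":
                continue
            ua, ub = units.get(a), units.get(b)
            if ua is not None and ub is None:
                units[b] = ua; progress = True   # uniquely deduced: assign
            elif ub is not None and ua is None:
                units[a] = ub; progress = True
            elif ua is not None and ub is not None and ua != ub:
                raise ValueError(f"unit clash between {a} and {b}: {ua} vs {ub}")
    return units

# c = a + b forces a, b, c to share one unit; giving a a unit fills in the rest.
units = propagate({"a": "m", "b": None, "c": None},
                  [("eq", "a", "b"), ("eq", "b", "c")])
```

Like the loop in the comment, this only ever assigns a unit when a single constraint pins it down, so k in k*der(x)=-x with no declared units stays unknown.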
If better unit-checking is introduced,
There is no such thing as unit-checking in the specs. This is literally what this MCP is about.
I think the answer is simple from your analysis: Use only the units for one variable at a time, because simpler and better diagnostics. If the user is not satisfied with unit propagation, he just has to add more units at some places. Its much more critical if it is hard to figure out why a tool has selected a unit based on the analysis of many equations
One could also write: if the user is not satisfied with unit diagnostics, they just have to add more units in some places. It's much more critical if it is hard to know in what unit the source of a signal is expressed, as it can lead to silent numerical errors.
Will this give the same result? (Except for error diagnostics when it fails.)
I think the place it may differ is for models like HelloWorld, where a sensible unit-system would find an error but the Modelica community wants the model to be valid. We have found this to add complexity, which justified spelling the algorithm out.
I don't see how it would differ (I assume HelloWorld means a simple k*der(x)=-x; model where we don't derive the unit for k?). I understand that solving it as a system of equations would give additional units, but I don't see where this simple algorithm would differ from Hindley-Milner - and I believe we need to understand this.
Great example! Make it der(x) = k * x for the economists, and all of a sudden nobody would want to enter a magnitude matching 1/s - and entering the value from your broker/bank will get you in numerical trouble soon. ;-)
What took getting used to for me (the BusinessSimulation as of v2.2.0-wsm passes all unit checks in WSM without abuse of "fudge factors", e.g. 1[“”]) is that I would expect that users know what they enter (input and parameters and constants), but they can only have expectations regarding the results (output), as there may be equation errors hidden in a model. Setting units for an output connector is helpful, as it makes unit checks fail earlier. At the same time it is also confusing, as the unit values are not seen as expectations but as definite values to "solve unit equations". Finding the location of a unit error still is not easy imho.
I don't see how it would differ (I assume HelloWorld means a simple
k*der(x)=-x;
model where we don't derive unit for k?). I understand that solving it as a system of equations would give additional units, but I don't see where this simple algorithm would differ from Hindley-Milner - and I believe we need to understand this.
So I don't think they would necessarily differ from Hindley-Milner. Although the details contained in "If we can uniquely deduce the unit for one variable" can mean that one tool is able to infer something that another isn't when there is an inconsistency in the way a variable is used. However, the version that is presented here supports the following model, for instance:
model HelloWorld
  Real x(start = 1);
equation
  der(x) = -x;
end HelloWorld;
on the basis that if a user doesn't set units for their variables, they probably aren't interested in getting unit diagnostics. It was my understanding that people like Martin would probably like that to be the case. But if I misunderstood, I am more than happy to consider it an error. It will simplify the proposal a lot.
Finding the location of a unit error still is not easy imho.
I think that this is a tool issue; on our side we are thinking both of ways to make the derivation more intuitive and of showing the user how we reached our conclusions.
So I don't think they would necessarily differ from Hindley-Milner. Although the details contained in "If we can uniquely deduce the unit for one variable" can mean that one tool is able to infer something that another isn't when there is an inconsistency in the way a variable is used.
If we can agree on the unit-constraints I don't see how that could occur.
On the other hand, the problem with describing unit-inference in terms of Hindley-Milner (or Algorithm W - or some other letter) is that it doesn't fully answer all questions, in particular:
- That algorithm is described in terms of type inference in functional languages, so one has to describe a mapping between units and types in functional languages. There are also basic parts of the algorithm that seem needlessly complex for unit-inference, e.g. unify.
- The consequences of unit-inference based on this algorithm don't seem clear. As far as I understand, it took more than 10 years to prove that the algorithm actually generates the optimal result (in terms of extracting all type information) for the type-inference case. Assuming that it generates the result I think it does for unit-inference, we already know that the result isn't "optimal" (in the sense of extracting all unit information; but as previously explained I don't think the "optimal" result is good for Modelica models).

As far as I understand, the actual algorithm is shorter than the text above.
model HelloWorld
  Real x(start = 1);
equation
  der(x) = -x;
end HelloWorld;
As far as I understand, if we extend this model with something that gives a unit to x (like Real y = sin(x*time);), then the equation will generate a unit-error.
Can someone please educate me on why adding Real y = sin(x * time) is "something that gives a unit to x"? In my (admittedly imperfect) understanding of Modelica, x will have unit = "" and thus units may be derived by a tool, or not? Also y has unit = "", and thus I believe that we have one relevant equation with regard to units and two unknowns, making this underspecified, or not?
Can someone please educate me on why adding Real y = sin(x * time) is "something that gives a unit to x"?
We are operating on the assumptions that time carries unit "s" and that the expression given as argument to sin is expected to have unit "1", which would mean that we deduce x to have unit "1/s".
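The deduction is simple exponent arithmetic. A sketch (my own representation, with units as dicts mapping a base unit to a rational exponent): since unit(x) + unit(time) must equal "1", unit(x) is the negation of unit(time).

```python
from fractions import Fraction

def mul(u, v):
    """Unit of a product: add exponent vectors, dropping zero exponents."""
    keys = set(u) | set(v)
    return {k: u.get(k, 0) + v.get(k, 0)
            for k in keys if u.get(k, 0) + v.get(k, 0) != 0}

def solve_factor(product_unit, known_factor):
    """Given unit(a*b) and unit(b), recover unit(a) component-wise."""
    neg = {k: -e for k, e in known_factor.items()}
    return mul(product_unit, neg)

# sin's argument must be dimensionless ({}), and unit(time) is "s":
x_unit = solve_factor({}, {"s": Fraction(1)})
# x_unit == {"s": Fraction(-1)}, i.e. unit "1/s"
```

With the literal rule discussed below, an extra factor 1 (unit "1") would leave this deduction unchanged.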
Yes, that is the unit equation I missed. Thank you. So it would likely only be unclear if we did Real y = sin(1 * x * time);, as the literal 1 will have unit = "" as well?
To clarify: time carrying unit "s" is already stated in https://specification.modelica.org/master/operators-and-expressions.html#built-in-intrinsic-operators-with-function-syntax. That the input given to sin has unit "1" is defined in this PR (transcendental functions), and the unit rule for multiplication is also given in this proposal.
No: with this MCP, and this proposal in particular, literals in multiplicative operations would behave as if they had unit "1", to avoid them preventing any unit-checking.