ModelicaSpecification icon indicating copy to clipboard operation
ModelicaSpecification copied to clipboard

Applying Hindley-Milner to unit-checking

Open qlambert-pro opened this issue 11 months ago • 55 comments

This is fairly complete proposal for unit-checking based on the Hindley-Milner inference algorithm. This corresponds to the implementation in Wolfram System Modeler.

qlambert-pro avatar Mar 19 '24 15:03 qlambert-pro

I will have to study it in more detail - and in particular would want to first consider just the definition of the unit-checking and then consider the variants for adding units.

For the unit-checking I noticed that several functions were not considered (but could trivially be added - Dymola already supported these checks except cat - and some part of array-handling):

  • cross - treated as multiplication
  • interval - return "s"
  • hold, sample, subSample, superSample, previous, fill, skew, noEvent, abs, pre, promote where the unit of the result corresponds to the unit of the first argument (and other arguments are ignored for unit-checking)
  • {}, [], array (and sort of scalar, vector, matrix) (assuming we don't scalarize first) where all arguments must have the same unit and the result then also has this unit
  • similarly for cat ignoring first argument
  • smooth where the unit corresponds to the unit of the second argument (and other arguments are ignored for unit-checking)
  • sqrt should be treated as power with rational exponent 1/2; I also find the general restrictions on the power operator somewhat lacking

And something for min, max, transpose, diagonal, sum, product.

Dymola also treats the different transcendental functions differently regarding "rad" vs. "1"; even though those differences are later ignored.

HansOlsson avatar Mar 20 '24 10:03 HansOlsson

It was a conscious choice not to include an exhaustive list of builtin functions as, as you said, they can be trivially added later if we agree on the base mechanism. Similarly, for arrays, previous attempt at standardizing unit-checking have focus on scalar expressions, but as mentioned in the document it can be generalized to element-wise operations on arrays. And we are confident that it will work with matrix operations too, but we don't have a proof of concept for that last part.

qlambert-pro avatar Mar 20 '24 10:03 qlambert-pro

Three comments on the proposal.

  • A number of examples would help explain a lot what it does, what are valid cases and what are not.
  • I would like to see any such scheme applied to some real-world libraries (not only MSL), and what the impact would be.
  • What does this proposal do that existing tools cannot do?

In general, I think this is the kind of extension that is best formalized through a layered standard, as there is not change for correct models.

DagBruck avatar Mar 21 '24 06:03 DagBruck

As far as I understand there are three parts to this (now that we added fractional powers in the unit-grammar):

  1. The unit-constraints (such that a,b,c must have the same unit in c=a+b;; and the more complicated rule for multiplication). They are important, and although fairly clear there are some minor issues to discuss regarding the details.
  2. How to handle unknown units and literals.
  3. That we solve for the unit of variables one at a time (as a I understand Hindley-Milner is just a smart way of doing that).

Regarding (1): Of which unit-constraints are active. There are some minor details of which unit-constraints are active that are important (only active equations?, and clearly allowing different units for different instances of the class corresponding to "let-polymorphism"). Note that these details are important if we are considering storing the deduced units in the model (only possible for the current model, and even then it can be problematic). Having some way of de-activating unit-checking per equation (or model, package) also seems needed in practice, but is just a minor detail.

Regarding (2): The handling of literals is the major new change; and originally Dymola's unit-check had literals as total wild-cards which wasn't good triggering this MCP, whereas requiring units for every literal would be too much work for users for too small a benefit.

Regarding (3): One alternative for solving for units one variable at a time is to view the unit-constraints as a large (over and under-constrained) system of equations in rational numbers; another is to have more ad-hoc rules (so that we have one set of rules for checking units and another for reducing - we would then need to ensure consistency etc). Having some restriction on when to deduce (like for (global) constants can still be added to the current idea.

To illustrate the alternative of system of equations consider two problematic equations:

k*der(x)=-x;
cos(y*z)+cos(y/z)=f(time)

Assuming no units are given the approach with solving for variable-units one at a time cannot find any units.

With unit-constraints as equations we would for the first equation have: unit.k+unit.x-[s]=unit.x which gives: unit.k=[s], and no constraint for unit.x

And for the second equation: unit.y+unit.z=[1] unit.y-unit.z=[1] which gives: unit.y=[1], unit.z=[1].

Obviously this alternative would be more powerful, but:

  • I'm not sure exactly how to efficiently implement the equation solving (well, just use a sparse solver...), especially considering the next item:
  • Diagnostics will be harder to understand when it works; instead of saying that we got the unit of x from expression x...y where y had unit ... we now get units for multiple variables from multiple expressions, making it harder to understand
  • Diagnostics when it fails will be harder still (will be discussed later)
  • I don't know if there are realistic use-cases where it will give a significant gain. The goal isn't that people write large models with 100+ variables and equations and then add the unit for one variable and everything is magically handled, but more that perhaps 80% of variables have unit and the algorithm checks for mistakes and fills in the rest.

Regarding diagnostics when it fails I would more say it is a matter of finding the possible problematic equations, and in my experience analyzing that even for a handful of equations takes care - and domain knowledge; whereas this other method seem designed for a large number of equations.

HansOlsson avatar Mar 21 '24 08:03 HansOlsson

For the power operator I would just split it based on the type of the exponent:

  • Integer: Integer-power even if not evaluated
  • Real: Treat the same way as: exp(ln(x)*y); - so both must have unit="1".

HansOlsson avatar Mar 21 '24 10:03 HansOlsson

For the power operator I would just split it based on the type of the exponent:

* Integer: Integer-power even if not evaluated

What would be the unit of y in the following models then?

model Test
  Real x(unit = "m");
  Integer i;
  Real y = x^i;
initial equation
  i = 2;
equation
  when (time>0.4) then
    i = 3;
  end when;
end Test;

model Test2
  Real x(unit = "m");
  parameter Integer i = 1;
  Real y = x^i;
end Test2;

For context in the current state of the proposal, we would consider both models invalid because x doesn't have unit "1".

qlambert-pro avatar Mar 21 '24 11:03 qlambert-pro

@qlambert-pro I would consider your example an ill-conditioned model without physical meaning and rewrite it:

model Test
  Real x(unit = "m");
  constant Real x0(unit = "m")=1;
  Integer i;
  Real y = (x/x0)^i;
initial equation
  i = 2;
equation
  when (time>0.4) then
    i = 3;
  end when;
end Test;

AHaumer avatar Mar 21 '24 14:03 AHaumer

@qlambert-pro I would consider your example an ill-conditioned model without physical meaning and rewrite it:

model Test
  Real x(unit = "m");
  constant Real x0(unit = "m")=1;
  Integer i;
  Real y = (x/x0)^i;
initial equation
  i = 2;
equation
  when (time>0.4) then
    i = 3;
  end when;
end Test;

I can see this - and it would be similar to figuring out the unit for Real z=if time>=0.4 then x else x*x;

HansOlsson avatar Mar 21 '24 16:03 HansOlsson

@qlambert-pro I would consider your example an ill-conditioned model without physical meaning and rewrite it:

model Test
  Real x(unit = "m");
  constant Real x0(unit = "m")=1;
  Integer i;
  Real y = (x/x0)^i;
initial equation
  i = 2;
equation
  when (time>0.4) then
    i = 3;
  end when;
end Test;

I can see this - and it would be similar to figuring out the unit for Real z=if time>=0.4 then x else x*x;

This all makes probably little sense physically, but for cybernetic model parts I wonder how to easily lookup a magnitude with implicit unit = "1"? It looks cumbersome to have to use some constant with identical units to make expressions dimensionsless quickly, if you know what you are doing.

gwr69 avatar Mar 21 '24 17:03 gwr69

@gwr69 I'm an engineer, to me it's not cumbersome. I prepare my equations from textbook or from my own research before I start modeling. As long as equations are not meeting in units they're bad and not fit for modeling. In some cases - even from a physical point of view - you have to use dimensionless variables, i.e x/x_ref. In many cases x_ref has a meaningful value different from1.

AHaumer avatar Mar 21 '24 17:03 AHaumer

@gwr69 I'm an engineer, to me it's not cumbersome. I prepare my equations from textbook or from my own research before I start modeling. As long as equations are not meeting in units they're bad and not fit for modeling. In some cases - even from a physical point of view - you have to use dimensionless variables, i.e x/x_ref. In many cases x_ref has a meaningful value different from1.

@AHaumer I always envy the engineers and how they use 7 base units for everything. Coming from economics and business modeling, I would certainly like to apply the rigorous procedures and unit checking that you describe—and to which I whole heartedly agree—to more abstract fields than engineering. Unfortunately, there are no monetary units in Modelica and units of time typically end with the "day" ignoring that even in scientific realms years are frequently used, even though they are typically variable units of time.

Of course, modeling in those abstract and "soft" realms will appear simple, if you only allow yourself the use of 1 for counting. But how to build models for predator-prey dynamics where the nonlinear terms call for dimensionless rates per predator or per prey and give them proper units different from 1 as there may quickly be equation errors if you neglect those differences? This is what I meant with "cumbersome" and sometimes one may need to strike a careful balance.

gwr69 avatar Mar 21 '24 18:03 gwr69

I wonder how to easily lookup a magnitude with implicit unit = "1"

This proposal helpfully provides an operator just for that purpose: withoutUnit(expr, "unit") It outputs the magnitude of expr after conversion from its unit to "unit". Which still provides you with unit-safety, but allows you to slip in a unit less world when you need it.

qlambert-pro avatar Mar 22 '24 06:03 qlambert-pro

I can see this - and it would be similar to figuring out the unit for Real z=if time>=0.4 then x else x*x;

Which this proposal treat a similar way, it requires both side of an if whose condition hasn't been evaluated to have the same unit.

qlambert-pro avatar Mar 22 '24 06:03 qlambert-pro

I find it sub-optimal if a particular algorithm ("Hindley Midler") is required in the specification: To my knowledge, a concrete algorithm was never required in the Modelica Specification, simply because there might be different algorithms with pros and cons and the tool vendor should decide which algorithm to use.

Please, add examples to the proposed specification, as already mentioned.

If better unit-checking is introduced, it must be introduced in a way, that existing models can still be compiled and simulated, otherwise, people will be very upset. From a tool perspective, there should be an option to either ignore unit checking, give a warning or an error (giving a warning as a default). From my experience with Julia this is really essential, because very strict unit checking (as in Julia) can be a complete nightmare for a modeler who just wants to quickly produce a result and does not have time to search for a unit issue. The question is, how to define this in the specification, in order that unit checking is optional.

MartinOtter avatar Mar 22 '24 08:03 MartinOtter

As far as I understand there are three parts to this (now that we added fractional powers in the unit-grammar):

1. The unit-constraints (such that a,b,c must have the same unit in `c=a+b;`; and the more complicated rule for multiplication). They are important, and although fairly clear there are some minor issues to discuss regarding the details.

2. How to handle unknown units and literals.

3. That we solve for the unit of variables one at a time (as a I understand Hindley-Milner is just a smart way of doing that).

... Regarding (3): One alternative for solving for units one variable at a time is to view the unit-constraints as a large (over and under-constrained) system of equations in rational numbers; another is to have more ad-hoc rules (so that we have one set of rules for checking units and another for reducing - we would then need to ensure consistency etc). Having some restriction on when to deduce (like for (global) constants can still be added to the current idea.

To illustrate the alternative of system of equations consider two problematic equations:

k*der(x)=-x;
cos(y*z)+cos(y/z)=f(time)

Assuming no units are given the approach with solving for variable-units one at a time cannot find any units.

With unit-constraints as equations we would for the first equation have: unit.k+unit.x-[s]=unit.x which gives: unit.k=[s], and no constraint for unit.x

And for the second equation: unit.y+unit.z=[1] unit.y-unit.z=[1] which gives: unit.y=[1], unit.z=[1].

Obviously this alternative would be more powerful, but:

* I'm not sure exactly how to efficiently implement the equation solving (well, just use a sparse solver...), especially considering the next item:

* Diagnostics will be harder to understand when it works; instead of saying that we got the unit of x from expression x...y where y had unit ... we now get units for multiple variables from multiple expressions, making it harder to understand

* Diagnostics when it fails will be harder still (will be discussed later)

* I don't know if there are realistic use-cases where it will give a significant gain. The goal isn't that people write large models with 100+ variables and equations and then add the unit for one variable and everything is magically handled, but more that perhaps 80% of variables have unit and the algorithm checks for mistakes and fills in the rest.

Regarding diagnostics when it fails I would more say it is a matter of finding the possible problematic equations, and in my experience analyzing that even for a handful of equations takes care - and domain knowledge; whereas this other method seem designed for a large number of equations.

I think the answer is simple from your analysis: Use only the units for one variable at a time, because simpler and better diagnostics. If the user is not satisfied with unit propagation, he just has to add more units at some places. Its much more critical if it is hard to figure out why a tool has selected a unit based on the analysis of many equations

MartinOtter avatar Mar 22 '24 08:03 MartinOtter

I find it sub-optimal if a particular algorithm ("Hindley Midler") is required in the specification: To my knowledge, a concrete algorithm was never required in the Modelica Specification, simply because there might be different algorithms with pros and cons and the tool vendor should decide which algorithm to use.

Making the Hindley-Milner algorithm a requirement of the specification was never our intention, but it relates to a state of the art type-system, which shares a lot of properties with the way we want unit to work in Modelica. We felt the need to spell it out in this document for two reasons:

  • To demystify the way it works and to allow us to talk about constraint while being clear about this constraints lead to unit inference and checking
  • Because the type system in its usual form is, effectively, impaired to allow models without units (e.g. HelloWorld) to still work, and this comes with notable difference.

qlambert-pro avatar Mar 22 '24 08:03 qlambert-pro

I find it sub-optimal if a particular algorithm ("Hindley Midler") is required in the specification: To my knowledge, a concrete algorithm was never required in the Modelica Specification, simply because there might be different algorithms with pros and cons and the tool vendor should decide which algorithm to use.

Making the Hindley-Milner algorithm a requirement of the specification, but it relates to a state of the art type-system, which share a lot of properties with the way we want unit to work in Modelica. We felt the need to spell it out in this document for two reasons:

  • To demystify the way it works and to allow us to talk about constraint while being clear about this constraints lead to unit inference and checking
  • Because the type system in its usual form is, effectively, impaired to allow models without units (e.g. HelloWorld) to still work, and this comes with notable difference.

But consider the following alternative (not efficient):

  • Do
    • Go through all unit-constraints in some order (using the same rules to deduce units from individual equations/expressions). If we can uniquely deduce the unit for one variable (or one expression) based on one equation/sub-expression do that.
  • While (some progress)

Will this give the same result? (Except for error diagnostics when it fails.)

HansOlsson avatar Mar 22 '24 08:03 HansOlsson

If better unit-checking is introduced,

There is no such thing as unit-checking in the specs. This is literally what this MCP is about.

qlambert-pro avatar Mar 22 '24 09:03 qlambert-pro

I think the answer is simple from your analysis: Use only the units for one variable at a time, because simpler and better diagnostics. If the user is not satisfied with unit propagation, he just has to add more units at some places. Its much more critical if it is hard to figure out why a tool has selected a unit based on the analysis of many equations

One could also write: If the user is not satisfied with unit diagnostics, they just have to add more units at some places. Its much more critical if it is hard to know in what unit the source of a signal is expressed, as it can lead to silent numerical errors.

qlambert-pro avatar Mar 22 '24 09:03 qlambert-pro

Will this give the same result? (Except for error diagnostics when it fails.)

I think the place it may differ is for models like HelloWorld. Where, a sensible unit-system would find an error, but the Modelica community wants it to be valid. Which we have found to add complexity, and justified us spelling the algorithm out.

qlambert-pro avatar Mar 22 '24 09:03 qlambert-pro

Will this give the same result? (Except for error diagnostics when it fails.)

I think the place it may differ is for models like HelloWorld. Where, a sensible unit-system would find an error, but the Modelica community wants it to be valid. Which we have found to add complexity, and justified us spelling the algorithm out.

I don't see how it would differ (I assume HelloWorld means a simple k*der(x)=-x; model where we don't derive unit for k?). I understand that solving it as a system of equations would give additional units, but I don't see where this simple algorithm would differ from Hindley-Milner - and I believe we need to understand this.

HansOlsson avatar Mar 22 '24 09:03 HansOlsson

Will this give the same result? (Except for error diagnostics when it fails.)

I think the place it may differ is for models like HelloWorld. Where, a sensible unit-system would find an error, but the Modelica community wants it to be valid. Which we have found to add complexity, and justified us spelling the algorithm out.

I don't see how it would differ (I assume HelloWorld means a simple k*der(x)=-x; model where we don't derive unit for k?). I understand that solving it as a system of equations would give additional units, but I don't see where this simple algorithm would differ from Hindley-Milner - and I believe we need to understand this.

Great example! Make it der(x) = k * x for the economists and all of a sudden nobody would want to enter a magnitude matching 1/s and entering the value from your broker/bank will get you in numerical trouble soon. ;-)

What took getting used to for me (the BusinessSimulation as of v2.2.0-wsm passes all unit checks in WSM without abuse of “fudge factors”, e.g. 1[“”]) is that I would expect that users know what they enter (input and parameters and constants), but they can only have expectations regarding the results (output) as there may be equation errors hidden in a model. Setting units for an ouput connector is helpful as it makes unit checks fail earlier. At the same time it is also confusing as the unit values are not seen as expectations but as definite values to “solve unit equations.” Finding the location of a unit error still is not easy imho.

gwr69 avatar Mar 22 '24 09:03 gwr69

I don't see how it would differ (I assume HelloWorld means a simple k*der(x)=-x; model where we don't derive unit for k?). I understand that solving it as a system of equations would give additional units, but I don't see where this simple algorithm would differ from Hindley-Milner - and I believe we need to understand this.

So I don't think they would necessarily differ from Hindley-Milner. Although, the details contained in "If we can uniquely deduce the unit for one variable" can mean that a tool is able to infer something that another isn't but there is an inconsistency in the way a variable is used. However, the version that is presented here support the following model for instance:

model HelloWorld
  Real x(start = 1);
equation
  der(x) = -x;
end HelloWorld;

on the basis that if a user doesn't set units for their variables they probably aren't interested in getting unit diagnostics. It was my understanding that people like Martin would probably would like that to be the case. But if I misunderstood, I am more than happy to consider that an error. It will simplify the proposal a lot.

qlambert-pro avatar Mar 22 '24 09:03 qlambert-pro

Finding the location of a unit error still is not easy imho.

I think that this is a tool issue, on our side we are both thinking of ways to make the derivation more intuitive as well as showing the user how we reached our conclusions.

qlambert-pro avatar Mar 22 '24 09:03 qlambert-pro

I don't see how it would differ (I assume HelloWorld means a simple k*der(x)=-x; model where we don't derive unit for k?). I understand that solving it as a system of equations would give additional units, but I don't see where this simple algorithm would differ from Hindley-Milner - and I believe we need to understand this.

So I don't think they would necessarily differ from Hindley-Milner. Although, the details contained in "If we can uniquely deduce the unit for one variable" can mean that a tool is able to infer something that another isn't but there is an inconsistency in the way a variable is used.

If we can agree on the unit-constraints I don't see how that could occur.

On the other hand the problem with describing unit-inference in terms of Hindley-Milner (or algorithm W - or some other letter) is that it doesn't fully answer all questions, in particular:

  • That algorithm is described in terms of type inference in functional languages; so one has to describe a mapping between units and types in functional languages. There are also basic parts of the algorithm that seem needlessly complex for unit-inference, e.g., unify.
  • The consequences of the unit-inference based on this algorithm doesn't seem clear. As far as I understand it took more than 10 years to prove that the algorithm actually generated the optimal result (in terms of extracting all type information) for the type inference case. Assuming that it generates the result I think it does for unit-inference we already know that the result isn't "optimal" (in the sense of extracting all unit information, but as previously explained I don't think the "optimal" result is good for Modelica models).

As far as I understand the actual algorithm is shorter than the text above.

  • However, the version that is presented here support the following model for instance:
model HelloWorld
  Real x(start = 1);
equation
  der(x) = -x;
end HelloWorld;

on the basis that if a user doesn't set units for their variables they probably aren't interested in getting unit diagnostics. It was my understanding that people like Martin would probably would like that to be the case. But if I misunderstood, I am more than happy to consider that an error. It will simplify the proposal a lot.

As far as I understand if we extend this model with something that gives a unit to x (like Real y=sin(x*time);, then the equation will generate a unit-error.

HansOlsson avatar Mar 25 '24 09:03 HansOlsson

model HelloWorld
  Real x(start = 1);
equation
  der(x) = -x;
end HelloWorld;

[...]

As far as I understand if we extend this model with something that gives a unit to x (like Real y=sin(x*time);, then the equation will generate a unit-error.

Can someone please educate me on why adding Real y = sin(x * time) is "something that gives a unit to x"? In my (admittedly imperfect understanding of Modelica) x will have unit = "" and thus units may be derived by a tool, or not? Also y has unit = "" and thus I believe that we have one relevant equation with regard to units and two unknowns making this underspecified, or not?

gwr69 avatar Mar 25 '24 10:03 gwr69

Can someone please educate me on why adding Real y = sin(x * time) is "something that gives a unit to x"?

We are operating on the assumptions that time carry unit "s" and the expression given as argument to sin is expected to have unit "1", which would mean that we deduce x to have unit "1/s".

qlambert-pro avatar Mar 25 '24 10:03 qlambert-pro

Can someone please educate me on why adding Real y = sin(x * time) is "something that gives a unit to x"?

We are operating on the assumptions that time carry unit "s" and the expression given as argument to sin is expected to have unit "1", which would mean that we deduce x to have unit "1/s".

Yes, that is the unit equation I missed. Thank you. So it would likely only be unclear if we did Real y = sin(1 * x * time); as the literal 1 will have unit = "" as well?

gwr69 avatar Mar 25 '24 10:03 gwr69

Can someone please educate me on why adding Real y = sin(x * time) is "something that gives a unit to x"?

We are operating on the assumptions that time carry unit "s" and the expression given as argument to sin is expected to have unit "1", which would mean that we deduce x to have unit "1/s".

To clarify - time carrying unit "s" is already stated: https://specification.modelica.org/master/operators-and-expressions.html#built-in-intrinsic-operators-with-function-syntax

That input given to sin has unit "1" is defined in this PR (transcendental functions). And unit for multiplication is also given in this proposal.

HansOlsson avatar Mar 25 '24 10:03 HansOlsson

Yes, that is the unit equation I missed. Thank you. So it would likely only be unclear if we did Real y = sin(1 * x * time); as the literal 1 will have unit = "" as well?

No, with this mcp, and this proposal in particular, literal in multiplicative operations would behave as if they had unit "1", to avoid them preventing any unit-checking.

qlambert-pro avatar Mar 25 '24 10:03 qlambert-pro