Gradualizer icon indicating copy to clipboard operation
Gradualizer copied to clipboard

Type refinement in conditional expressions

Open gomoripeti opened this issue 5 years ago • 5 comments

(this ticket was split from https://github.com/josefs/Gradualizer/issues/23 which also contain some of the early comments)

let's see the following correct function with spec

-spec f(atom() | integer()) -> true | integer().
f(AI) ->
    if is_atom(AI) ->    %% Expr1
         true;
       true ->
          AI + 1         %% Expr2
    end.
  • the type of variable AI in Exrp2 must be the subtype of number() :: integer() | float() as the first argument of +
  • the type of variable AI in Exrp2 is atom() | integer() based on the spec which is not a subtype of integer() | float()

How could Gradualizer realise that Expr2 is only executed when type of AI is integer()?

  • if type checking was supersmart it would realise that if NOT is_atom(AI) => typeof(AI) == (atom() | integer()) & (not atom()) == integer() - however this is not always possible

Let's see a more generic example

-spec f(input()) -> any().
f(A) ->
  case cond(A) of
    true -> true;
    false -> body(A)
  end.

-spec cond(input()) -> boolean().
-spec body(required()) -> any().

In this case what the type checking could do is realise that body(A) is only executed in a subset of cases therefor type of A in body(A) is just a subset or subtype of input(). In case input() & required() == none() (the intersection of the two types is empty) then we can say that body(A) will always fail, but otherwise we cannot know (and have to leave type checking at runtime). This method requires to be able to calculate the intersection of two types.

What is the right way to handle these kind of constructs? (if, case, multiple function clause)

gomoripeti avatar Jul 22 '18 14:07 gomoripeti

(comments from @josefs copied fom #23)

Type refinement would be a very powerful feature to have. But it's also non-trivial to implement and it's unclear to me how important it is. So I think the right way forward is to first try and get the gradualizer into a state where people can use it. Perhaps a version 1.0. Then we can more easily see what features people need and miss and then start to tackle them in the right order.

I haven't spent very much time thinking about type refinement but if you want to dig into that subject I think that you should at the very least read Sums of Uncertainty: Refinements Go Gradual, a paper from the POPL conference last year. I haven't yet read the whole paper but I assume we will be able to borrow quite a lot from that paper.

gomoripeti avatar Jul 22 '18 14:07 gomoripeti

@gomoripeti, since you opened this ticket we've implemented type refinement. It doesn't handle the cases that you imagined though. Are you still keen on pursuing the kind of refinement that you outlined or should we let the machinery we have now suffice?

josefs avatar Jan 11 '20 17:01 josefs

The code for if clauses is already unified with the code handling case, function, receive, try-of and catch clauses (with an empty list as the patterns), so we get the type refinement implemented for the other cases also in if automatically.

The remaining part for the first example above is to implement refinement based on the is_TYPE guards for mismatching clauses, i.e. to subtract those types in certain cases. (This can only be done in a much more limited set of situations, since we need to be sure that the clause hasn't mismatched for some other reason. We must be able to "blame" the specific guard function for the mismatch.)

zuiderkwast avatar Jan 11 '20 18:01 zuiderkwast

(tbh I don't remember what was my original main idea, it looks like it might have been arguing for the introduction of "intersection") anyway as you both describe some common aspects are already implemented. and @zuiderkwast you point out, a full solution with "blaming" is quite complicated. I'm fine if Gradualizer does not warn in certain scenarios, more inconvenient if it errors on valid code (like the known problems added). But I'm fine if this is addressed step by step in a pragmatic way as real world examples come up.

gomoripeti avatar Jan 13 '20 22:01 gomoripeti

The existing function refine(Ty1, Ty2, TEnv) is actually set-difference, i.e. Ty1 \ Ty2 which is equivallent to Ty1 & (not Ty2) AFAIK.

Please look at my new PR! ^

zuiderkwast avatar Jan 13 '20 23:01 zuiderkwast