Gradualizer icon indicating copy to clipboard operation
Gradualizer copied to clipboard

Refactoring typechecker module

Open zuiderkwast opened this issue 6 years ago • 7 comments

  • Type checking this module takes forever. There's a bottle-neck somewhere.

  • There is not clear separation of concern. Modularization, so we can use put helpers as unexported functions in separate modules, reduces the number of functions in the same scope.

    Obvious candidates for separate modules:

    • Error message formatting
    • The subtyping relation, normalizing, glb
    • Integer range handling
  • Documentation, how about generating edoc comments

  • Renaming the main AST traversal functions. I suggest the basing the names on the following words from the papers on bidirectional type checking:

    1. Checking an expression against an expected type
    2. Synthesizing a type from an expression
  • Folding unions earlier in op_type_check_expr_in

zuiderkwast avatar Oct 20 '19 22:10 zuiderkwast

About the moving stuff to different modules, I sketched this quick refactor yesterday: https://github.com/NelsonVides/Gradualizer/tree/typechecker_refactor Not saying it's good, but it kinda follows your suggestion: error messages to one module, subtyping to another, normalize and glb to two others, integer range handling to another... I'm into that kind of modularization.

NelsonVides avatar Oct 21 '19 07:10 NelsonVides

No time to do a careful review now, but just a few comments.

Because of Erlang has a global namespace for modules, we should prefix the modules with gradualizer_. Only a few modules are not following this convention now. I suggest relatively short names such as gradualizer_fmt (for formatting and printing messages), gradualizer_gbl, etc.

Also regarding the relation between LUB, GLB and normalize: LUB is taking the union of two types and then normalizing it. GBL is taking the intersection and normalizing it. Normalize is not a synonym of LUB.

(In the future, it would be nice if GLB can be just implemented as glb(Ty1, Ty2) -> normalize({type, 0, intersection, [Ty1, Ty2]}), but that's out of scope for this module split.)

zuiderkwast avatar Oct 21 '19 09:10 zuiderkwast

Here. That's how typechecker looks like on a graph. When @erszcz showed me that diagram without telling me what was it, for a moment I thought that his screen was just seriously scratched. Then he zoomed in 😂

typechecker-min

I even had to compress the image because the generated one was too big for github to let me upload it.

@zuiderkwast the module prefix, totally agree. My refactor was just some quick vim tricks to cut&paste text rapidly. But about the relationship between LUB, GLB, and normalize, I think I first need to understand what are you calling intersection and union, as @josefs asked you here https://github.com/josefs/Gradualizer/pull/206#discussion_r336796971

NelsonVides avatar Oct 21 '19 10:10 NelsonVides

Beautiful picture!

Union and intersection for static types is quite simple.

Union is very familiar in Erlang, e.g. integer() | atom(). For intersection, in Erlang, the only syntax for it is in multi-clause function specs, e.g. -spec f(integer()) -> atom(); (atom()) -> integer(). A reference to this function F = fun f/1, using our extension & for intersection, has type F :: fun((integer()) -> atom()) & fun((atom()) -> integer()), i.e. it has both types fun((integer()) -> atom()) and fun((atom()) -> integer()).

Union and intersection with any() is the less clear part, where gradual typing comes into the picture. We don't need to go deeper into that right now, perhaps.

zuiderkwast avatar Oct 21 '19 12:10 zuiderkwast

I'm all for the suggestions here @zuiderkwast . It's good to see this thing happening already.

@NelsonVides, I don't think the call graph can be that much improved by refactoring. This is how the callgraph of compilers and typecheckers look. Though I absolutely agree that there are improvements that can be made to the code base.

josefs avatar Oct 24 '19 19:10 josefs

I've had a look at "Folding unions earlier in do_type_check_expr_in" to be able to get rid of the scattered handling of unions in the "expect" functions. I've found that it is not as easy as I thought, maybe not even possible, because many of the clauses use subtyping to check ResTy or propagate ResTy into sub-expressions. These include list expressions such as cons. Therefore, it seems not possible to get rid of union handling in the expect functions. (Note that in add_type_pat, the unfolding happens in one place. Here it works.)

zuiderkwast avatar Oct 24 '19 22:10 zuiderkwast

We should be able to reuse the code for dealing with unions in the expect_*_type family of functions. That'd be a good improvement.

josefs avatar Oct 25 '19 09:10 josefs