Refactoring typechecker module
-
Type checking this module takes forever. There's a bottle-neck somewhere.
-
There is not clear separation of concern. Modularization, so we can use put helpers as unexported functions in separate modules, reduces the number of functions in the same scope.
Obvious candidates for separate modules:
- Error message formatting
- The subtyping relation, normalizing, glb
- Integer range handling
-
Documentation, how about generating edoc comments
-
Renaming the main AST traversal functions. I suggest the basing the names on the following words from the papers on bidirectional type checking:
- Checking an expression against an expected type
- Synthesizing a type from an expression
-
Folding unions earlier in
op_type_check_expr_in
About the moving stuff to different modules, I sketched this quick refactor yesterday: https://github.com/NelsonVides/Gradualizer/tree/typechecker_refactor Not saying it's good, but it kinda follows your suggestion: error messages to one module, subtyping to another, normalize and glb to two others, integer range handling to another... I'm into that kind of modularization.
No time to do a careful review now, but just a few comments.
Because of Erlang has a global namespace for modules, we should prefix the modules with gradualizer_. Only a few modules are not following this convention now. I suggest relatively short names such as gradualizer_fmt (for formatting and printing messages), gradualizer_gbl, etc.
Also regarding the relation between LUB, GLB and normalize: LUB is taking the union of two types and then normalizing it. GBL is taking the intersection and normalizing it. Normalize is not a synonym of LUB.
(In the future, it would be nice if GLB can be just implemented as glb(Ty1, Ty2) -> normalize({type, 0, intersection, [Ty1, Ty2]}), but that's out of scope for this module split.)
Here. That's how typechecker looks like on a graph. When @erszcz showed me that diagram without telling me what was it, for a moment I thought that his screen was just seriously scratched. Then he zoomed in 😂

I even had to compress the image because the generated one was too big for github to let me upload it.
@zuiderkwast the module prefix, totally agree. My refactor was just some quick vim tricks to cut&paste text rapidly. But about the relationship between LUB, GLB, and normalize, I think I first need to understand what are you calling intersection and union, as @josefs asked you here https://github.com/josefs/Gradualizer/pull/206#discussion_r336796971
Beautiful picture!
Union and intersection for static types is quite simple.
Union is very familiar in Erlang, e.g. integer() | atom(). For intersection, in Erlang, the only syntax for it is in multi-clause function specs, e.g. -spec f(integer()) -> atom(); (atom()) -> integer(). A reference to this function F = fun f/1, using our extension & for intersection, has type F :: fun((integer()) -> atom()) & fun((atom()) -> integer()), i.e. it has both types fun((integer()) -> atom()) and fun((atom()) -> integer()).
Union and intersection with any() is the less clear part, where gradual typing comes into the picture. We don't need to go deeper into that right now, perhaps.
I'm all for the suggestions here @zuiderkwast . It's good to see this thing happening already.
@NelsonVides, I don't think the call graph can be that much improved by refactoring. This is how the callgraph of compilers and typecheckers look. Though I absolutely agree that there are improvements that can be made to the code base.
I've had a look at "Folding unions earlier in do_type_check_expr_in" to be able to get rid of the scattered handling of unions in the "expect" functions. I've found that it is not as easy as I thought, maybe not even possible, because many of the clauses use subtyping to check ResTy or propagate ResTy into sub-expressions. These include list expressions such as cons. Therefore, it seems not possible to get rid of union handling in the expect functions. (Note that in add_type_pat, the unfolding happens in one place. Here it works.)
We should be able to reuse the code for dealing with unions in the expect_*_type family of functions. That'd be a good improvement.