llst
llst copied to clipboard
Type inference prototype based on static analysis
This PR is my first rather independent (or naïve?) attempt to realize how type inference may be used to aid JIT VM. Underlying concept is not based on any well-known theory like Hindley-Milner type system or Martin Löf's Intuitionistic type theory. It's the result of a pure meditation on Smalltalk bytecodes.
Surprisingly enough, Smalltalk being fully dynamic in it's nature is still very regular in terms of it's memory access and control flow. This really helps when we try to perform static analysis.
Current implementation concentrates on the method temporaries and gives up completely when it faces object fields. However, I believe that in local context it is still possible to infer the object's fields to fully unlock further analysis and optimizations, like stack allocation, GC root elimination, and of course TBAA.
Current implementation is powerful enough to infer self assigning temporaries within a loop and even in complex closure contexts.
For example, the following method is inferred completely:
testInference |sum|
sum <- 0.
1 to: 100 do:
[ :x | sum <- sum + x ].
^sum
Here analyzer proves that sum
variable will have SmallInt
type in it's scope which spans across the follwing methods: Undefined>>testInference
, Number>>to:do:
, Block>>value:
and finally, the block Undefined>>testInference@8
. Integer overflow is currently undefined.
This allows IR generator to encode operations on sum
directly as i32
without any worries about GC or dynamic dispatch.
Current inference scheme highly uses method monomorphisation and specialization, which helps to perform calculations at compile time.
For example, this code is reduced to a single literal value (the result) at compile time:
fibonacci: n
n < 3 ifTrue: [ ^1 ].
^ (self fibonacci: n - 2) + (self fibonacci: n - 1)
Type inference works even in case of recurring contexts and correctly solves chicken or egg problem. Consider the listing of the Collection>>sort:
sort: criteria | left right mediane |
(self size < 2) ifTrue: [^self].
mediane <- self popFirst.
left <- List new.
right <- List new.
self do: [ :x |
(criteria value: x value: mediane)
ifTrue: [ left add: x ]
ifFalse: [ right add: x ] ].
left <- left sort: criteria.
right <- right sort: criteria.
right add: mediane.
^ left appendList: right
This is suboptimal implementation of the Quicksort algorithm used here only for testing purposes. It performs two recursive calls to Collection>>sort:
when sorting left
and right
sublists. The main difficulty here is to infer the return value of the Collection>>sort:
. For example, analysis of the left
refers to the outer context which at that point is not inferred completely, hence no return type available.
However, current implementation correctly solves the problem using a fact, that every recursion has it's base which should be evaluated prior to the next recursive call. By propagating the base return type from the outer context to the inner one we succeed in the whole process. See the log for an example of such inference. You may also see the resulting call graph as rendered svg or graphviz source.
Static analysis along with runtime statistics and polymorphic method cacheing provides enough information for effective dispatch of the Smalltalk code.
See also: #56, #58.
The implementation is not complete yet, there are a lot of things to do:
- [x] Implement inference of non-literal recursive messages, such as
Collection>>sort:
- [ ] Implement inference of composite types one per entry
- [ ] Detect block escaping and avoid inference of the captured variables
- [ ] Detect situations where inference will produce invalid results
- [ ] Detect correct number of passes for guaranteed inference
- [ ] Detect when method specialization is redundant and unnecessary
- [ ] Solve the situation when phi incoming maps to a node from a dead path
- [x] Type inference walker should use breadth-first traverse
- [ ] Analysis for block invocation kind in closure sites
- [ ] Inference across phi nodes
- [x] Memory management, block inference caches, …
- [ ] Implement inference of the remaining primitives
- [x] Use inferred types in actual code generation
- [ ] Take care of non-static objects
- [ ] Tests for all of this