language-cil
A monad for writing method bodies
I want code that uses the library to read as much like the resulting IL as possible. That goal, together with some issues around alpha-conversion, led me to this proposal.
Instead of writing a method body as a [MethodDecl], I propose something like this change from:
ioAge :: MethodDef
ioAge = Method [MaStatic, MaPublic] Void "ioAge" []
  [ maxStack 11
  , localsInit
      [ Local Int32 "x"
      , Local (ValueType "mscorlib" "System.DateTime") "d1"
      , Local (ValueType "mscorlib" "System.DateTime") "d2"
      ]
  , ldstr "What year were you born?"
  , call [] Void "mscorlib" "System.Console" "WriteLine" [String]
  , call [] String "mscorlib" "System.Console" "ReadLine" []
  , call [] Int32 "" "int32" "Parse" [String]
  , stloc 0
  , call [] (ValueType "mscorlib" "System.DateTime") "mscorlib" "System.DateTime" "get_Now" []
  , stloc 1
  , ldloca 1
  , ldloc 0
  , neg
  , call [CcInstance] (ValueType "mscorlib" "System.DateTime") "mscorlib" "System.DateTime" "AddYears" [Int32]
  , stloc 2
  , ldstr "This year, you turn {0}."
  , ldloca 2
  , call [CcInstance] Int32 "mscorlib" "System.DateTime" "get_Year" []
  , box Int32
  , call [] Void "mscorlib" "System.Console" "WriteLine" [String, Object]
  , ret
  ]
to:
dateTime = ValueType "mscorlib" "System.DateTime" -- just for abbreviation
ioAge :: MethodDef
ioAge = Method [MaStatic, MaPublic] Void "ioAge" []
  $ do
      maxStack 11 -- this stays for now until the analysis to do it is done
      -- You could keep the same localsInit expression if you wanted to. This is
      -- just to demonstrate how it would be done when the code generator is not
      -- in a position to assign all the names at once, or isn't concerned with names.
      x <- freshLocal Int32
      d1 <- freshLocal dateTime
      d2 <- freshLocal dateTime
      ldstr "What year were you born?"
      call [] Void "mscorlib" "System.Console" "WriteLine" [String]
      call [] String "mscorlib" "System.Console" "ReadLine" []
      call [] Int32 "" "int32" "Parse" [String]
      stloc x
      call [] dateTime "mscorlib" "System.DateTime" "get_Now" []
      stloc d1
      ldloca d1
      ldloc x
      neg
      call [CcInstance] dateTime "mscorlib" "System.DateTime" "AddYears" [Int32]
      stloc d2
      ldstr "This year, you turn {0}."
      ldloca d2
      call [CcInstance] Int32 "mscorlib" "System.DateTime" "get_Year" []
      box Int32
      call [] Void "mscorlib" "System.Console" "WriteLine" [String, Object]
      ret
With appropriate supporting code, sketched below:
import Control.Monad.State
data MethodBuilderState = MState
  { instructions :: [MethodDecl]
  , nextLocal :: Int
  -- ... (I have a few ideas about further fields, omitted for simplicity)
  }
type MethodBuilder a = State MethodBuilderState a
initialState :: MethodBuilderState
initialState = MState { instructions = [], nextLocal = 0 } -- plus initial values for the omitted fields
buildMethod :: MethodBuilder a -> [MethodDecl]
buildMethod mb = instructions (execState mb initialState)
-- not quite: run the maxstack analysis and add the result, add the .localsinit
-- for whatever freshLocals may have been made, etc., but you get the idea
freshLocal :: MethodBuilder Offset
-- or could be a name with a possible unique-ifying suffix, or could be one variant for each
-- (the type argument used in the example above is left out of this sketch)
freshLocal = do
  result <- gets nextLocal
  modify (\s -> s { nextLocal = result + 1 })
  return result
append :: MethodDecl -> MethodBuilder ()
append instr = modify (\s -> s { instructions = instructions s ++ [instr] })
-- obviously it would be better for performance to store a backwards list of
-- instructions, since we only snoc and never cons; presented forwards to get the idea across
and with appropriate changes to the building methods, e.g. from:
bgt :: Label -> MethodDecl
bgt = mdecl . Bgt
to:
bgt :: Label -> MethodBuilder ()
bgt = append . mdecl . Bgt
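To make the idea concrete, here is a minimal, self-contained sketch of such a builder. It does not use the library's real types: `Instr` is a hypothetical stand-in for `MethodDecl`, `BuilderState` and `revInstrs` are names made up for this example, and locals are plain `Int` indices. It stores the instructions in reverse and flips them once at the end, as suggested in the comment on `append`. It assumes the `mtl` package for `Control.Monad.State`.

```haskell
import Control.Monad.State (State, execState, gets, modify)

-- Hypothetical stand-in for the library's MethodDecl; only what the sketch needs.
data Instr
  = Ldloc Int
  | Stloc Int
  | Neg
  | Ret
  deriving (Eq, Show)

-- Hypothetical builder state: instructions are kept most-recent-first.
data BuilderState = BuilderState
  { revInstrs :: [Instr]
  , nextLocal :: Int
  }

type MethodBuilder a = State BuilderState a

-- Cons onto the reversed list; buildMethod reverses once at the end.
append :: Instr -> MethodBuilder ()
append i = modify (\s -> s { revInstrs = i : revInstrs s })

-- Hand out the next unused local index.
freshLocal :: MethodBuilder Int
freshLocal = do
  n <- gets nextLocal
  modify (\s -> s { nextLocal = n + 1 })
  return n

ldloc, stloc :: Int -> MethodBuilder ()
ldloc = append . Ldloc
stloc = append . Stloc

neg, ret :: MethodBuilder ()
neg = append Neg
ret = append Ret

-- Returns the number of locals allocated and the instructions in program order.
buildMethod :: MethodBuilder a -> (Int, [Instr])
buildMethod mb = (nextLocal s, reverse (revInstrs s))
  where
    s = execState mb (BuilderState [] 0)

example :: MethodBuilder ()
example = do
  x <- freshLocal
  y <- freshLocal
  ldloc x
  neg
  stloc y
  ret
```

With these definitions, `buildMethod example` evaluates to `(2, [Ldloc 0,Neg,Stloc 1,Ret])`. A real version would also record each local's type so that the `.locals init` declaration can be emitted.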
(Potentially for customizability you might want to define MethodBuilder as a class instead of a type, make append a function in that class, change the builder type of the example from Label -> MethodBuilder () to (MethodBuilder m) => Label -> m (), etc. This would allow people to use a more feature-rich method building monad if their backend had some reason to do so. Ignore that complexity for now.)
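A rough sketch of that class-based variant, under the same caveats: `Instr` is a hypothetical stand-in for `MethodDecl`, the class is called `MonadMethodBuilder` here only to keep it apart from the type synonym, and labels are plain strings. The two instances are illustrative: one only collects instructions, the other also counts them to show how a backend could carry extra state. It assumes `mtl` and the `FlexibleInstances` extension.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import Control.Monad.State (State, execState, modify)

-- Hypothetical stand-in for the library's MethodDecl.
data Instr = Bgt String | Ret
  deriving (Eq, Show)

-- Any monad that knows how to record an instruction.
class Monad m => MonadMethodBuilder m where
  append :: Instr -> m ()

-- Plain builder: just a reversed instruction list.
instance MonadMethodBuilder (State [Instr]) where
  append i = modify (i :)

-- Richer builder: also counts the instructions it has seen.
instance MonadMethodBuilder (State (Int, [Instr])) where
  append i = modify (\(n, is) -> (n + 1, i : is))

-- Opcode builders are written once, against the class.
bgt :: MonadMethodBuilder m => String -> m ()
bgt = append . Bgt

ret :: MonadMethodBuilder m => m ()
ret = append Ret

body :: MonadMethodBuilder m => m ()
body = do
  bgt "loop"
  ret

-- The same body run in each of the two builders.
plain :: [Instr]
plain = reverse (execState body [])

counted :: (Int, [Instr])
counted =
  let (n, is) = execState body (0, [])
  in (n, reverse is)
```

Here `plain` is `[Bgt "loop",Ret]` and `counted` is `(2,[Bgt "loop",Ret])`. The method body is written once and interpreted by whichever builder the caller picks.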
tl;dr The monadic version both looks a lot more like the IL than the [MethodDecl] version and provides a place to hang a few other things that practical code-generators are going to need.
Hmmm...
I've a bit of an aversion to monads, so I guess I'm biased. I've done a tiny bit of work with the GHC internals (20+ year old codebase) and worked extensively on UHC (5+ years old). Both of those use monad stacks internally, which I found very hard to use. I guess it's easier for people experienced with the code base, but to someone not familiar with the code, it looks like a big, unnavigable mess.
I really like the simplicity of the idea that a Method is a name, parameters and a list of instructions. Easy to understand, even for newcomers.
One of my main concerns is keeping the library simple for newcomers. I don't want to scare people off (as I personally have been by a couple of complicated-looking Hackage libraries).
However, you're probably right that extra stuff like fresh name generation and automatic maxstack calculation is useful for more experienced users of the library. So adding some alternative way of building a [MethodDecl] might be useful.
My first suggestion would be to add a module Language.Cil.Build.MethodBuilder (probably needs a better name). That could be useful for people needing more complex build functionality and more automation. This could expose one or more functions that generate a [MethodDecl]. Stuff like maxstack calculation could also be here.
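As a rough idea of what a maxstack calculation might look like, here is a deliberately naive sketch. It uses a hypothetical `Instr` type with a handful of opcodes instead of the library's own, and it only handles straight-line code: branches, calls (whose stack effect depends on the signature) and exception handlers are ignored. The names `stackEffect` and `maxStackDepth` are made up for this example.

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for a few instructions with a fixed stack effect.
data Instr
  = Ldstr String
  | Ldloc Int
  | Stloc Int
  | Neg
  | Add
  | Ret
  deriving (Eq, Show)

-- (values popped, values pushed) for each instruction.
stackEffect :: Instr -> (Int, Int)
stackEffect (Ldstr _) = (0, 1)
stackEffect (Ldloc _) = (0, 1)
stackEffect (Stloc _) = (1, 0)
stackEffect Neg       = (1, 1)
stackEffect Add       = (2, 1)
stackEffect Ret       = (0, 0) -- assumes a void method in this sketch

-- Walk the instructions in order, tracking the current depth and the peak.
maxStackDepth :: [Instr] -> Int
maxStackDepth = snd . foldl' step (0, 0)
  where
    step (depth, peak) i =
      let (pops, pushes) = stackEffect i
          depth' = depth - pops + pushes
      in (depth', max peak depth')

sample :: [Instr]
sample = [Ldloc 0, Ldloc 1, Add, Ldloc 2, Add, Stloc 3, Ret]
```

For `sample` the depth goes 1, 2, 1, 2, 1, 0, 0, so `maxStackDepth sample` is `2`. A real analysis would have to follow control flow and check that the depth is consistent at each join point.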
Any thoughts?
Also, in the current design, it is probably a good idea to add stlocNm :: LocalName -> OpCode.
Then I'd implement the first example as:
ioAge :: MethodDef
ioAge = Method [MaStatic, MaPublic] Void "ioAge" []
  [ localsInit
      [ Local Int32 x
      , Local (ValueType "mscorlib" "System.DateTime") d1
      ]
  , call [] String "mscorlib" "System.Console" "ReadLine" []
  , call [] Int32 "" "int32" "Parse" [String]
  , stlocNm x
  , call [] (ValueType "mscorlib" "System.DateTime") "mscorlib" "System.DateTime" "get_Now" []
  , stlocNm d1
  , ret
  ]
  where
    x :: LocalName
    x = "x"
    d1 :: LocalName
    d1 = "d1"
Not as useful for a code generator where you want the convenience of automatic unique names. But very useful when you want to generate nice-looking IL (which I do, because I read it a lot).
Is that really simpler? I think the local declarations are a fair bit easier to read in the monadic version, and you don't have to manage the commas, but I may be mistaken.
We could do both as you suggest, but we would either need two copies of all the opcode-named building functions or we would need to introduce
class HasOpCodes m where
  injectOpCode :: OpCode -> m
instance HasOpCodes OpCode where
  injectOpCode = id
instance HasOpCodes MethodDecl where
  injectOpCode = mdecl
instance HasOpCodes (MethodBuilder ()) where
  injectOpCode = append . mdecl
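A self-contained sketch of this class, with stand-in types in place of the library's (`OpCode`, `MethodDecl` and `mdecl` are reduced to toy versions here, and `MethodBuilder` is simply a `State` over a reversed list). One wrinkle to expect: an instance for exactly `MethodBuilder ()` is ambiguous for any statement in a `do` block that is not the last one, because its result type is left open. The sketch therefore writes the instance head as `MethodBuilder a` with an `a ~ ()` constraint, which is an assumption of mine rather than part of the proposal. It needs `mtl` and a few GHC extensions.

```haskell
{-# LANGUAGE FlexibleInstances, TypeFamilies, TypeOperators #-}
import Control.Monad.State (State, execState, modify)

-- Toy stand-ins for the library's types.
data OpCode = Neg | Ret
  deriving (Eq, Show)

newtype MethodDecl = Instr OpCode
  deriving (Eq, Show)

mdecl :: OpCode -> MethodDecl
mdecl = Instr

-- Builder over a reversed list of declarations.
type MethodBuilder a = State [MethodDecl] a

append :: MethodDecl -> MethodBuilder ()
append d = modify (d :)

class HasOpCodes m where
  injectOpCode :: OpCode -> m

instance HasOpCodes OpCode where
  injectOpCode = id

instance HasOpCodes MethodDecl where
  injectOpCode = mdecl

-- The equality constraint lets GHC pick this instance before the
-- result type is known, then forces that result type to be ().
instance (a ~ ()) => HasOpCodes (MethodBuilder a) where
  injectOpCode = append . mdecl

-- One definition per opcode, usable in all three contexts.
neg, ret :: HasOpCodes m => m
neg = injectOpCode Neg
ret = injectOpCode Ret

asList :: [MethodDecl]
asList = [neg, ret]

monadic :: MethodBuilder ()
monadic = do
  neg
  ret

asMonad :: [MethodDecl]
asMonad = reverse (execState monadic [])
```

Both `asList` and `asMonad` come out as `[Instr Neg,Instr Ret]`, so the list style and the monadic style can share one set of opcode-named functions.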
I have absolutely no problem with the commas. This is no Lisp, and I'm sure every Haskell programmer knows how lists work. I think the conceptual overhead of the monad, with return, bind and (>>) (what's that operator called?), is higher than the visual overhead of a couple of commas.
For now, I'm in favour of duplicating the opcode builder functions in the MethodBuilder module. I'd like to try out both versions in a practical setting. If it turns out they're both useful, we can think of abstracting away the commonalities. If, on the other hand, the [MethodDecl] version turns out to be unnecessary, those builders can be completely removed in favour of the MethodBuilder monad.
Ultimately it's of course best if there is no duplication, but for now I think it best that the MethodBuilder module is an addition, one that people can use or ignore without impacting the rest of the code. I don't want to fall into the trap of premature abstraction.
Sounds like a good plan. I have a side-by-side implementation that works. Later this evening I will do some renaming and pass it along.