Using --no-codegen blocks --emit options
When we run: crystal build hello.cr --emit llvm-ir --no-codegen
the hello.ll file is not generated, even though it was asked for.
This is because all the emit options are handled in one place, in the CompilationUnit#emit method:
https://github.com/crystal-lang/crystal/blob/80cbe6603f938fc2200b785bd3fb2edbd854a17d/src/compiler/crystal/compiler.cr#L676-L687
Which is called in the codegen block:
https://github.com/crystal-lang/crystal/blob/80cbe6603f938fc2200b785bd3fb2edbd854a17d/src/compiler/crystal/compiler.cr#L339-L353
At line 348.
I think it should emit each requested output as soon as it is available: the LLVM IR right after it is built, the obj in the codegen phase, etc.
This also means that there is no way to dump the LLVM IR when we hit a "module validation failed" error that we would like to debug.
In order to debug #5972 I had to modify the compiler manually to disable module validation. So +1 to that!
While looking into that, I noticed that all the codegen phases (mainly Crystal to LLVM IR and LLVM IR to BC+OBJ) are done in the codegen method.
I suggest separating the Crystal to LLVM IR codegen phase from the binary codegen (LLVM IR to BC+OBJ); maybe name it IR codegen?
Also the current emit options are in 2 categories:
- after LLVM IR generation: for llvm-ir (should be processed even on --no-codegen)
- after binary codegen: for asm, llvm-bc, obj (not processed on --no-codegen)
Those are currently all handled and processed after binary codegen. With the IR codegen separated out, we could handle the different emit options at different times.
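To make the idea concrete, here is a rough sketch of the two-phase split. It is a toy model in Python, not the compiler's actual Crystal code: EMIT_PHASE and run_pipeline are hypothetical names, and only the emit target names (llvm-ir, asm, llvm-bc, obj) come from the existing options.

```python
# Toy model of the proposed split; not the Crystal compiler's real code.
# Hypothetical mapping: the phase after which each --emit target is available.
EMIT_PHASE = {
    "llvm-ir": "ir",      # available once the LLVM IR has been generated
    "asm": "binary",      # the others need the binary codegen phase
    "llvm-bc": "binary",
    "obj": "binary",
}


def run_pipeline(emit_targets, binary_codegen=True):
    """Return the emit targets written, in the order they become available."""
    emitted = []

    # Phase 1: IR codegen (Crystal to LLVM IR), then emit what is ready.
    emitted += [t for t in emit_targets if EMIT_PHASE[t] == "ir"]

    if not binary_codegen:
        return emitted  # stop before the LLVM IR is lowered to BC + OBJ

    # Phase 2: binary codegen, then emit the remaining targets.
    emitted += [t for t in emit_targets if EMIT_PHASE[t] == "binary"]
    return emitted


print(run_pipeline(["obj", "llvm-ir"]))                        # ['llvm-ir', 'obj']
print(run_pipeline(["obj", "llvm-ir"], binary_codegen=False))  # ['llvm-ir']
```

The point is that the IR is written before binary codegen starts, so it would still be on disk if a later phase is skipped or fails.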
WDYT?
Also, would --no-codegen then disable all codegen, or only binary codegen? And how would that be configured?
Maybe if there is --emit llvm-ir, it would only disable binary codegen (IR codegen will still be done, so that the IR can be dumped), and when not given, all codegen would be disabled.
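As a sketch of that rule (Python toy model again; resolve_phases_from_emit is a hypothetical name and this is not how the compiler is currently structured):

```python
def resolve_phases_from_emit(no_codegen, emit_targets):
    """Return (run_ir_codegen, run_binary_codegen) for the rule described above.

    Toy model only: --no-codegen always disables binary codegen, and it
    disables IR codegen too unless llvm-ir was requested via --emit.
    """
    if not no_codegen:
        return (True, True)
    return ("llvm-ir" in emit_targets, False)


print(resolve_phases_from_emit(False, []))           # (True, True)
print(resolve_phases_from_emit(True, ["llvm-ir"]))   # (True, False)
print(resolve_phases_from_emit(True, []))            # (False, False)
```

Note that the behaviour of --no-codegen here depends on what was passed to --emit, which is the implicit coupling this rule introduces.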
What's so bad about not using --no-codegen when using --emit? No codegen means the codegen phase doesn't run, and this trivially means no artifacts (.ll, .s, .o, final executable) will be produced.
@asterite True, these are all different stages of codegen. But what do you do if you want the .ll (or .s) but not .o?
--no-codegen --emit llvm-ir makes sense for this: It essentially says "skip codegen, but emit LLVM-IR" which can be interpreted as "do codegen but abort after LLVM-IR is dumped". There is no other logical interpretation of the combination of these two flags other than ignoring or failing when --emit is presented together with --no-codegen. But if it can be expressed that way, why shouldn't it be usable?
> True, these are all different stages of codegen. But what do you do if you want the .ll (or .s) but not .o?
You just wait a little more? :-)
> It essentially says "skip codegen, but emit LLVM-IR" which can be interpreted as "do codegen but abort after LLVM-IR is dumped".
There's the confusion. Codegen means "run code to create the LLVM in memory". Emit comes after that. Of course we could change that, but I don't see the point. Just wait a few more seconds and you'll have it.
Or, well, put another way: if someone wants to implement it, please send a PR (I won't).
Also, there could be a --no-binary-codegen in addition to --no-codegen, where the latter disables all codegen (IR and binary), and the former disables binary codegen only but still allows IR codegen to run.
I think it allows more control, and removes weird edge cases like "always do IR codegen when --emit is llvm-ir even on --no-codegen".
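Roughly, with two independent flags the decision no longer needs to look at --emit at all. A Python toy model, assuming the proposed --no-binary-codegen flag (which does not exist today) and a hypothetical function name:

```python
def resolve_phases_from_flags(no_codegen=False, no_binary_codegen=False):
    """Return (run_ir_codegen, run_binary_codegen) from the two flags alone.

    Toy model: --no-codegen turns everything off; the proposed
    --no-binary-codegen only turns off the LLVM IR to BC+OBJ step.
    """
    run_ir = not no_codegen
    run_binary = not (no_codegen or no_binary_codegen)
    return (run_ir, run_binary)


print(resolve_phases_from_flags())                        # (True, True)
print(resolve_phases_from_flags(no_binary_codegen=True))  # (True, False)
print(resolve_phases_from_flags(no_codegen=True))         # (False, False)
```

With this shape, --emit llvm-ir together with --no-binary-codegen would give the .ll without the .o, and --no-codegen keeps meaning "no artifacts at all".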
For the Compiler Explorer it would be very useful if the assembly and the IR can both be generated without actually compiling / linking that into a binary or .o file. One reason is that link failures should be tolerated there unless the code is ultimately executed as requested by the user.
Agree with HertzDevil above. It would be useful to be able to emit and inspect the IR without needing to link.