Using --no-codegen blocks --emit options

Open bew opened this issue 7 years ago • 11 comments

When we run crystal build hello.cr --emit llvm-ir --no-codegen, the hello.ll file is not generated, even though it was asked for.

This is because the emit options are all handled in one place, the CompilationUnit#emit method: https://github.com/crystal-lang/crystal/blob/80cbe6603f938fc2200b785bd3fb2edbd854a17d/src/compiler/crystal/compiler.cr#L676-L687

That method is called in the codegen block, at line 348: https://github.com/crystal-lang/crystal/blob/80cbe6603f938fc2200b785bd3fb2edbd854a17d/src/compiler/crystal/compiler.cr#L339-L353

I think it should emit each requested artifact as soon as it is available: the LLVM IR right after it is built, the obj in the codegen phase, etc.

bew avatar Mar 13 '18 05:03 bew

This also means that there is no way to dump the LLVM IR when we hit a "module validation failed" error that we would like to debug.

bew avatar Apr 23 '18 19:04 bew

In order to debug #5972 I had to modify the compiler manually to disable module validation. So +1 to that!

straight-shoota avatar Apr 23 '18 20:04 straight-shoota

While looking into that, I noticed that all the codegen phases (mainly Crystal to LLVM IR, and LLVM IR to BC+OBJ) are done in the codegen method. I suggest separating the Crystal to LLVM IR codegen phase from the binary codegen (LLVM IR to BC+OBJ), maybe naming it IR codegen?

Also, the current emit options fall into 2 categories:

  • after llvm ir generation: for llvm-ir (should be processed even on --no-codegen)
  • after binary codegen: for asm, llvm-bc, obj (not processed on --no-codegen)

Those are currently all handled and processed after binary codegen. With the IR codegen separated, we could handle the different emit options at different times.
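
To make the two categories concrete, here is a minimal sketch in Python of how emit targets could be grouped by the phase that produces them. It is only a model of the idea, not compiler code: the names IR_PHASE_TARGETS, BINARY_PHASE_TARGETS and emitted_targets are hypothetical and do not exist in the Crystal compiler, and it assumes IR codegen always runs while binary codegen can be skipped.

```python
# Hypothetical model of the proposed split; none of these names exist in
# the Crystal compiler. The target names come from the list above.
IR_PHASE_TARGETS = {"llvm-ir"}                    # available after IR codegen
BINARY_PHASE_TARGETS = {"asm", "llvm-bc", "obj"}  # available after binary codegen


def emitted_targets(requested, run_binary_codegen):
    """Return the emit targets that would be written, sorted by name.

    Assumption: IR codegen always runs; only binary codegen can be skipped.
    """
    requested = set(requested)
    unknown = requested - IR_PHASE_TARGETS - BINARY_PHASE_TARGETS
    if unknown:
        raise ValueError(f"unknown emit target(s): {sorted(unknown)}")

    # IR-phase artifacts can be written as soon as the IR exists.
    written = requested & IR_PHASE_TARGETS
    # Binary-phase artifacts only exist if binary codegen actually ran.
    if run_binary_codegen:
        written |= requested & BINARY_PHASE_TARGETS
    return sorted(written)
```

Under this model, asking for llvm-ir and obj while skipping binary codegen would write only the llvm-ir artifact, which is the behaviour the issue asks for.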

WDYT?

bew avatar Apr 23 '18 21:04 bew

Also, would --no-codegen then disable all codegen, or only binary codegen? And how would that be configured?

Maybe if --emit llvm-ir is given, it would only disable binary codegen (IR codegen would still be done, so that the IR can be dumped), and when it is not given, all codegen would be disabled.

bew avatar Apr 23 '18 21:04 bew

What's so bad about not using --no-codegen when using --emit? No codegen means the codegen phase doesn't run, and this trivially means no artifacts (.ll, .s, .o, final executable) will be produced.

asterite avatar Apr 29 '18 15:04 asterite

@asterite True, these are all different stages of codegen. But what do you do if you want the .ll (or .s) but not .o?

--no-codegen --emit llvm-ir makes sense for this: It essentially says "skip codegen, but emit LLVM-IR" which can be interpreted as "do codegen but abort after LLVM-IR is dumped". There is no other logical interpretation of the combination of these two flags other than ignoring or failing when --emit is presented together with --no-codegen. But if it can be expressed that way, why shouldn't it be usable?

straight-shoota avatar Apr 29 '18 15:04 straight-shoota

> True, these are all different stages of codegen. But what do you do if you want the .ll (or .s) but not .o?

You just wait a little more? :-)

> It essentially says "skip codegen, but emit LLVM-IR" which can be interpreted as "do codegen but abort after LLVM-IR is dumped".

There's the confusion. Codegen means "run code to create the LLVM in memory". Emit comes after that. Of course we could change that, but I don't see the point. Just wait a few more seconds and you'll have it.

asterite avatar Apr 29 '18 15:04 asterite

Or, well, put another way: if someone wants to implement it, please send a PR (I won't).

asterite avatar Apr 29 '18 15:04 asterite

Also, there could be a --no-binary-codegen in addition to --no-codegen, where the latter disables all codegen (IR and binary), and the former disables the binary codegen only but allows the IR codegen to be done.

I think it allows more control, and removes weird edge cases like "always do IR codegen when --emit is llvm-ir even on --no-codegen".
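
As a rough illustration of that two-flag idea, the mapping from flags to phases could look like the Python sketch below. Note that --no-binary-codegen is only a proposal from this thread, not an existing compiler flag, and Phases and plan_phases are hypothetical names used for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phases:
    ir_codegen: bool      # Crystal -> LLVM IR
    binary_codegen: bool  # LLVM IR -> BC + OBJ


def plan_phases(no_codegen=False, no_binary_codegen=False):
    """Map the two flags onto the phases that would run under this proposal.

    Assumption: --no-binary-codegen is the flag proposed in this thread,
    not something the compiler currently accepts.
    """
    if no_codegen:
        # --no-codegen disables everything, whatever --emit says.
        return Phases(ir_codegen=False, binary_codegen=False)
    if no_binary_codegen:
        # Proposed flag: build the IR, then stop before BC/OBJ generation.
        return Phases(ir_codegen=True, binary_codegen=False)
    return Phases(ir_codegen=True, binary_codegen=True)
```

With explicit flags like these, the set of phases never depends on the --emit value, so the special case quoted above would not be needed.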

bew avatar Jun 03 '18 22:06 bew

For the Compiler Explorer it would be very useful if the assembly and the IR can both be generated without actually compiling / linking that into a binary or .o file. One reason is that link failures should be tolerated there unless the code is ultimately executed as requested by the user.

HertzDevil avatar Jul 13 '21 11:07 HertzDevil

Agree with HertzDevil above. It would be useful to be able to emit and inspect the IR without needing to link

mattrbeck avatar Aug 23 '22 16:08 mattrbeck