Support Source Generators

Open praeclarum opened this issue 4 years ago • 71 comments

Add support similar to C# Source Generators

The idea is to execute the compiler in two passes:

  1. Pass 1: Parse and type check the project code (the type check may be optional as it will contain errors)
  2. Send that information to Source Generators that output new code files or syntax trees
  3. Pass 2: Combine all code, type check, emit
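
As a rough, language-agnostic illustration of the two-pass flow above, here is a minimal Python sketch. Everything in it (`parse`, `run_generators`, `compile_project`, and the `[<GenerateToString>]` marker) is hypothetical and only stands in for real compiler machinery; it is not how the F# or C# compiler is implemented.

```python
from typing import Callable, Dict, List

# A "project" is just file name -> source text in this toy model.
Project = Dict[str, str]
# A generator sees the parsed project and returns new files.
Generator = Callable[[Dict[str, List[str]]], Project]


def parse(project: Project) -> Dict[str, List[str]]:
    """Pass 1 (toy): 'parse' each file into a list of non-empty lines."""
    return {name: [ln for ln in src.splitlines() if ln.strip()]
            for name, src in project.items()}


def run_generators(parsed, generators: List[Generator]) -> Project:
    """Hand the pass-1 result to every generator and collect the new files."""
    generated: Project = {}
    for gen in generators:
        generated.update(gen(parsed))
    return generated


def compile_project(project: Project, generators: List[Generator]) -> Project:
    """Toy driver: returns the combined file set that pass 2 would type check."""
    parsed = parse(project)                          # pass 1
    generated = run_generators(parsed, generators)   # generator step
    return {**project, **generated}                  # input to pass 2


def to_string_generator(parsed) -> Project:
    """Hypothetical generator: emit a stub module for each type that is
    preceded by the made-up attribute [<GenerateToString>]."""
    out = {}
    for name, lines in parsed.items():
        for i, line in enumerate(lines):
            if line.strip() == "[<GenerateToString>]" and i + 1 < len(lines):
                type_name = lines[i + 1].split()[1]
                out[f"{type_name}.g.fs"] = (
                    f"module {type_name}Generated\n"
                    f"let describe () = \"{type_name}\"\n"
                )
    return out


combined = compile_project(
    {"Model.fs": "[<GenerateToString>]\ntype Person = { Name: string }\n"},
    [to_string_generator],
)
```

Here `combined` holds both the user's `Model.fs` and a generated `Person.g.fs`; a real implementation would then type check and emit that combined set.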

The existing ways of approaching this problem in F# are:

  1. Type Providers, which take specialized knowledge to author.
  2. Custom build steps to emit code

Pros and Cons

The advantages of making this adjustment to F# are an easy form of meta programming. It's basically all the benefits of type providers without the complexity.

The disadvantages are the repetition of a feature and the compiler performance penalty of executing the type checker twice when this feature is used.

Extra information

Estimated cost (XS, S, M, L, XL, XXL): M (depending on what data is passed to the generators)

Affidavit (please submit!)

Please tick this by placing a cross in the box:

  • [x] This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
  • [x] I have searched both open and closed suggestions on this site and believe this is not a duplicate
  • [x] This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

  • [x] This is not a breaking change to the F# language design
  • [x] I or my company would be willing to help implement and/or test this

praeclarum avatar Apr 29 '20 22:04 praeclarum

You can also write an F# script to generate an F# file today.

Happypig375 avatar Apr 30 '20 02:04 Happypig375

> It's basically all the benefits of type providers without the complexity.

A huge part of type providers is design-time support. Based on a quick look, source generators don't seem to be anything like that; more akin to/just an evolution of T4. I'd personally rather see https://github.com/fsharp/fslang-design/blob/master/RFCs/FS-1023-type-providers-generate-types-from-types.md than this.

kerams avatar Apr 30 '20 08:04 kerams

@kerams everybody seems to agree that type providers are enormously brittle (and that seems to be an understatement still), so having a standard, integrated, lightweight source generation facility would be more than welcome.

robkuz avatar Apr 30 '20 09:04 robkuz

</>

Krzysztof-Cieslak avatar Apr 30 '20 10:04 Krzysztof-Cieslak

I think it's worth targeting a mid-level IR/AST representation rather than IL or just generating source code directly.

Swoorup avatar Apr 30 '20 12:04 Swoorup

Completely agree. Myriad is a better way of handling all this.

realvictorprm avatar Apr 30 '20 13:04 realvictorprm

@Krzysztof-Cieslak I think what this suggestion wants is some form of generating code or types that is better than the current state of type providers. I agree that generating strings is not the best approach.

Myriad can be in the mix of things to discuss here, as is an upgrade to type providers, as is something similar to C# source generators. If there is any link to a high-level write-up of Myriad it would be useful so it can be considered. I can't find anything online apart from this blog about its development process.

charlesroddie avatar Apr 30 '20 13:04 charlesroddie

@kerams the Source Generators feature is currently in a very early preview. There's extensive design-time support planned, which you can read the beginnings of here: https://github.com/dotnet/roslyn/blob/master/docs/features/source-generators.md#ide-integration.

There are some interesting challenges to solve that are not unlike what Type Providers struggle with today. For example, you want generated source to be up to date, so the safest thing to do is regenerate on every keystroke. This is effectively what Type Providers do today, since they're asked to provide "fresh" types whenever the language service needs to re-typecheck things. The upside is correctness, but the downside is a huge hit to design-time performance. We somewhat work around this today in Type Providers with a series of caches that were added in the VS 2019 16.0/16.1 timeframe. They also serve as a band-aid around some more fundamental architectural flaws that force the TPSDK to hunt for and load big binaries in memory (when the compiler already has all the data it's looking for), leading to large Large Object Heap (LOH) allocations that ultimately kill IDE perf.
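
The trade-off described above (regenerate on every keystroke for correctness, or cache for performance) can be sketched in a few lines. This is a minimal Python illustration with hypothetical names (`CachingGenerator`, `expensive_generate`). It assumes the generator's output depends only on its input text, and it does not reflect how the Type Provider caches in Visual Studio are actually implemented.

```python
import hashlib
from typing import Callable, Dict


class CachingGenerator:
    """Wraps a generator function and reuses its output while the input is unchanged."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate
        self._cache: Dict[str, str] = {}
        self.runs = 0  # how many times the wrapped generator actually ran

    def get(self, source: str) -> str:
        # Key the cache on a hash of the input text.
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.runs += 1
            self._cache[key] = self._generate(source)
        return self._cache[key]


def expensive_generate(source: str) -> str:
    # Stand-in for real generation work.
    return f"// generated from {len(source)} characters of input\n"


gen = CachingGenerator(expensive_generate)
gen.get("type A = int")      # first request: the generator runs
gen.get("type A = int")      # same input again: served from the cache
gen.get("type A = string")   # input changed: the generator runs again
```

The cache is only sound if generation is a pure function of the hashed input; a generator that also reads external files or schemas would need those folded into the key. The sketch also never evicts entries, which a long-running IDE process would have to address.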

Currently the C# Source Generators simplify a lot of things by generating files in memory. But that offers some downsides; namely, not much C# tooling has a good understanding of them today. So there's a lot of design work that will probably go back and forth between compiler and IDE design until something acceptable emerges. Any F# implementation would likely utilize a similar mechanism once it stabilizes.

cartermp avatar Apr 30 '20 16:04 cartermp

As for the suggestion:

I think this is something we'll want to wait a bit for. Firstly, C# Source Generators are just in their first preview and could undergo complete design overhauls between previews based on feedback and scenarios that become more apparent. The current experience exists mostly so that early adopters can try things out, see what is missing or needs to change, and let the team know how it needs to change. There's also a lot of work to do in good IDE integration. Finally, although the string concatenation approach is extremely flexible, it's not necessarily going to stick or be the only way to do things. I personally prefer it to having to learn some complicated API with its own set of bugs and design flaws, but I could see how others would feel the opposite considering that there's pretty much no guarantees around correctness when you're just concatenating strings.

However, I suspect we'll eventually want to implement something that can "hook in" to the libraries that will ultimately end up using Source Generators. The blog post hints at some Microsoft frameworks and libraries adopting them. Realistically that won't happen for quite a while, largely because none of the "adopt source generators" work was costed for the .NET 5 timeframe. Perhaps an early prototype will emerge if Source Generators stabilize early enough. But in the long-term, I expect a lot of the .NET ecosystem to offer a "Source Generator path" for performance and better AOT-compilability. For F# to take part in these benefits, the F# compiler would also need a compatible feature. When the time comes, I think we'll likely look at a variety of options:

  • Encouraging the use of Myriad or a similar library, with some adjustments in the library and/or different frameworks to accept what it emits
  • Implement a new component for the compiler that, perhaps inspired by some of Myriad, allows for a more "F# way" to emit source code and/or other files
  • Integrate with the language, perhaps by extending Type Providers, to offer a very different experience for F# developers
  • Just copy whatever C# ultimately does

Table stakes would be ensuring that what is emitted can be consumed by .NET components so that F# developers can partake in the performance and AOT-ability gains. Post-.NET 5 it is highly likely that the .NET runtime team will focus heavily on AOT since it addresses a lot of pains people have with using .NET in production. This naturally means that some of the heavy reliance on reflection in the .NET ecosystem may need a replacement. A Source Generator-like mechanism could be a key part of that for F#. Seriously considering these things is at least a year away though.

cartermp avatar Apr 30 '20 16:04 cartermp

@charlesroddie if there's interest I'm happy to write more about how to use Myriad / how to create plugins for Myriad etc.

realvictorprm avatar Apr 30 '20 16:04 realvictorprm

I think the proper way is to do it like Nemerle macros.

OnurGumus avatar May 02 '20 14:05 OnurGumus

@OnurGumus I don't think we'll end up supporting syntactic macros: https://github.com/fsharp/fslang-suggestions/issues/210

cartermp avatar May 02 '20 16:05 cartermp

Hi everyone, thanks for considering this. I want to be clear where I stand: I don't think that Source Generators are a wündertool of meta programming. I do think it's a very practical solution to a very real problem. I appreciate everyone wanting to do original research on hygienic macros, but this suggestion is specifically not that. :-)

My thoughts on the above criticisms:

  • Doesn't integrate with the IDE like TPs: That's not true - the design specifically allows for excellent IDE integration. Ad hoc script generation thrown into project files does not. This pattern with multi-pass compiler support can give us an IDE experience just as simple as or better (an opportunity to see generated code) than type providers.

  • Just use <insert hygienic macro systems>: Yeah, not what I want. F# has tried the "let's do macros without doing macros" thing. It was a good experiment. But it's too easy to overcomplicate the solution to this problem. While I love that the F# community is made up of super programmers, it wouldn't be bad, just this once, to implement a simple feature with small ambitions.

  • We don't need this, just write a script: Anyone who has gone this route knows the problems. The biggest is always needing a bootstrapping compile run that always fails. You know about conflicts with other build steps and getting the order right. You know about finding paths in MSBuild environments. You know about file system permissions. Writing MSBuild tasks is also a fraught experience given the propensity of MSBuild to change the way it works.

  • ASTs are superior to text: No. We program using text editors for a reason. Forcing us to write against these will require a huge investment of time to learn the F# AST. Every programmer knows how to generate a text file. We have large powerful libraries for working with text. While ASTs can save you from a few syntactic errors (the easy part), they don't save you from anything else. It is a whole lot of pain to put on a programmer just for pedantry's sake.

Closing thoughts:

The C# ecosystem is about to be flooded with Source Generators. Already F# lags behind C# in tooling support. For example, Xamarin supports code generation for CoreML for C# but does not offer it for F#. The same is true for Storyboard support and XIB support and XML code behind. With every year, F# tooling falls behind C#. It is my opinion that it would benefit the F# community greatly to make writing tooling for F# easier.

praeclarum avatar May 02 '20 16:05 praeclarum

@praeclarum Just a note about tooling, I think that view is quite Xamarin-focused and not representative of where most developers are.

Xamarin is heavily C#-focused today, in large part due to how project integration tooling works, since it uses the so-called "legacy" project system and flavoring. This old technology is feature-rich but inflexible. Most of the .NET Core/Standard-based stuff uses a different, far more flexible system that has led to tooling that is about as equally available for F# as it is for C#. One example is the Azure-based tooling that is equally available for F# projects. Additionally, some of the excellent API design of things like ASP.NET Core has allowed for more F#-friendly entry points to emerge (ASP.NET Core supports F# async without requiring conversion to task, Giraffe and Saturn build directly atop its abstractions, etc.)

Perhaps future Xamarin components can be designed in a more pluggable way, like ASP.NET Core, and not require things like the enormous amount of work that was required to light up Fabulous (which also cannot plug into lots of the Visual Studio-based tooling). I anticipate it being easier to support F# in the future with Xamarin with the team moving their project integration tooling to the same system that .NET SDK-style projects use.

Tooling for consuming source generators written in C# is also something to consider, and this would fall square in the "F# team that does tooling" realm to implement. I expect this to be important as more are available.

cartermp avatar May 02 '20 17:05 cartermp

I will 100% concede that I work on an uncommon platform compared to the rest of the community, but I hope you'll welcome diverse perspectives. Plus, Microsoft states that Source Generators are the recommended solution to the problems linkers present in .NET - a problem Xamarin devs have over a decade of experience with that is now becoming a very real problem in .NET Core.

> Instead, build-time source generators will be the recommended mitigation for arbitrary reflection use. -Jan Kotas

I could have listed examples other than Xamarin. Protocol buffers could have been another example. I have been playing with a new version of sqlite-net that uses source generators (though I might end up with an IL masher to make it work with F#). I am also currently working on a library to assist mapping functional structures to object-oriented components (Fom) that could benefit from this technology.

Anyway, thanks again for the consideration!

praeclarum avatar May 02 '20 18:05 praeclarum

One small thing should be noted: file ordering puts F# at a disadvantage vs C# regarding source generators (or Myriad). If I have a generator that, say, generates serialization code for annotated types, in F# I can't declare a type and use its serialization within the same file, because the generated code would need to be in between. Whereas in C#, that's not a problem; there's just a cyclic reference between your file and the generated one. Type providers don't have this inconvenience either because they generate code at the right place.
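
To make the ordering constraint concrete, here is a toy Python model of top-down name resolution. It is only an illustration, under the simplifying assumption that a file may reference names declared in earlier files or in itself (ordering inside a file is ignored). The file names and the `first_unresolved` helper are made up.

```python
from typing import List, NamedTuple, Optional, Set


class SourceFile(NamedTuple):
    name: str
    declares: Set[str]
    uses: Set[str]


def first_unresolved(files: List[SourceFile], ordered: bool) -> Optional[str]:
    """Return a description of the first unresolved name, or None if all resolve.

    ordered=True  : a file sees only earlier files and itself (F#-like, simplified).
    ordered=False : every file sees every declaration (C#-like).
    """
    everything = set().union(*(f.declares for f in files))
    seen: Set[str] = set()
    for f in files:
        visible = (seen | f.declares) if ordered else everything
        missing = sorted(f.uses - visible)
        if missing:
            return f"{f.name}: {missing[0]}"
        seen |= f.declares
    return None


# The type and its use of the generated serializer live in one file.
model = SourceFile("Model.fs", {"Person"}, {"Person.serialize"})
generated = SourceFile("Model.g.fs", {"Person.serialize"}, {"Person"})

user_first = first_unresolved([model, generated], ordered=True)
generated_first = first_unresolved([generated, model], ordered=True)
unordered = first_unresolved([model, generated], ordered=False)

# Workaround in the ordered model: split declaration and use into two files.
types = SourceFile("Types.fs", {"Person"}, set())
app = SourceFile("App.fs", set(), {"Person.serialize"})
split = first_unresolved([types, generated, app], ordered=True)
```

In this model neither ordering of the two files resolves under top-down rules, the order-independent mode resolves immediately, and splitting the declaration from its use lets the generated file sit in between.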

> ASTs are superior to text: No. We program using text editors for a reason. Forcing us to write against these will require a huge investment of time to learn the F# AST. Every programmer knows how to generate a text file. We have large powerful libraries for working with text. While ASTs can save you from a few syntactic errors (the easy part), they don't save you from anything else. It is a whole lot of pain to put on a programmer just for pedantry's sake.

I don't fully agree with your points here, but I still think that generating text is a good idea for a simple reason: it's much easier to cater to both preferences by making a helper library that provides an AST and generates text, than the other way around.
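
As a sketch of that layering, the following Python snippet defines a tiny, hypothetical AST for F# record declarations (`Field`, `Record`) and a `render` function that turns it into text. A text-based generator API could accept the output of `render` unchanged, while authors who prefer plain strings could skip the helper entirely. The node types are invented for illustration and are unrelated to the real F# compiler AST.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Field:
    name: str
    type_name: str


@dataclass
class Record:
    name: str
    fields: List[Field]


def render(record: Record, indent: str = "    ") -> str:
    """Render a Record node as text intended to be an F# record declaration.

    Indentation is handled here once, so callers never concatenate
    whitespace-sensitive code by hand.
    """
    lines = [f"type {record.name} ="]
    lines.append(f"{indent}{{")
    for field in record.fields:
        lines.append(f"{indent}{indent}{field.name}: {field.type_name}")
    lines.append(f"{indent}}}")
    return "\n".join(lines) + "\n"


person = Record("Person", [Field("Name", "string"), Field("Age", "int")])
source_text = render(person)
```

Going the other way (recovering an AST from arbitrary generated text) would require a parser, which is why building the helper on top of a text interface is the cheaper direction.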

Tarmil avatar May 02 '20 20:05 Tarmil

@Tarmil Yeah, file ordering does limit what you could do with F# in that way. I think that scenario in particular would be confusing to never enable, but without doing something special like treating the file and the generated file as if their constructs were recursively declared, it would be the way things are.

Another thing to consider is what allowing one generator to depend on the output of another would look like. This implies another form of ordering, which I'm not particularly fond of given how top-down ordering is already difficult for beginners to grok.

cartermp avatar May 02 '20 22:05 cartermp

Interesting point about file ordering @Tarmil .

Type providers have better safety than source generators as you can provide them with the input directly. They don't analyze your entire source (unless you are crazy enough to point them at your .fs files) and you only use the results in the places you specify. They suit F# as a safer, more explicit language.

Enhancements to type providers mentioned here:

  1. Making them easier to write. (AST helpers? A type provider to generate an AST from a string @praeclarum ?),
  2. Performance in IDEs, including compiling only when needed.
  3. Generate types from types (@kerams). This would deal with some of the cases here, in particular serialization (to replace reflection) including protocol buffers.

How much would remain if this work were done?

charlesroddie avatar May 03 '20 17:05 charlesroddie

> Storyboard support and XIB support and XML

Here it's a matter of interop because the economics don't support F#-specific solutions for everything.

Can type providers be used in C#? Imagine we relegate erasing type providers to a historical footnote. Then you can use them by referencing F# projects. Could they be used directly? For example, you write some annotation [TypeProvider(TypeProvider,TypeOrStringToAnalyze)] in a C# project. Then a C# source generator looks at it and automatically gets the type provider to generate the type, which gets compiled to IL and referenced in the source generation step.

Can C# source generators be used in F#? Say you have a ProtoBuf generator which takes the source for a type as input. Then from F# you have a type, compile it to IL, decompile it to C#, feed it to the source generator, get enlarged source as output, compile it to IL, and reference it. Feasible or too many steps?

I agree with @praeclarum that we need to think hard about how people using these language features can create .NET solutions rather than language-specific solutions.

charlesroddie avatar May 03 '20 17:05 charlesroddie

I'm happy Myriad was mentioned here. Feel free to add any ideas, improvements, etc. to the issues: https://github.com/MoiraeSoftware/myriad

7sharp9 avatar May 04 '20 10:05 7sharp9

One reason I dislike text-based generation is that it adds the complexity of multi-pass compilation and adds to the performance bloat in the compiler. Another reason is that, with F# being a whitespace-sensitive language, it will probably be incredibly hard to get source generation right. I feel Myriad provides a good base to build features on top of, and the suggestion to pass types to TPs is the way forward.

Swoorup avatar May 04 '20 10:05 Swoorup

When I was building Myriad I did think of removing the quotation aspect of Type Providers and instead having just AST input rather than quotations. I think quotations not quite mapping 1 to 1 over the F# language can be a big limitation with regards to generating source, especially as quotations transform the input into a quoted form and cannot represent types either. Myriad could be called as part of the compile chain as there is an input into the compiler accepting an AST. Currently it can be integrated via MSBuild or by calling it directly with the CLI tool.

7sharp9 avatar May 04 '20 10:05 7sharp9

> When I was building Myriad I did think of removing the quotation aspect of Type Providers and instead having just AST input rather than quotations. I think quotations not quite mapping 1 to 1 over the F# language can be a big limitation with regards to generating source, especially as quotations transform the input into a quoted form and cannot represent types either. Myriad could be called as part of the compile chain as there is an input into the compiler accepting an AST. Currently it can be integrated via MSBuild or by calling it directly with the CLI tool.

I hope we could deal with existing ASTs instead of creating more and more of them. https://github.com/fsharp/FSharp.Compiler.Service/issues/938

Thorium avatar May 07 '20 11:05 Thorium

There are the typed and untyped ASTs. The typed AST is not user-constructible, so that only really leaves the untyped one, which also has an entry path into the compiler and Fantomas for turning it back into F# source.

The typed AST is only really currently useful for transpiling an F# AST to another language, as it has no API for modification or construction.

You can convert a quotation back to an AST, but the process is not perfect as data is lost in the initial quotation literal process. Programmatic quotation construction does not cover the whole F# language either, so it's not ideal.

7sharp9 avatar May 07 '20 11:05 7sharp9

Now that we're talking about AST... :) I did a lot of C# expression tree (and the lambda syntax sugar) metaprogramming, and really wish the F# counterpart were on par with that. A full-fledged AST that is convertible with quotations and can compile and run would be even more useful than source generators in my opinion:

  • Runs faster, and does it right (no parser involvement and no syntax issues)
  • Composable (think about embedding a piece of code in a source generator? escapes? indents? name scopes? side effects? all kinds of stuff. With AST it's just children, dictionaries etc.)
  • Usable at both design-time and run-time. (a source generator in the runtime, without proper AST, is a security issue, and prone to injection attacks)
  • Easy to type-annotate.

yatli avatar May 07 '20 11:05 yatli

btw, a lot of projects (MS Bond, protobuf, GraphEngine etc.) already have this code generation workflow by using custom MSBuild tasks. So I don't think the workflow is something new, but how the mechanism generates executable bits (source? AST? etc.) is to be carefully designed.

I wrote both versions of codegen for GraphEngine (it generates millions of lines of code for modeling strongly-typed knowledge graph) -- the first version is done in C# with string concatenation and the coding/debugging experience is horrible.

In the second version I came up with something pretty unique -- it's a meta-template system that generates code generators. I made rules that the meta templates must compile fine themselves, with the "holes" properly annotated. The meta generator then transforms the meta templates into generators, which take user input and generate source code.

If F# is indeed going to implement source generators, I hope it is not a CodeDom-style API (compile: string -> Assembly) but more like the "meta generator" :)

yatli avatar May 07 '20 12:05 yatli

@yatli Have a look at Myriad, I recently updated the readme to be a little more descriptive.

7sharp9 avatar May 07 '20 12:05 7sharp9

@7sharp9 thanks! I surfed through the README.md and also your blog. It takes types as input, uses plugins to generate an AST, and then translates that back to source code, right?

yatli avatar May 07 '20 13:05 yatli

> The typed AST is only really currently useful for transpiling an F# AST to another language as it has no API for modification or construction

Could some kind of typed AST construction API be the right way to continue? (I'm sure this needs to be approved in principle first, not just a new PR.) The untyped AST needs quite a lot of work to be used (and is potentially even a bit dangerous: the parser should really understand everything, as F# is not a side-effect-free language).

Thorium avatar May 07 '20 13:05 Thorium

@yatli Technically it takes an AST as input, then creates an AST fragment from it, then translates that to source code, which is included in your project. It could take other things as input too; it's just that the current API takes an input file/AST.

7sharp9 avatar May 07 '20 13:05 7sharp9