c3c What should the JSON output be for?

There are several things the compiler output could be used for:

1. Feed to a doc creator
2. Feed to an LSP
3. Feed to tooling for generating information about the code (e.g. "Go to definition")
4. Feed to conversion tools
5. AST inspector
6. Other uses

These all pull in different directions.

What we need to know is: (a) what the output should be, and (b) how the data should be organized.

I am opening this for discussion.

Jul 30 '25 14:07 lerno

Something like clang's JSON Compilation Database would be nice.

Jul 30 '25 17:07 waveproc

> Something like clang's JSON Compilation Database would be nice.

Yes, this would also be useful, but I think it should be a separate option altogether.

My opinion is that it should basically be number 2. I'm thinking of what the -E option already outputs, so here's my reasoning: numbers 2 and 3 are basically the same thing, since an LSP gives client editors the ability to navigate to definitions, etc. Also, we already have --lsp for error data, which would round out the LSP side. Number 1 is what I think an external tool would be needed for. It doesn't need the entire AST; a lexer and a simple reading of the comments in source files is enough for a doc creator. The real debate is about numbers 4 and 5. If those are the target uses, then the AST should include not only what it includes now (types, modules, functions, etc.) but also the contents of those functions and the code compilation. If LSP use is intended instead, we only need what we already have now, plus the source span (i.e. the position of function, type, and module definitions...).
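To make the source-span idea concrete, here is a minimal sketch of what a declaration entry with a span could look like, and how a "Go to definition" lookup might consume it. The field names (`span`, `file`, `line`, `col`), the `demo.c3` file, and the `demo::` names are all assumptions made up for illustration; they are not taken from what -E actually emits.

```python
import json

# Hypothetical shape for the JSON output: each declaration carries a "span".
# None of these field names come from c3c's real output.
SAMPLE = """
{
  "functions": {
    "demo::add":  {"span": {"file": "demo.c3", "line": 3, "col": 1}},
    "demo::main": {"span": {"file": "demo.c3", "line": 8, "col": 1}}
  },
  "types": {
    "demo::Point": {"span": {"file": "demo.c3", "line": 1, "col": 1}}
  }
}
"""

def build_definition_index(doc):
    """Flatten every declaration kind into one name -> span table."""
    index = {}
    for kind in ("functions", "types"):
        for name, decl in doc.get(kind, {}).items():
            index[name] = decl["span"]
    return index

def goto_definition(index, name):
    """Return (file, line, col) for a qualified name, or None if unknown."""
    span = index.get(name)
    if span is None:
        return None
    return (span["file"], span["line"], span["col"])

index = build_definition_index(json.loads(SAMPLE))
print(goto_definition(index, "demo::add"))
```

With this shape, an editor client only needs the flat name-to-span table; nothing about function bodies is required.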

I believe we should just target the LSP first, since it's basically already done and easy to finish. Later, if we want, we can add another option that outputs the ENTIRE AST, including code compilation.

Jul 30 '25 23:07 Snifexx

Well, this is just the parser information. If you want something to consume for LSP, it should presumably go through analysis too, but at that point some of the data is actually consumed, so it's not simple to map it back.

Jul 31 '25 14:07 lerno

Sorry if I'm misunderstanding your answer, but it seems to me that emitting the parser data as JSON in 'emit_json' doesn't necessarily consume the parser data; it simply formats it into suitable JSON and outputs it. As of now, when using the -E option, the compiler simply returns a success code after outputting. If the problem is not being able to return parser data after going through analysis, we could simply format to JSON before analysis, keep that formatted string, and later append the output of the analysis that consumes the parser data.
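As a rough sketch of that ordering (snapshot the parser data as JSON first, run analysis afterwards, then append the analysis output), assuming a toy parser and a toy analysis pass that are stand-ins and not c3c's real ones:

```python
import json

def parse(source):
    """Stand-in parser: one declaration per 'const NAME = VALUE' line."""
    decls = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        parts = line.split()
        if len(parts) == 4 and parts[0] == "const" and parts[2] == "=":
            decls.append({"name": parts[1], "value": parts[3], "line": line_no})
    return decls

def analyse(decls):
    """Stand-in analysis that mutates the parser data in place."""
    for decl in decls:
        decl["value"] = int(decl["value"])  # the raw text is overwritten
    return {"constant_count": len(decls)}

def emit(source):
    """Format parser data before analysis, then append the analysis result."""
    decls = parse(source)
    parser_json = json.dumps(decls)   # snapshot taken before analysis
    analysis = analyse(decls)         # free to consume or mutate decls now
    return json.dumps({"parser": json.loads(parser_json), "analysis": analysis})

print(emit("const FOO = 3\nconst BAR = 4"))
```

The snapshot keeps what the parser saw even though analysis rewrites the declarations afterwards, but it can only contain what was known at parse time.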

Aug 01 '25 14:08 Snifexx

Yes, but think about wanting to create mappings. For example, consider int x = FOO

Now in the AST pass before semantic checking this is just "some constant FOO", and for the LSP we want to know the mapping, right? But the semantic check will look at FOO, find it's a constant 3 and then FOLD it so that the expression now is int x = 3.

So how do we feed the compiler's knowledge of where FOO is defined into the LSP? It's not known before analysis and it's gone after analysis; it's only available during analysis. Supporting that kind of information would therefore require fairly invasive reporting in "LSP" mode, which would add complexity all around.
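A toy model of the problem, not c3c's actual AST or semantic pass: the `Ident`, `Const`, and `VarDecl` classes, the `CONSTANTS` table, and the `references` hook are all hypothetical. It shows that folding replaces the identifier in place, and that the only point where the use-to-definition link can be recorded is inside the analysis step itself.

```python
from dataclasses import dataclass

@dataclass
class Ident:
    name: str
    line: int

@dataclass
class Const:
    value: int

@dataclass
class VarDecl:
    name: str
    init: object  # Ident before analysis, Const after folding

# Hypothetical symbol table: constant name -> (value, line where defined)
CONSTANTS = {"FOO": (3, 1)}

def analyse(decl, references=None):
    """Fold constant identifiers in place.

    If `references` is a list, record use site -> definition site before
    the identifier is replaced. This is the extra "LSP mode" bookkeeping.
    """
    if isinstance(decl.init, Ident) and decl.init.name in CONSTANTS:
        value, def_line = CONSTANTS[decl.init.name]
        if references is not None:
            references.append(
                {"name": decl.init.name,
                 "use_line": decl.init.line,
                 "def_line": def_line}
            )
        decl.init = Const(value)  # the link back to FOO is lost here
    return decl

refs = []
decl = analyse(VarDecl("x", Ident("FOO", line=5)), references=refs)
print(decl.init)
print(refs)
```

Even in this tiny version the recording has to sit right next to the fold, which hints at how many places in a real analysis pass would need a similar hook.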

Aug 01 '25 18:08 lerno

Ahh, yes, I understand now, you're right. The only thing I can think of is literally running the compiler twice, which is ridiculous...

Aug 01 '25 19:08 Snifexx