carbon-lang icon indicating copy to clipboard operation
carbon-lang copied to clipboard

High level design of migrate_cpp

Open asoffer opened this issue 3 years ago • 3 comments

I wanted to ask about the high-level design of the C++ -> Carbon migration tool. It seems to be largely based on a collection of Clang AST-matchers followed by, for each match, an edit generation mechanism. Many clang-tidy checks follow this pattern. My question is whether other designs were considered and what the rational was for this particular design.

For context, because this is a bit of an unfairly leading question, I have suspicions that this pattern will become overly complex in the long-term. In the refactoring tools I have built in the past, AST matchers work well for clang-tidy-ish things where you are targeting a specific statement or expression and the rewrite doesn't need much state carried around. For larger more complex tasks, clang::RecursiveASTVisitor tends to be a much simpler (and more performant) approach.

I'm happy to go into detail about my concerns and why I would expect a different approach to be more effective. In fact, I'd be happy to even reimplement it this way. But I wanted to first understand if there were design constraints or context I was unaware of, and also make sure, if the direction were to shift in design, that there was consensus before I started doing so.

asoffer avatar Jul 21 '22 23:07 asoffer

I think it's reasonable to rewrite. The current approach isn't something we're committed to. As chandlerc noted on Discord, we stopped focusing on this because (while private) we felt spread to thin -- but if the migration tool is really motivating to you (and I believe you have experience in this area) it's reasonable to invest time on it now since it's part of the key interop/migration goals of Carbon.

I'll note, part of why I stopped on this too is it felt like it should change substantially. If you follow up with me directly, I can try digging up some context for you.

Some notes on the tradeoffs would be good for the community, in part as a record of what was considered in the design.

Maybe a short design, maybe in Docs for ease of writing? Contact me directly with an email for contributor access to the shared folder.

jonmeow avatar Jul 22 '22 13:07 jonmeow

I think the big limitation of using AST matcher rewrites for this kind of tool is that they don't compose well. If we have a bunch of different matchers doing different rewrites to the same region of code, when rewriting C++ to C++, we can iterate on them until they converge: do some rewrite, re-parse the result, do another rewrite, rinse, repeat. But when rewriting C++ to Carbon, we can't do that -- once the first matcher runs and emits some replacements, we can end up with code that is not valid C++ nor valid Carbon, and we have no tool to turn that back into a parse tree on which we can do further work.

I think we will find that we need a tool that can take a complete C++ AST for at least an entire source file and rewrite it into Carbon in one go, perhaps by walking the AST and emitting corresponding code as it goes, or perhaps by constructing an intermediate Carbonized representation of the program and then pretty-printing that once it's been suitably fixed up and polished. Probably one of the harder parts of this process will be retaining comments and attaching them to the right parts of the Carbonized representation so that they can be printed in the right places in the resulting code.

zygoloid avatar Aug 25 '22 21:08 zygoloid

@zygoloid FYI #2041 (RecursiveASTVisitor)

jonmeow avatar Aug 25 '22 21:08 jonmeow

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time. This issue is labeled inactive because the last activity was over 90 days ago.

github-actions[bot] avatar Nov 24 '22 02:11 github-actions[bot]

@asoffer has merged some initial work towards RecursiveASTVisitor, and IIUC may get back to it but has been busy. I think we're likely to keep iterating it that direction so I don't think there's much more to discuss here; I'm closing this issue.

jonmeow avatar Nov 28 '22 20:11 jonmeow