clangir icon indicating copy to clipboard operation
clangir copied to clipboard

[CIR][Transforms] Introduce StdVectorCtorOp & StdVectorDtorOp

Open bruteforceboy opened this issue 5 months ago • 7 comments

In this PR, I continue work on Issue#1653.

This PR introduces std.vector_cxx_ctor and std.vector_cxx_dtor building on the special member attributes from PR#1711. Extending the implementation for other std constructors/destructors is quite easy too. For example, to introduce std::map, the only changes needed are:

+++ b/clang/include/clang/CIR/Dialect/IR/CIRStdOps.td
+def CIR_StdMapCtorOp: CIR_StdOp<"map_cxx_ctor",
+  (ins CIR_AnyType:$first),
+  (outs Optional<CIR_AnyType>:$result)>;

+++ b/clang/lib/CIR/Dialect/Transforms/IdiomRecognizer.cpp
@@ -239,7 +239,7 @@ void IdiomRecognizerPass::recognizeCall(CallOp call) {
   using StdFunctionsRecognizer =
       std::tuple<StdRecognizer<StdFindOp>, StdRecognizer<StdVectorCtorOp>,
-                 StdRecognizer<StdVectorDtorOp>>;
+                 StdRecognizer<StdVectorDtorOp>, StdRecognizer<StdMapCtorOp>>;

To get this to work, I updated the CXXCtorAttr and CXXDtorAttr special member attributes to also have the source clang::RecordDecl. A few things that may be worth discussing are:

  • Do we get rid of the mlir::Type in both attributes now that we have clang::RecordDecl?
  • Should we also print/parse the clang::RecordDecl? For now, I think it's unnecessary, because it is pretty much the same information as the mlir::Type.

I have also added one test to demo the added operations. Please, let me know your thoughts. Thanks!

bruteforceboy avatar Jul 29 '25 18:07 bruteforceboy

It's also possible if in the future we want to generalize ctor/dtors in a different way, but I think this is a good start to get more data on how these things are going to be useful.

(cc @tommymcm)

bcardosolopes avatar Aug 07 '25 17:08 bcardosolopes

  • Do we get rid of the mlir::Type in both attributes now that we have clang::RecordDecl?

How hard would be to use mlir::Type to get the same information this PR gets from clang::RecordDecl? Would we need to augment cir.record much? We probably want to keep the record type lightweight, but it's a tradeoff.

bcardosolopes avatar Aug 07 '25 17:08 bcardosolopes

  • Should we also print/parse the clang::RecordDecl? For now, I think it's unnecessary, because it is pretty much the same information as the mlir::Type.

This is harder than it seems, because for that we'd want to use AST serialization/deserialization (and be able to roundtrip tests, etc) - this is a known limitation of keeping the AST around. As you noticed, it forces us into writing tests from source instead of CIR to CIR.

bcardosolopes avatar Aug 07 '25 17:08 bcardosolopes

(cc @andykaylor)

bcardosolopes avatar Aug 07 '25 17:08 bcardosolopes

  • Do we get rid of the mlir::Type in both attributes now that we have clang::RecordDecl?

How hard would be to use mlir::Type to get the same information this PR gets from clang::RecordDecl? Would we need to augment cir.record much? We probably want to keep the record type lightweight, but it's a tradeoff.

Would it be better to just introduce a cir.std.vector type at this point? This would also let us clean up duplicated ops: instead of having a ctor/dtor op for each std type we care about, we could have a single cir.ctor with a type specifier , so cir.std.vector.ctor would become cir.ctor ... : cir.ptr<cir.std.vector<...>>.

I am interested in helping out with this if folks think its a good direction.

(I don't think that this is necessary for this PR, I agree with Bruno that having something working with statistics will help us know what needs to be changed/refined)

tommymcm avatar Aug 07 '25 18:08 tommymcm

Why do we need a vector-specific CIR_StdVectorCtorOp rather than a general CIR_StdCtrOp that takes the class type as an argument/attribute. Which leads me to, why does this need to be limited to std rather than just a general CIR_CallCtorOp with an attribute telling you what it's constructing in a way that the constructee could easily be recognized as a standard libary class.

I tend more towards the CIR_CallCtorOp design (with maybe richer input type that already decoded this-is-a-std-vector-class, like suggested by @tommymcm). It's worth noting that by using the more generic form CIR_CallCtorOp instead of CIR_StdVectorCtorOp we'd be basically shifting the idiom recognition of a std vector ctor to be done whenever we want to check for a specific ctor versus doing that logic only once during creation time of the operation.

The only existing use of ctor information so far is in the lifetime checker and it gets by easily by looking into the called function for the ctor attribute. @bruteforceboy should we wait until you introduce the uses so we can see how much we'd benefit from this?

bcardosolopes avatar Aug 12 '25 01:08 bcardosolopes

Thanks for the suggestions @bcardosolopes @tommymcm @andykaylor

Sorry, I took a very long break because of travels/relocation.

@bcardosolopes I agree with your last comment and I think it makes more sense to add CIR_CallCtorOp. The purpose of this PR was so I could continue working on the vector optimization, since we need to find the constructors/destructors for std::vector during the optimization.

I will take the next few days to think a bit more about the design here, and hopefully reopen this PR with a couple of changes.

bruteforceboy avatar Aug 22 '25 20:08 bruteforceboy