clangir icon indicating copy to clipboard operation
clangir copied to clipboard

How should cir.base_class_addr work

Open Lancern opened this issue 1 year ago • 2 comments

Recently I'm trying to implement CIRGenFunction::getAddressOfBaseClass.

https://github.com/llvm/clangir/blob/38e962d9fc72d626ec43f4006011f85d5f14dc02/clang/lib/CIR/CodeGen/CIRGenClass.cpp#L1353-L1358

Given the address of a derived class object, this function generates necessary CIR operations to convert it into an address of a derived class object. I noticed that there is a CIR operation exactly for this purpose: cir.base_class_addr. However:

  • Current implementation does not seem to generate cir.base_class_addr except for when the base class subobject is at the beginning of the derived class object. In other cases, it emits a series of pointer operations (currently unimplemented) on the derived class object address and then emits a bitcast.
  • cir.StructType does not encode any inheritance information so cir.base_class_addr shoud not know how to do the conversion.

Based on the points above, what is the expected approach to perform derived-to-base casts? Should we make cir.base_class_addr work and replace those pointer adjustment operations with a single cir.base_class_addr operation?

Lancern avatar Jan 03 '24 15:01 Lancern

  • Current implementation does not seem to generate cir.base_class_addr except for when the base class subobject is at the beginning of the derived class object.

Looks like we only have the most simple thing implemented.

In other cases, it emits a series of pointer operations (currently unimplemented) on the derived class object address and then emits a bitcast.

Ideally we should make this emit cir.base_class_addr and during LoweringPrepare lower down to what we currently emit.

  • cir.StructType does not encode any inheritance information so cir.base_class_addr shoud not know how to do the conversion.

This can be changed/extended by either changing struct type or using more operations to tie up inheritance information, changing the struct type sounds like the more natural approach, but the actual work on fixing this issue will tell us better how to model this (if things change).

Based on the points above, what is the expected approach to perform derived-to-base casts? Should we make cir.base_class_addr work and replace those pointer adjustment operations with a single cir.base_class_addr operation?

Yep.

@smeenai might be resuming work on this for us. This is the roadmap I suggest:

  1. Lower the current existing cir.base_class_addr in LoweringPrepare. If we find out we are missing information and need to change the operation, this is part of the game.
  2. Get the existing series of pointer operations to use cir.base_class_addr instead, emit current existing code in LoweringPrepare. This might require improving either the struct type and/or augmenting the definition of cir.base_class_addr.
  3. Whatever works comes next.

One quick note is that we can always use the AST during lowering prepare, so there's a balance between adding to much information into an operations versus reusing information directly from the AST, depending on the design choices.

bcardosolopes avatar Sep 24 '24 18:09 bcardosolopes

I'm noting down what I've researched here so far. This is probably old news for everyone else, I'm just writing it down for my own benefit.

Laying out records (classes, structs, and unions) is initially handled by RecordLayoutBuilder, which handles the ABI layout rules. This generates an ASTRecordLayout, which is then used by CGRecordLayoutBuilder or CIRRecordLayoutBuilder to convert the record layout into an LLVM or ClangIR struct (respectively). ClangIR structs model LLVM structs quite closely right now, which is convenient for lowering but might not be convenient for high-level operations. That's hypothetical until we know what sorts of high-level operations might be useful though; one that comes to mind is converting arrays of structs to structs of arrays, similar to Zig, although that's being attempted at the LLVM level anyway, and I haven't thought at all about what representational changes (if any) would be useful for it.

It's not directly relevant to this issue, but vtables are similarly laid out as arrays of function pointers instead of a higher-level type. That was discussed in https://github.com/llvm/clangir/pull/569#discussion_r1581866568, but I don't think anything has changed on that front yet.

For cast expressions, Sema builds up a path of the base classes involved in the cast; for a derived-to-base cast, the path starts from the derived class (but excludes the derived class itself) and ends at (and includes) the base class. (Code references: Sema::CheckDerivedToBaseConversion, CXXRecordDecl::isDerivedFrom, CXXBasePaths::lookupInBases.) From reading the code, I'm pretty sure the order is derived-to-base even for a base-to-derived cast, but I haven't verified that empirically. If there are any virtual bases involved, the path starts at the virtual base closest to the base class (which is possibly the base class itself), and CodeGen relies on this.

The CodeGen part is handled by CodeGenFunction::GetAddressOfBaseClass, which walks the cast path to determine and apply the virtual and non-virtual offsets. Note that the virtual offset has to be looked up at runtime via the vtable (or vbtable in the Microsoft ABI) except for final classes, so our choice of CIR struct representation doesn't affect anything on that front. The virtual offset lookup uses types in the AST (VTableContextBase and friends) to find the appropriate offsets.

Everything so far can be handled pretty straightforwardly by using the AST node attached to a cir.struct, and given that it's mostly dealing with information computed by the AST, I think it's the right way to go (vs. duplicating that information). LoweringPrepare has its own LoweringPrepareItaniumCXXABI class, so we'd have to implement the virtual offset lookup logic there, but I think it should be doable.

The non-virtual offset application is a bit more interesting. CodeGen walks the full cast path to determine the offsets to apply at each step, whereas cir.base_class_addr is only recording the cast destination (https://godbolt.org/z/Pazffnzve). I can see a few ways to make that work; what do people think is the best option here?

  • Have cir.base_class_addr record the full cast path instead of just the destination.
  • Have cir.base_class_addr store an AST reference to the CastExpr so it can use it to access the cast path.
  • Generate multiple cir.base_class_addr ops, one for each step in the cast path.
  • Record some additional information in cir.struct. Note that ambiguous cases like https://godbolt.org/z/od97ahWEz need additional disambiguating casts anyway, so I don't think we need to worry about those. There's a Microsoft extension to access direct ambiguous base classes (https://godbolt.org/z/Pnex5f8MT), so it should be sufficient to record the direct base class offset (if any) for any ambiguous base classes.

My inclination would be changing cir.base_class_addr here (I don't have a strong preference between the three possibilities), but I'm open to changing cir.struct too (and of course any other options I haven't though of). Handling virtual base classes will require some modification to cir.base_class_addr regardless, since we need to record the virtual base in some way, unless we lower that part in CIRGen directly instead of as part of the base_class_addr op.

One last thought is that we won't have access to the AST during actual lowering, so relying on the AST (if we go that route) necessitates handling this op before lowering. LoweringPrepare runs before flattening, so we'd be losing high-level information in flat CIR. We're already doing similar lowering in LoweringPrepare though (e.g. for dynamic_cast), so I imagine that's not a concern here either.

smeenai avatar Sep 27 '24 07:09 smeenai

As a side note to this thread, cir.base_class_addr is not the only operation who is interested in the conversion path. Another operation also interested in this information is the conversion between member pointers, where the offset of the base class subobject is also essential in completing the operation. So any approaches proposed here may also apply to that operation.

Lancern avatar Sep 27 '24 18:09 Lancern

The first step I'm working on here (as suggested by Bruno) is lowering the existing case where cir.base_class_addr is emitted, i.e. when it's a zero-offset type cast. We could just lower it to a pure type cast for now (knowing that it'll only be emitted for that case), but that feels like a bit of a cop-out. Doing any sort of actual lowering would require having access to the full cast path, which I could think of doing in the following ways (from my previous comment):

  • Have cir.base_class_addr record the full cast path instead of just the destination.
  • Have cir.base_class_addr store an AST reference to the CastExpr so it can use it to access the cast path.
  • Generate multiple cir.base_class_addr ops, one for each step in the cast path.

Alternatively, we could record both direct and transitive base offset information in cir.struct itself, but Clang in general seems to prefer to keep only information about direct members and bases and walk the inheritance chain to get information from parent classes (which makes sense IMO, since storing transitive information duplicates things and can blow up in degenerate cases).

@Lancern, you mentioned handling the conversion between member pointers. At least for simple cases like https://godbolt.org/z/45oKh3584, the AST contains a derived-to-base cast before the member pointer dereference, so we should still end up in the cir.base_class_addr path. Did you have cases in mind where we'd bypass the cast and the dereference itself would need access to type conversion info?

@bcardosolopes, @Lancern, what do you think between the three alternatives above, or do you have suggestions for something better?

smeenai avatar Sep 30 '24 18:09 smeenai

@smeenai, Re first message:

Everything so far can be handled pretty straightforwardly by using the AST node attached to a cir.struct, and given that it's mostly dealing with information computed by the AST, I think it's the right way to go (vs. duplicating that information). LoweringPrepare has its own LoweringPrepareItaniumCXXABI class, so we'd have to implement the virtual offset lookup logic there, but I think it should be doable.

Sounds good to me!

My inclination would be changing cir.base_class_addr here (I don't have a strong preference between the three possibilities), but I'm open to changing cir.struct too (and of course any other options I haven't though of).

I also believe that we can pull this off without changing the struct type, that would be best, it will have a memory cost to encode this on a very common type.

Handling virtual base classes will require some modification to cir.base_class_addr regardless, since we need to record the virtual base in some way, unless we lower that part in CIRGen directly instead of as part of the base_class_addr op.

Sounds good to me too.

so we'd be losing high-level information in flat CIR

Yep, but that's on purpose.

bcardosolopes avatar Oct 04 '24 00:10 bcardosolopes

@smeenai, Re second message:

  • Have cir.base_class_addr record the full cast path instead of just the destination.

I like this one better. Already helps to decouple a bit from the AST. Still a question whether this is enough info for lowering later or if we'll end up having to encode more, or postpone more.

  • Have cir.base_class_addr store an AST reference to the CastExpr so it can use it to access the cast path.

Depending on the CastExpr content it might be harder to tackle this post CIRGen, but if feasible is also a good route.

  • Generate multiple cir.base_class_addr ops, one for each step in the cast path.

Probably having one operation with the full path might be better if the intermediate results are not used.

bcardosolopes avatar Oct 04 '24 00:10 bcardosolopes

I implemented cir.base_class_addr, including for non-zero offsets, in PR #937, not knowing that this issue existed. I puzzled over some of the same questions, and changed my mind about four times on how to proceed when working on it. Here is the approach I finally landed on:

Conversions to virtual base classes are not implemented in that PR. But when they are implemented, I don't think they should use cir.base_class_addr. I am fine with having CIR code gen directly create the loads and ptr_strides necessary to get the correct address from the virtual table. If we want to keep the higher-level semantics of a base class conversion in CIR, then I suggest a new cir.virtual_base_class_addr operation, since a virtual base class is such a different thing than a regular base class.

To get the correct offset during lowering, I modified cir.base_class_addr, but not in any of the ways that @smeenai suggested. I added a field to store the byte offset. The front end can easily compute the offset, since it still has the entire base class chain. Lowering only needs the offset, not the entire base class chain. So it seemed easiest to store just the byte offset, not the entire chain.

cir.base_class_addr also stores whether or not a null-pointer check is necessary. The null-pointer check seems like an implementation detail, best done late in the lowering process. But whether a null-pointer check is even necessary is based a lot on context, so only the front end knows if it is needed. So it is easiest to store that in the operation.

dkolsen-pgi avatar Oct 04 '24 00:10 dkolsen-pgi

Within the compiler-generated part of a constructor or destructor, I changed CIR code gen to use a cir.base_class_addr instead of an incorrect cir.ptr_stride. There might be other places that use a cir.ptr_stride that should also be changed to use cir.base_class_addr, but I didn't go looking for those.

dkolsen-pgi avatar Oct 04 '24 00:10 dkolsen-pgi

Yeah, the design point of splitting responsibilities between CIRGen and Lowering hadn't occurred to me :) A few thoughts:

  • I'm a little torn on putting the byte offset in the base_class_addr op. On the one hand, it's pragmatic and eases lowering. On the other hand, it feels like a pretty low-level detail to be putting into an op, and it opens up the possibility of the offset argument not matching the base class argument (though that'd be a clear code generation bug). I guess it depends on what level we want to keep CIR ops at, and I'm still learning about that myself, so I'd love to hear other people's thoughts here.
  • The byte offset also threw me off initially because I thought we usually did struct getelementptrs in terms of their fields instead of their bytes, but it turns out regular CodeGen uses fields for field access and bytes for parent class access (https://godbolt.org/z/1bTTKM4Wb), so we're actually matching it there. There's a plan to remove getelementptr anyway, so it's ultimately moot.
  • I noticed you added a direct lowering for the op, whereas we were discussing implementing that in LoweringPrepare here (though that was also necessary because of the usage of AST information, which your design avoids). I'd still be curious to learn if there's reasons to prefer LoweringPrepare vs. direct lowering with your design.
  • Though we aren't concerned about it in this issue, I'm similarly interested in what people think about a separate virtual_base_class_addr op vs. reusing base_class_addr (perhaps with an attribute to distinguish virtual vs. non-virtual lookups). Like you said, a virtual base class is a pretty different operation (in particular it involves a runtime calculation instead of a compile-time one), and having that reflected in the IR might be useful. At the same time, it's still a base class cast, and we might want to treat all base class casts uniformly in some cases. That's a problem for later though.

smeenai avatar Oct 04 '24 05:10 smeenai

I strive to find the right balance between theoretical purity and practical usefulness. I think storing the offset in the base_class_addr op strikes that balance. The front end already needs to know about class layouts so it can implement sizeof, alignof and __builtin_offsetof as compile-time values. The offset of a base class has already been computed by the front end. The offset shouldn't be thrown away when creating CIR only to have a later pass have to recompute it. The offset is a small amount of information we know will be needed later and that is non-trivial to compute. As a practical matter that's the kind of thing you want to cache.

I am unfamiliar with the LoweringPrepare pass. I haven't touched it yet. If this is best handled by doing something during LoweringPrepare, I'm fine with that, and would welcome a chance to learn something new.

dkolsen-pgi avatar Oct 04 '24 06:10 dkolsen-pgi

I strive to find the right balance between theoretical purity and practical usefulness.

Definitely – like I said, I'm still learning myself what the right point is, and I'm glad to have your change to learn from :)

My understanding is that LoweringPrepare is traditionally used to go from higher-level to lower-level ops. https://godbolt.org/z/xEf6WM4Gh is an example where LoweringPrepare converts the cir.dyn_cast op to lower-level operations. It has the AST available to it, so it can query for and use frontend information as needed.

smeenai avatar Oct 04 '24 18:10 smeenai

I implemented cir.base_class_addr, including for non-zero offsets, in PR #937, not knowing that this issue existed. I puzzled over some of the same questions, and changed my mind about four times on how to proceed when working on it.

Thanks for sharing your thoughts @dkolsen-pgi

If we want to keep the higher-level semantics of a base class conversion in CIR, then I suggest a new cir.virtual_base_class_addr operation, since a virtual base class is such a different thing than a regular base class.

+1

To get the correct offset during lowering, I modified cir.base_class_addr, but not in any of the ways that @smeenai suggested. I added a field to store the byte offset. The front end can easily compute the offset, since it still has the entire base class chain. Lowering only needs the offset, not the entire base class chain. So it seemed easiest to store just the byte offset, not the entire chain.

Sounds good to me!

I noticed you added a direct lowering for the op, whereas we were discussing implementing that in LoweringPrepare here (though that was also necessary because of the usage of AST information, which your design avoids). I'd still be curious to learn if there's reasons to prefer LoweringPrepare vs. direct lowering with your design.

+1 here, LoweringPrepare seems like a good place for this.

and we might want to treat all base class casts uniformly in some cases. That's a problem for later though.

One solution is to keep multiple operations and abstract the uniform part into an interface. Seems like the difference makes it worth to have different mnemonics.

The offset of a base class has already been computed by the front end. The offset shouldn't be thrown away when creating CIR only to have a later pass have to recompute it.

Agreed, whenever we have an analysis that would like to look closer to those access it might make sense to raise it more.

I am unfamiliar with the LoweringPrepare pass. I haven't touched it yet. If this is best handled by doing something during LoweringPrepare, I'm fine with that, and would welcome a chance to learn something new.

In LoweringPrepare we usuall decouple more higher level CIR down to more basic CIR (closer to LLVM). One example are global ctor/dtor initializers, they are lowered to actual global functions before we lower To LLVM. The idea is to handle a bunch of those operations there, so that their lowering can be trivial (because we already decomposed the pieces). So for example, if cir.base_class_addr is a pointer calculation, it should become a cir.ptr_diff or cir.ptr_stride, which LowerToLLVM already knows how to transform. (I missed @smeenai comment when I wrote this, but same spirit)

I'm going to look at your PR now and I'll add other comments.

bcardosolopes avatar Oct 05 '24 00:10 bcardosolopes

@Lancern, you mentioned handling the conversion between member pointers. At least for simple cases like https://godbolt.org/z/45oKh3584, the AST contains a derived-to-base cast before the member pointer dereference, so we should still end up in the cir.base_class_addr path. Did you have cases in mind where we'd bypass the cast and the dereference itself would need access to type conversion info?

@smeenai Sorry for the delayed response. Consider this case which casts a member pointer directly: https://godbolt.org/z/3eT59djsT

It's a bit different from cir.base_class_addr though: member pointer casts go from base to derived while cir.base_class_addr goes from derived to base. Besides, member pointer casts cannot go across virtual bases.

Lancern avatar Oct 09 '24 04:10 Lancern

I think we can close this out given that we have a working cir.base_class_addr now. We can open up other issues for virtual bases and member pointer casts when we get to those.

smeenai avatar Oct 14 '24 19:10 smeenai