How should cir.base_class_addr work
Recently I've been trying to implement CIRGenFunction::getAddressOfBaseClass.

https://github.com/llvm/clangir/blob/38e962d9fc72d626ec43f4006011f85d5f14dc02/clang/lib/CIR/CodeGen/CIRGenClass.cpp#L1353-L1358

Given the address of a derived class object, this function generates the CIR operations necessary to convert it into the address of a base class subobject. I noticed that there is a CIR operation exactly for this purpose: cir.base_class_addr. However:
- The current implementation does not seem to generate cir.base_class_addr except when the base class subobject is at the beginning of the derived class object. In other cases, it emits a series of pointer operations (currently unimplemented) on the derived class object address and then emits a bitcast.
- cir.StructType does not encode any inheritance information, so cir.base_class_addr would not know how to do the conversion.
Based on the points above, what is the expected approach to perform derived-to-base casts? Should we make cir.base_class_addr work and replace those pointer adjustment operations with a single cir.base_class_addr operation?
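For concreteness, here is a minimal C++ case of my own (not from the code base) where the base class subobject is not at offset zero, so a plain bitcast is not enough:

```cpp
// Hypothetical classes; the exact offsets are ABI-dependent.
struct A { long a; };
struct B { long b; };
struct D : A, B { long d; };   // the B subobject starts after A, at a non-zero offset

B *toB(D *d) {
  // Derived-to-base conversion: the result must point at the B subobject
  // inside *d, i.e. the incoming address adjusted by B's byte offset.
  return d;   // implicit D* -> B* conversion
}
```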
- The current implementation does not seem to generate cir.base_class_addr except when the base class subobject is at the beginning of the derived class object.

Looks like we only have the simplest case implemented.
In other cases, it emits a series of pointer operations (currently unimplemented) on the derived class object address and then emits a bitcast.

Ideally we should make this emit cir.base_class_addr and, during LoweringPrepare, lower it down to what we currently emit.
- cir.StructType does not encode any inheritance information, so cir.base_class_addr would not know how to do the conversion.

This can be changed or extended either by changing the struct type or by using more operations to tie up the inheritance information. Changing the struct type sounds like the more natural approach, but the actual work on fixing this issue will tell us better how to model this (if things change).
Based on the points above, what is the expected approach to perform derived-to-base casts? Should we make cir.base_class_addr work and replace those pointer adjustment operations with a single cir.base_class_addr operation?

Yep.
@smeenai might be resuming work on this for us. This is the roadmap I suggest:
- Lower the currently existing cir.base_class_addr in LoweringPrepare. If we find out we are missing information and need to change the operation, this is part of the game.
- Get the existing series of pointer operations to use cir.base_class_addr instead, and emit the currently existing code in LoweringPrepare. This might require improving the struct type and/or augmenting the definition of cir.base_class_addr.
- Whatever works comes next.

One quick note is that we can always use the AST during LoweringPrepare, so there's a balance between adding too much information to an operation versus reusing information directly from the AST, depending on the design choices.
I'm noting down what I've researched here so far. This is probably old news for everyone else, I'm just writing it down for my own benefit.
Laying out records (classes, structs, and unions) is initially handled by RecordLayoutBuilder, which handles the ABI layout rules. This generates an ASTRecordLayout, which is then used by CGRecordLayoutBuilder or CIRRecordLayoutBuilder to convert the record layout into an LLVM or ClangIR struct (respectively). ClangIR structs model LLVM structs quite closely right now, which is convenient for lowering but might not be convenient for high-level operations. That's hypothetical until we know what sorts of high-level operations might be useful though; one that comes to mind is converting arrays of structs to structs of arrays, similar to Zig, although that's being attempted at the LLVM level anyway, and I haven't thought at all about what representational changes (if any) would be useful for it.
It's not directly relevant to this issue, but vtables are similarly laid out as arrays of function pointers instead of a higher-level type. That was discussed in https://github.com/llvm/clangir/pull/569#discussion_r1581866568, but I don't think anything has changed on that front yet.
For cast expressions, Sema builds up a path of the base classes involved in the cast; for a derived-to-base cast, the path starts from the derived class (but excludes the derived class itself) and ends at (and includes) the base class. (Code references: Sema::CheckDerivedToBaseConversion, CXXRecordDecl::isDerivedFrom, CXXBasePaths::lookupInBases.) From reading the code, I'm pretty sure the order is derived-to-base even for a base-to-derived cast, but I haven't verified that empirically. If there are any virtual bases involved, the path starts at the virtual base closest to the base class (which is possibly the base class itself), and CodeGen relies on this.
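A tiny made-up hierarchy to illustrate my reading of that path construction:

```cpp
// Hypothetical hierarchy; the class names are mine, not from the Clang sources.
struct A {};
struct B : A {};
struct C : B {};

A *toA(C *c) {
  // For this derived-to-base cast, my reading is that the recorded path is
  // [B, A]: it starts after the derived class C itself and ends at (and
  // includes) the base class A.
  return static_cast<A *>(c);
}
```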
The CodeGen part is handled by CodeGenFunction::GetAddressOfBaseClass, which walks the cast path to determine and apply the virtual and non-virtual offsets. Note that the virtual offset has to be looked up at runtime via the vtable (or vbtable in the Microsoft ABI) except for final classes, so our choice of CIR struct representation doesn't affect anything on that front. The virtual offset lookup uses types in the AST (VTableContextBase and friends) to find the appropriate offsets.
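And a small made-up example of why the virtual offset has to be a runtime lookup:

```cpp
struct V { long v; };
struct Mid : virtual V { long m; };
struct MostDerived : Mid { long d; };

V *toV(Mid *m) {
  // The byte offset from m to its V subobject differs depending on whether *m
  // is a complete Mid or the Mid subobject of a MostDerived, so the offset is
  // loaded at run time through the vtable (Itanium) or vbtable (Microsoft).
  return static_cast<V *>(m);
}
```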
Everything so far can be handled pretty straightforwardly by using the AST node attached to a cir.struct, and given that it's mostly dealing with information computed by the AST, I think it's the right way to go (vs. duplicating that information). LoweringPrepare has its own LoweringPrepareItaniumCXXABI class, so we'd have to implement the virtual offset lookup logic there, but I think it should be doable.
The non-virtual offset application is a bit more interesting. CodeGen walks the full cast path to determine the offsets to apply at each step, whereas cir.base_class_addr is only recording the cast destination (https://godbolt.org/z/Pazffnzve). I can see a few ways to make that work; what do people think is the best option here?
- Have cir.base_class_addr record the full cast path instead of just the destination.
- Have cir.base_class_addr store an AST reference to the CastExpr so it can use it to access the cast path.
- Generate multiple cir.base_class_addr ops, one for each step in the cast path.
- Record some additional information in cir.struct.

Note that ambiguous cases like https://godbolt.org/z/od97ahWEz need additional disambiguating casts anyway, so I don't think we need to worry about those. There's a Microsoft extension to access direct ambiguous base classes (https://godbolt.org/z/Pnex5f8MT), so it should be sufficient to record the direct base class offset (if any) for any ambiguous base classes.
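For reference, a minimal ambiguity case in the spirit of those Godbolt links (my own reconstruction, not the exact snippets behind them):

```cpp
struct A { int a; };
struct B : A {};
struct C : A {};
struct D : B, C {};   // D has two A subobjects

A *toA(D *d) {
  // static_cast<A *>(d) would be ill-formed: A is an ambiguous base of D.
  // The user has to disambiguate with an intermediate cast, so each
  // derived-to-base step still has an unambiguous destination.
  return static_cast<A *>(static_cast<B *>(d));
}
```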
My inclination would be changing cir.base_class_addr here (I don't have a strong preference between the three possibilities), but I'm open to changing cir.struct too (and of course any other options I haven't thought of). Handling virtual base classes will require some modification to cir.base_class_addr regardless, since we need to record the virtual base in some way, unless we lower that part in CIRGen directly instead of as part of the base_class_addr op.
One last thought is that we won't have access to the AST during actual lowering, so relying on the AST (if we go that route) necessitates handling this op before lowering. LoweringPrepare runs before flattening, so we'd be losing high-level information in flat CIR. We're already doing similar lowering in LoweringPrepare though (e.g. for dynamic_cast), so I imagine that's not a concern here either.
As a side note to this thread, cir.base_class_addr is not the only operation interested in the conversion path. Another operation interested in this information is the conversion between member pointers, where the offset of the base class subobject is also essential to completing the operation. So any approach proposed here may also apply to that operation.
The first step I'm working on here (as suggested by Bruno) is lowering the existing case where cir.base_class_addr is emitted, i.e. when it's a zero-offset type cast. We could just lower it to a pure type cast for now (knowing that it'll only be emitted for that case), but that feels like a bit of a cop-out. Doing any sort of actual lowering would require having access to the full cast path, which I could think of doing in the following ways (from my previous comment):
- Have cir.base_class_addr record the full cast path instead of just the destination.
- Have cir.base_class_addr store an AST reference to the CastExpr so it can use it to access the cast path.
- Generate multiple cir.base_class_addr ops, one for each step in the cast path.
Alternatively, we could record both direct and transitive base offset information in cir.struct itself, but Clang in general seems to prefer to keep only information about direct members and bases and walk the inheritance chain to get information from parent classes (which makes sense IMO, since storing transitive information duplicates things and can blow up in degenerate cases).
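To make the direct-versus-transitive point concrete, here's a small illustration of my own (the classes and offsets are made up; the exact values are ABI-dependent):

```cpp
#include <cstdio>

struct A { long a; };
struct B { long b; };
struct C : A, B { long c; };   // B sits at a non-zero offset inside C
struct D : C { long d; };

int main() {
  D obj;
  auto off = [](const void *outer, const void *inner) {
    return (long)((const char *)inner - (const char *)outer);
  };
  // With only direct-base offsets recorded, the offset of the indirect base B
  // inside D is recovered by walking the chain and summing per-level offsets:
  // offset(B in D) == offset(C in D) + offset(B in C).
  long bInD = off(&obj, static_cast<B *>(&obj));
  long cInD = off(&obj, static_cast<C *>(&obj));
  long bInC = off(static_cast<C *>(&obj), static_cast<B *>(&obj));
  std::printf("%ld == %ld + %ld\n", bInD, cInD, bInC);
}
```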
@Lancern, you mentioned handling the conversion between member pointers. At least for simple cases like https://godbolt.org/z/45oKh3584, the AST contains a derived-to-base cast before the member pointer dereference, so we should still end up in the cir.base_class_addr path. Did you have cases in mind where we'd bypass the cast and the dereference itself would need access to type conversion info?
@bcardosolopes, @Lancern, what do you think of the three alternatives above, or do you have suggestions for something better?
@smeenai, Re first message:
Everything so far can be handled pretty straightforwardly by using the AST node attached to a cir.struct, and given that it's mostly dealing with information computed by the AST, I think it's the right way to go (vs. duplicating that information). LoweringPrepare has its own LoweringPrepareItaniumCXXABI class, so we'd have to implement the virtual offset lookup logic there, but I think it should be doable.
Sounds good to me!
My inclination would be changing cir.base_class_addr here (I don't have a strong preference between the three possibilities), but I'm open to changing cir.struct too (and of course any other options I haven't thought of).
I also believe that we can pull this off without changing the struct type; that would be best, since encoding this on a very common type would have a memory cost.
Handling virtual base classes will require some modification to cir.base_class_addr regardless, since we need to record the virtual base in some way, unless we lower that part in CIRGen directly instead of as part of the base_class_addr op.
Sounds good to me too.
so we'd be losing high-level information in flat CIR
Yep, but that's on purpose.
@smeenai, Re second message:
- Have cir.base_class_addr record the full cast path instead of just the destination.

I like this one better. Already helps to decouple a bit from the AST. Still a question whether this is enough info for lowering later or if we'll end up having to encode more, or postpone more.
- Have cir.base_class_addr store an AST reference to the CastExpr so it can use it to access the cast path.

Depending on the CastExpr content it might be harder to tackle this post-CIRGen, but if feasible it's also a good route.
- Generate multiple cir.base_class_addr ops, one for each step in the cast path.

Probably having one operation with the full path might be better if the intermediate results are not used.
I implemented cir.base_class_addr, including for non-zero offsets, in PR #937, not knowing that this issue existed. I puzzled over some of the same questions, and changed my mind about four times on how to proceed when working on it. Here is the approach I finally landed on:
Conversions to virtual base classes are not implemented in that PR. But when they are implemented, I don't think they should use cir.base_class_addr. I am fine with having CIR code gen directly create the loads and ptr_strides necessary to get the correct address from the virtual table. If we want to keep the higher-level semantics of a base class conversion in CIR, then I suggest a new cir.virtual_base_class_addr operation, since a virtual base class is such a different thing from a regular base class.
To get the correct offset during lowering, I modified cir.base_class_addr, but not in any of the ways that @smeenai suggested. I added a field to store the byte offset. The front end can easily compute the offset, since it still has the entire base class chain. Lowering only needs the offset, not the entire base class chain. So it seemed easiest to store just the byte offset, not the entire chain.
cir.base_class_addr also stores whether or not a null-pointer check is necessary. The null-pointer check seems like an implementation detail, best done late in the lowering process. But whether a null-pointer check is even necessary depends a lot on context, so only the front end knows whether it is needed. So it is easiest to store that in the operation.
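To illustrate that context dependence, a sketch of my own (not code from the PR):

```cpp
struct A { long a; };
struct B { long b; };
struct D : A, B {};

B *fromPtr(D *d) {
  // d may be null, and a null D* must convert to a null B*, so the
  // byte-offset adjustment has to be guarded by a null-pointer check.
  return static_cast<B *>(d);
}

B &fromRef(D &d) {
  // A reference can be assumed non-null, so no check is needed here;
  // only the front end has this context.
  return d;
}
```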
Within the compiler-generated part of a constructor or destructor, I changed CIR code gen to use a cir.base_class_addr instead of an incorrect cir.ptr_stride. There might be other places that use a cir.ptr_stride that should also be changed to use cir.base_class_addr, but I didn't go looking for those.
Yeah, the design point of splitting responsibilities between CIRGen and Lowering hadn't occurred to me :) A few thoughts:
- I'm a little torn on putting the byte offset in the base_class_addr op. On the one hand, it's pragmatic and eases lowering. On the other hand, it feels like a pretty low-level detail to be putting into an op, and it opens up the possibility of the offset argument not matching the base class argument (though that'd be a clear code generation bug). I guess it depends on what level we want to keep CIR ops at, and I'm still learning about that myself, so I'd love to hear other people's thoughts here.
- The byte offset also threw me off initially because I thought we usually did struct getelementptrs in terms of their fields instead of their bytes, but it turns out regular CodeGen uses fields for field access and bytes for parent class access (https://godbolt.org/z/1bTTKM4Wb), so we're actually matching it there. There's a plan to remove getelementptr anyway, so it's ultimately moot.
- I noticed you added a direct lowering for the op, whereas we were discussing implementing that in LoweringPrepare here (though that was also necessary because of the usage of AST information, which your design avoids). I'd still be curious to learn if there are reasons to prefer LoweringPrepare vs. direct lowering with your design.
- Though we aren't concerned about it in this issue, I'm similarly interested in what people think about a separate virtual_base_class_addr op vs. reusing base_class_addr (perhaps with an attribute to distinguish virtual vs. non-virtual lookups). Like you said, a virtual base class is a pretty different operation (in particular it involves a runtime calculation instead of a compile-time one), and having that reflected in the IR might be useful. At the same time, it's still a base class cast, and we might want to treat all base class casts uniformly in some cases. That's a problem for later though.
I strive to find the right balance between theoretical purity and practical usefulness. I think storing the offset in the base_class_addr op strikes that balance. The front end already needs to know about class layouts so it can implement sizeof, alignof, and __builtin_offsetof as compile-time values. The offset of a base class has already been computed by the front end. The offset shouldn't be thrown away when creating CIR only to have a later pass recompute it. The offset is a small amount of information that we know will be needed later and that is non-trivial to compute. As a practical matter, that's the kind of thing you want to cache.
I am unfamiliar with the LoweringPrepare pass. I haven't touched it yet. If this is best handled by doing something during LoweringPrepare, I'm fine with that, and would welcome a chance to learn something new.
I strive to find the right balance between theoretical purity and practical usefulness.
Definitely – like I said, I'm still learning myself what the right point is, and I'm glad to have your change to learn from :)
My understanding is that LoweringPrepare is traditionally used to go from higher-level to lower-level ops. https://godbolt.org/z/xEf6WM4Gh is an example where LoweringPrepare converts the cir.dyn_cast op to lower-level operations. It has the AST available to it, so it can query for and use frontend information as needed.
I implemented cir.base_class_addr, including for non-zero offsets, in PR #937, not knowing that this issue existed. I puzzled over some of the same questions, and changed my mind about four times on how to proceed when working on it.
Thanks for sharing your thoughts @dkolsen-pgi
If we want to keep the higher-level semantics of a base class conversion in CIR, then I suggest a new cir.virtual_base_class_addr operation, since a virtual base class is such a different thing from a regular base class.
+1
To get the correct offset during lowering, I modified cir.base_class_addr, but not in any of the ways that @smeenai suggested. I added a field to store the byte offset. The front end can easily compute the offset, since it still has the entire base class chain. Lowering only needs the offset, not the entire base class chain. So it seemed easiest to store just the byte offset, not the entire chain.
Sounds good to me!
I noticed you added a direct lowering for the op, whereas we were discussing implementing that in LoweringPrepare here (though that was also necessary because of the usage of AST information, which your design avoids). I'd still be curious to learn if there are reasons to prefer LoweringPrepare vs. direct lowering with your design.
+1 here, LoweringPrepare seems like a good place for this.
and we might want to treat all base class casts uniformly in some cases. That's a problem for later though.
One solution is to keep multiple operations and abstract the uniform part into an interface. Seems like the difference makes it worth having different mnemonics.
The offset of a base class has already been computed by the front end. The offset shouldn't be thrown away when creating CIR only to have a later pass recompute it.
Agreed; whenever we have an analysis that would like to look more closely at those accesses, it might make sense to raise it more.
I am unfamiliar with the LoweringPrepare pass. I haven't touched it yet. If this is best handled by doing something during LoweringPrepare, I'm fine with that, and would welcome a chance to learn something new.
In LoweringPrepare we usually decompose higher-level CIR into more basic CIR (closer to LLVM). One example is global ctor/dtor initializers: they are lowered to actual global functions before we lower to LLVM. The idea is to handle a bunch of those operations there, so that their lowering can be trivial (because we already decomposed the pieces). So for example, if cir.base_class_addr is a pointer calculation, it should become a cir.ptr_diff or cir.ptr_stride, which LowerToLLVM already knows how to transform. (I missed @smeenai's comment when I wrote this, but same spirit)
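For intuition, here's a rough C++-level sketch of what that decomposed form computes, assuming cir.base_class_addr carries a byte offset as discussed above; this mirrors the ptr_stride-style adjustment, not actual CIR or LLVM output:

```cpp
struct A { long a; };
struct B { long b; };
struct D : A, B {};

// 'offset' stands in for the byte offset of the B subobject within D that the
// front end already knows; the null-pointer check is omitted for brevity.
B *loweredBaseClassAddr(D *d, long offset) {
  return reinterpret_cast<B *>(reinterpret_cast<char *>(d) + offset);
}
```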
I'm going to look at your PR now and I'll add other comments.
@Lancern, you mentioned handling the conversion between member pointers. At least for simple cases like https://godbolt.org/z/45oKh3584, the AST contains a derived-to-base cast before the member pointer dereference, so we should still end up in the cir.base_class_addr path. Did you have cases in mind where we'd bypass the cast and the dereference itself would need access to type conversion info?
@smeenai Sorry for the delayed response. Consider this case which casts a member pointer directly: https://godbolt.org/z/3eT59djsT
It's a bit different from cir.base_class_addr though: member pointer casts go from base to derived, while cir.base_class_addr goes from derived to base. Besides, member pointer casts cannot go across virtual bases.
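For reference, a minimal member-pointer conversion in that base-to-derived direction (my own reconstruction, in the spirit of the Godbolt link above):

```cpp
struct Base { int x; };
struct Pad { int p; };
struct Derived : Pad, Base {};   // Base sits at a non-zero offset in Derived

int Derived::*convert(int Base::*pm) {
  // The implicit int Base::* -> int Derived::* conversion has to add the byte
  // offset of the Base subobject within Derived to the stored member offset,
  // which is the same layout information a derived-to-base address cast needs.
  return pm;
}
```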
I think we can close this out given that we have a working cir.base_class_addr now. We can open up other issues for virtual bases and member pointer casts when we get to those.