Non matching mangled name for C functions using structs
Example:
module main;
struct Foo
{
int a;
}
extern (C) void bar(Foo);
module foo;
struct Foo
{
int a;
}
extern (C) void bar(Foo);
$ ldc2 main.d foo.d
main.d(8): Error: Function type does not match previously declared function with the same mangled name: bar
C functions doesn't have a mangling. I suspect the mangled name that would be used for a D function for the Foo struct is used.
This is a duplicate of long-standing https://github.com/ldc-developers/ldc/issues/1020.
A quick look at the IR shows the problem - 2 equivalent Foo structs in different D modules:
; ldc2 -output-ll main.d
%main.Foo = type { i32 }
declare void @bar(%main.Foo) #0
; ldc2 -output-ll foo.d
%foo.Foo = type { i32 }
declare void @bar(%foo.Foo) #0
This is ~~only~~ mostly a problem with (mostly default/enforced) -singleobj though (i.e., -c should work), where code from all D modules is emitted into a single IR module (=> object file), and we can only have one IR function for a particular (mangled) name. extern(D) functions always feature the module prefix in their mangle, so this only applies to other linkages.
We make sure the IR types match if there are multiple declarations (and only emit the first definition in that case). If we didn't, we'd have to adapt the call sites, e.g., repainting IR foo.Foo call args as main.Foo ones.
Wouldn't the proper way to do this be to not look at the actual name of the struct but instead its members? Basically building up a mangled/IR name combining the mangled names of the members, recursively.
That's definitely something I haven't thought of before, but only imagining the length of some of these names (need to take alignments into account as well, unions...) makes my head ache. I look at IR a lot, and having type names >1k isn't bearable IMO.
Yeah, the might be a problem. But as I mentioned in the other issue https://github.com/ldc-developers/ldc/issues/1020 it's easy to detect if a class/struct is extern(D) or some other linkage.
it's easy to detect if a class/struct is extern(D) or some other linkage.
That has just been added with 2.081 AFAIK. But as your example shows, people don't expect having to declare a struct extern(C[++]) just because it's used in some C[++] function.
We could keep a map of struct identifier (excl. module prefix) to IR types, per IR module. When trying to emit a new IR type, we could check if its IR layout is equivalent to an existing type with the same identifier, and use that existing one instead.
That has just been added with 2.081 AFAIK.
It's been present in ClassDeclaration for a long time, it was recently moved up to AggregateDeclaration to support structs as well. It has also been changed from several bools to one enum. It has existed as long as D has supported C++ classes. But there is no way to identify extern(C) struct because that doesn't have any other semantic differences compared to extern(D) struct.
But as your example shows, people don't expect having to declare a struct extern(C[++]) just because it's used in some C[++] function.
True.
We could keep a map of struct identifier (excl. module prefix) to IR types, per IR module. When trying to emit a new IR type, we could check if its IR layout is equivalent to an existing type with the same identifier, and use that existing one instead.
As long it works I'm happy.
But as your example shows, people don't expect having to declare a struct extern(C[++]) just because it's used in some C[++] function.
This only applies to C functions, right? For C++ functions, the type is part of the mangling: in the OP example when extern(C) is replaced with extern(C++) the two modules would declare two different extern(C++) bar functions, because the two Foos are different types.
I am wary of collapsing types that are different into the same thing; I don't know whether that may interfere badly with LTO+some_future_metadata_for_types+optimization.
Could we emit a struct literal type (instead of named type) for extern(C) functions? I.e. something like this:
declare void @bar({i32}) #0
C++ functions are affected too, the struct doesn't even need to be extern(C++) then, as the C++ mangle (_Z3bar3Foo in both cases) doesn't contain the D module (just extern(C++) namespaces, which can be defined across multiple D modules too).
Could we emit a struct literal type (instead of named type) for extern(C) functions?
Doesn't have the same problem as I suggested https://github.com/ldc-developers/ldc/issues/2782#issuecomment-406056677?
C++ functions are affected too, the struct doesn't even need to be extern(C++) then, as the C++ mangle (_Z3bar3Foo in both cases) doesn't contain the D module (just extern(C++) namespaces, which can be defined across multiple D modules too).
Wow indeed. That's a bad problem that will lead to miscompiles when templates are involved: consider two template instantiations in different compilation units but with different types, they will mangled the same although the types are different and just happen to have the same name, and linking will merge the symbols because that's needed for templates. This sounds like a bug to me: the mangling of C++ functions with D types should use the D type mangling.
Simple example that shows the bug:
static import core.thread;
struct Thread {
int a;
}
extern (C++) void bar(Thread*) {}
extern (C++) void bar(core.thread.Thread) {}
I don't necessarily see that as bug, just one more thing to consider for C++ interop. 'Fixing' it would break a ton of code I guess. Our error for mismatching IR signatures may help in preventing such bugs (mainly with -singleobj).
Anyway, wrt. to the original issue, I'm not sure a special handling (collapsing types etc.) pays off; after all, this is almost always triggered by duplicate equivalent struct definitions and function declarations, and code duplication is smelly. Extracting the duplicate C[++] types/functions into a dedicated D module solves the problem.
If we want to allow this (signatures incompatible on the D level), I'd probably go for implementing it using repainting (replaceAllUsesWith and casting the function pointer), as trying to regularize the type on the LLVM level and getting that right in all cases seems like a lot of complexity.
We could then have a D-type-system implementation of structural type comparison/whatever heuristics we want to come up with to decide whether to present the user with an error message or not.
[...] and code duplication is smelly.
Agreed, but what if two separate D libraries used in a large project wrap the same C/C++ code?
If compiled in separate invokations (and no LTO), there's no problem. - But yeah, there'll always be some legitimate corner cases (template instantiations...).
https://issues.dlang.org/show_bug.cgi?id=19101
replaceAllUsesWith and casting the function pointer
Yeah, repainting the function pointer instead of the args should be a lot simpler indeed.
I ran in this today as well:
/usr/include/dlang/ldc/core/stdc/stdio.d(1278,8): Error: Function type does not match previously declared function with the same mangled name: fwrite
/usr/include/dlang/ldc/core/stdc/stdio.d(1278,8): Previous IR type: i64 (i8*, i64, i64, %libxlsxd.xlsxwrap._IO_FILE*)
/usr/include/dlang/ldc/core/stdc/stdio.d(1278,8): New IR type: i64 (i8*, i64, i64, %core.stdc.stdio._IO_FILE*)
this breaks xlsxd builds. this is with ldc 1.15.0
the structs are different though.
the structs are different though
So xlsxd is duplicating the _IO_FILE struct definition and fwrite() prototype from core.stdc.stdio? Why? ;) - If it was using the druntime struct, the problem would go away.
because I use dpp to include the headers of a c library called libxlsxwriter which in turn pulls in stdio.h which pulls in _IO_FILE.
another problem I see is that the _IO_FILE in druntine is out-of-date with the one in glibc. It was last updated in 2008. I have a way to hack around that, but its another manual step I would like to avoid.
Ah, yeah dpp doesn't work (at all?) with LDC due to its architecture and this issue - it causes lots of duplicate aggregate definitions (and so init symbol + TypeInfos bloat etc.), Atila knows about it.
luckily for me, for xlsxd its only _IO_FILE. Commenting that out makes it compile and pass the tests.
Has then been any idea/progress been made on this issue in general?
No progress, but there are some ideas, mainly in this issue here (see https://github.com/ldc-developers/ldc/issues/3021#issuecomment-471551875 for links to the other duplicate issues). It's not trivial, and the most promising idea (casting the function pointer after verifying that the signatures are basically equivalent, i.e., that the used aggregates are 'equivalent', although distinct separate types) wouldn't work in your case, as your _IO_FILE diverges from druntime's.
Has then been any idea/progress been made on this issue in general?
Yes, use DStep 😉
Why not have a sort of @TolerantMangling flag in the meantime so we can get around library conflicts
Sure, we can implement that as an attribute. I wonder whether that's worth it, though, or whether we should just allow it outright for simplicity (possibly with a warning hidden behind a command-line flag).
As I've mentioned in https://github.com/ldc-developers/ldc/issues/3318#issuecomment-584382009 it would be important to be able to override wrong declarations as well for legacy purposes.
+1 for the optional warning without extra UDA. Casting the function pointer should implicitly allow slightly wrong declarations, such as int/long for x64 targets if passed in a register.
I would rather not have warnings that can't be removed if my code is superior to the one it's overriding
Those warnings would be almost exclusively caused by duplicate struct declarations in multiple analyzed modules, so any 'superiority' would come with TypeInfos bloat (at best, just more stripping work for the linker). Settling for exactly one central D binding, either in druntime or some dub package, for each C(++) library would prevent this issue (and enforce DRY), but that's probably utopic.
Exactly, but the issue with type bloat is doomed to stay because of workarounds on top of workarounds due to compatibility reasons. If we had the perfect solution from the start it would've helped