
typeid / type_info equality check fails for clang/libc++ when using VSG in dynamic libraries

Open martinweber opened this issue 11 months ago • 22 comments

Issue found

I am posting this here to discuss the following behavior I found:

I have been running into issues when using VSG in a dynamic library, where the cast<> function on vsg::Object returned nullptr even when the type was correct. This is caused by how clang/libc++ generates type_info.hash_code() and the related type_info comparison operator.

For example, in a vsg::Visitor (note: this also happens in another place where we use vsg::Object::cast()):

    void apply(vsg::Object& obj) override
    {
        auto* matrix_node = obj.cast<vsg::MatrixTransform>();
        // ...
    }

This always returned a nullptr even when the object was of type vsg::MatrixTransform. Using dynamic_cast<vsg::MatrixTransform*>(&obj) instead returned a pointer to the vsg::MatrixTransform.
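To make the two code paths easier to compare, here is a minimal sketch with simplified stand-ins for vsg::Object and vsg::Inherit. Everything in namespace demo is hypothetical and only mirrors the general shape of a typeid-based cast; it is not VSG's actual code. Inside a single binary both paths agree. The failure described in this issue only shows up when the two type_info objects come from different dynamic libraries.

```cpp
#include <typeinfo>

namespace demo {

// Simplified stand-in for vsg::Object (illustration only).
struct Object
{
    virtual ~Object() = default;

    virtual bool is_compatible(const std::type_info& type) const noexcept
    {
        return typeid(Object) == type;
    }

    // typeid-based cast: the result depends on std::type_info::operator==.
    template<class T>
    T* cast() noexcept
    {
        return is_compatible(typeid(T)) ? static_cast<T*>(this) : nullptr;
    }
};

// Simplified stand-in for vsg::Inherit (illustration only).
template<class ParentClass, class Subclass>
struct Inherit : ParentClass
{
    bool is_compatible(const std::type_info& type) const noexcept override
    {
        // Walks up the hierarchy until one type_info compares equal.
        return typeid(Subclass) == type || ParentClass::is_compatible(type);
    }
};

struct Node : Inherit<Object, Node> {};
struct MatrixTransform : Inherit<Node, MatrixTransform> {};

// Workaround path: dynamic_cast instead of the typeid comparison.
inline MatrixTransform* as_matrix_transform(Object& obj) noexcept
{
    return dynamic_cast<MatrixTransform*>(&obj);
}

} // namespace demo
```

In the real visitor the workaround is simply replacing obj.cast<vsg::MatrixTransform>() with dynamic_cast<vsg::MatrixTransform*>(&obj).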

I then logged the values for std::type_info in vsg::Inherit::is_compatible():

Subclass: N3vsg15MatrixTransformE, 4737813731 ? is_compatible: N3vsg15MatrixTransformE, 4577409791
Subclass: N3vsg9TransformE, 4737813807 ? is_compatible: N3vsg15MatrixTransformE, 4577409791
Subclass: N3vsg5GroupE, 4737813148 ? is_compatible: N3vsg15MatrixTransformE, 4577409791
Subclass: N3vsg4NodeE, 4737813255 ? is_compatible: N3vsg15MatrixTransformE, 4577409791
type_info object         : 4737813731  // at callsite of vsg::Object::cast()
type_info MatrixTransform: 4737813731 // at callsite of vsg::Object::cast()

(Note: the type after "? is_compatible:" is from the type parameter.)

  • The type_info.name() returned the same value (N3vsg15MatrixTransformE)
  • The type_info.hash_code() returned different values even though the type (type_info.name()) was identical
  • The type_info comparison operator returns false for matching types

The type_info.hash_code() values are identical at the call site, but differ inside the type's is_compatible() function. The call site is in a different dynamic library than VSG; VSG itself is linked as a static library into another dynamic library.
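For anyone who wants to reproduce the comparison, a small diagnostic along these lines could be called from both sides of the library boundary. The helper names (describe_type, compare_types, TypeComparison) are made up for this sketch and are not part of VSG; only the std::type_info members used are standard.

```cpp
#include <sstream>
#include <string>
#include <typeinfo>

namespace demo {

// Hypothetical helper: formats the parts of a std::type_info that matter here.
inline std::string describe_type(const std::type_info& ti)
{
    std::ostringstream out;
    out << ti.name()
        << ", hash_code=" << ti.hash_code()
        << ", address=" << static_cast<const void*>(&ti);
    return out.str();
}

// Hypothetical result type: one flag per way of comparing two type_info objects.
struct TypeComparison
{
    bool equal_op;  // std::type_info::operator==
    bool same_hash; // hash_code() values match
    bool same_name; // name() strings match
};

inline TypeComparison compare_types(const std::type_info& a, const std::type_info& b)
{
    return {a == b,
            a.hash_code() == b.hash_code(),
            std::string(a.name()) == b.name()};
}

} // namespace demo
```

In the failing case reported above, I would expect same_name to be true while equal_op and same_hash are false.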

This seems to be an issue with clang/libc++ when using dynamic libraries. I have found discussions about this here and here.

The issue seems to be present when dynamic libraries are loaded using RTLD_LOCAL. Symbol tables are then local to each library, so the type_info.hash_code() for the same type differs between libraries. The comparison operator on std::type_info also returns false in this case.

Environment

  • macOS 13.4.1 (c) (Ventura)
  • CPU: Apple M1 Pro
  • Apple clang version 14.0.3 (clang-1403.0.22.14.1)

The same code works correctly on Windows with MSVC!

Possible fixes?

  1. Using a strcmp() with type_info.name()? The type_info.name() is working correctly in this case. This is the solution pybind was going for. It requires a strcmp(), which is computationally much more expensive than the current code, especially since is_compatible() is called recursively for parent types whenever the types differ.

  2. Implement a type_hash<> template, similar to type_name<> found in type_name.h, that is guaranteed to return an identical value for identical types?

  3. something else?
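To make option 2 more concrete, here is a rough sketch of what a type_hash<> could look like. It assumes each type is registered with a string literal through a macro and hashes that literal at compile time with 64-bit FNV-1a. The names type_hash, fnv1a and DEMO_TYPE_HASH are hypothetical and nothing here exists in VSG today. Because the value is derived from the literal rather than from a type_info object, every library that compiles the same registration would get the same number. Hash collisions are possible in principle, and every type has to be registered by hand.

```cpp
#include <cstdint>

namespace demo {

// 64-bit FNV-1a over a null-terminated string, usable at compile time (C++14 or later).
constexpr std::uint64_t fnv1a(const char* s) noexcept
{
    std::uint64_t h = 14695981039346656037ull; // FNV offset basis
    while (*s != '\0')
    {
        h ^= static_cast<unsigned char>(*s++);
        h *= 1099511628211ull; // FNV prime
    }
    return h;
}

// Declared but not defined: using an unregistered type fails at build time.
template<typename T>
constexpr std::uint64_t type_hash() noexcept;

// Hypothetical registration macro: hashes the spelled-out type name.
#define DEMO_TYPE_HASH(T) \
    template<> constexpr std::uint64_t type_hash<T>() noexcept { return fnv1a(#T); }

// Stand-in types for the sketch.
struct Node {};
struct MatrixTransform {};

DEMO_TYPE_HASH(demo::Node)
DEMO_TYPE_HASH(demo::MatrixTransform)

} // namespace demo
```

A compatibility check could then compare demo::type_hash<T>() values, which are plain integers, instead of std::type_info objects.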

Conclusion

We already have two known places where this breaks our application on macOS (and possibly Linux). For now, using dynamic_cast<> instead of vsg::Object::cast() is a working alternative. Comparing type_name() values would also work.
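The name-based comparison (option 1 above) could be sketched as follows. The helper same_type is hypothetical. It tries operator== first, so the strcmp() cost is only paid when the fast path fails. One caveat: the standard does not guarantee that name() is unique per type, so this relies on the mangled names being distinct in practice.

```cpp
#include <cstring>
#include <typeinfo>

namespace demo {

// Hypothetical helper: fast path via operator==, fallback via the name() strings.
inline bool same_type(const std::type_info& a, const std::type_info& b) noexcept
{
    return a == b || std::strcmp(a.name(), b.name()) == 0;
}

} // namespace demo
```

Note that when the types really are different, both comparisons run at every level of the recursive is_compatible() walk, which is the cost concern raised in option 1.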

I fear that this behavior of clang/libc++ will cause more issues though.

Thanks!

martinweber · Aug 04 '23 11:08