opentelemetry-cpp
opentelemetry-cpp copied to clipboard
Assertion `false && "i == variant_npos"' failed with -DWITH_STL=ON
Describe your environment
OS: CentOS Stream release 8
Compiler: gcc (GCC) 10.3.1 20210422 (Red Hat 10.3.1-1)
OpenTelemetry libs compiled today (2/9/2022) from main branch using following command:
cmake -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DBUILD_SHARED_LIBS=ON -DWITH_JAEGER=ON -DWITH_STL=ON ..
Steps to reproduce Use this code:
#include <opentelemetry/exporters/jaeger/jaeger_exporter.h>
#include <opentelemetry/sdk/trace/simple_processor.h>
#include <opentelemetry/sdk/trace/tracer_provider.h>
#include <opentelemetry/trace/provider.h>
namespace sdktrace = opentelemetry::sdk::trace;
namespace nostd = opentelemetry::nostd;
namespace trace = opentelemetry::trace;
int main()
{
// Jaeger HTTP exporter
opentelemetry::exporter::jaeger::JaegerExporterOptions opts;
opts.transport_format = opentelemetry::exporter::jaeger::TransportFormat::kThriftHttp;
opts.endpoint = "localhost";
opts.server_port = 6831;
opts.headers = {{}}; // optional headers
auto jaeger_udp_exporter =
std::unique_ptr<sdktrace::SpanExporter>(new opentelemetry::exporter::jaeger::JaegerExporter(opts));
auto resource_attributes = opentelemetry::sdk::resource::ResourceAttributes
{
{"service.name", "test"},
{"service.instance.id", "test-123"}
};
auto resource = opentelemetry::sdk::resource::Resource::Create(resource_attributes);
return 0;
}
Compile using this command:
g++ -o test test.cc -O0 -g -std=c++11 -Wall -Wextra -Werror -Wno-unused-parameter -I/build/otlp/debug-stl/include -L/build/otlp/debug-stl/lib64 -Wl,-rpath,/build/otlp/debug-stl/lib64 -lopentelemetry_exporter_jaeger_trace -lopentelemetry_resources -lopentelemetry_trace
What is the expected behavior? No crash when app is started
What is the actual behavior? App crashes with following message:
test: /build/otlp/debug-stl/include/opentelemetry/nostd/./internal/absl/types/../types/internal/variant.h:426: absl::otel_v1::variant_internal::VisitIndicesSwitch<EndIndex>::Run<absl::otel_v1::variant_internal::VariantStateBaseDestructorNontrivial<bool, int, unsigned int, long int, double, std::__cxx11::basic_string<char, std::char_traits
Additional context
valgrind --num-callers=50 --track-origins=yes ./test
shows that this is caused by uninitialized heap memory access. Here are few bottom callstack frames for this allocation:
==19559== by 0x4F105F3: opentelemetry::v1::sdk::resource::Resource::Merge(opentelemetry::v1::sdk::resource::Resource const&) (in /build/otlp/debug-stl/lib64/libopentelemetry_resources.so)
==19559== by 0x4F106E8: opentelemetry::v1::sdk::resource::Resource::Create(opentelemetry::v1::sdk::common::AttributeMap const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (in /build/otlp/debug-stl/lib64/libopentelemetry_resources.so)
==19559== by 0x41E6BA: main (test.cc:26)
App does not crash when Jaeger exporter is not created, when only resource
is created - this is somehow related to this exporter.
Libraries compiled with -DWITH_STL=OFF
works fine.
-DWITH_STL=ON
cmake option will set the HAVE_CPP_STDLIB
pre-processor macro. You need to set this macro explicitly while compiling your test code using g++, something like below:
g++ -DHAVE_CPP_STDLIB -o test test.cc -O0 -g -std=c++11 -Wall -Wextra -Werror -Wno-unused-parameter -I/build/otlp/debug-stl/include -L/build/otlp/debug-stl/lib64 -Wl,-rpath,/build/otlp/debug-stl/lib64 -lopentelemetry_exporter_jaeger_trace -lopentelemetry_resources -lopentelemetry_trace
If you are still getting the error, please refer to the exact g++ command invoked for examples/jaeger
through cmake, and give similar arguments in the compilation of your example code.
Thanks, this fixed my problem. Libraries compiled from sources usually installs their config.h
file which stores all defines like this one, so I assumed that I do not have to add it manually. It would be good if you could follow this approach too.
BTW, I also had to switch to C++17 - with C++11 I got tons of compilation errors after I added -DHAVE_CPP_STDLIB
.
BTW, I also had to switch to C++17 - with C++11 I got tons of compilation errors after I added -DHAVE_CPP_STDLIB.
Yes, that's expected. It tries to use std::variant, std::span from the STL library if HAVE_CPP_STDLIB
is enabled, and so fails with C++11.
BTW, I also had to switch to C++17 - with C++11 I got tons of compilation errors after I added -DHAVE_CPP_STDLIB.
Yes, that's expected. It tries to use std::variant, std::span from the STL library if
HAVE_CPP_STDLIB
is enabled, and so fails with C++11.
My point here is that header file(s) could have some #if
at the beginning to check C++ version and #error
early.
**My point here is that header file(s) could have some #if at the beginning to check C++ version and #error early.
Agree, HAVE_CPP_STDLIB is the half-baked solution, not prioritized as it is not recommended to be used to maintain ABI compatibility. Also, there is a somewhat related issue #1071 to have more granular HAVE_CPP_STDLIB support ( tagged as help wanted
). Feel free to contribute to the change you suggested to fail early with proper error.
This issue was marked as stale due to lack of activity. It will be closed in 7 days if no furthur activity occurs.
We use cmake CONFIG file and bazel dependency to auto declaration macros now. We can use configure_file to generate header file which include all macros during configure time.But I'm not sure if bazel has similar function.
This issue was marked as stale due to lack of activity.
I'm having the same assert from ABSL in our environment with OpenTelemetry 1.8.3, build on x64 windows with VS 2019 (16.11.21), In our case the error came when after the first async counter was read. In sdk\src\metrics\state\temporal_metric_storage.cc:138
there is this lambda:
result_to_export->GetAllEnteries(
[&metric_data](const MetricAttributes &attributes, Aggregation &aggregation) {
PointDataAttributes point_data_attr;
point_data_attr.point_data = aggregation.ToPoint();
point_data_attr.attributes = attributes;
metric_data.point_data_attr_.emplace_back(std::move(point_data_attr));
return true;
});
In our case when the point_data_attr
is created, the point_data_attr.point_data
variant has an incorrect index value. ( 0xcccccccccccccc00
)
Due to this invalid value, when the read value was moved into the point_data_attr.point_data = aggregation.ToPoint();
then it tries to destroy the original value, but since the index is invalid it runs into the assert in api\include\opentelemetry\nostd\internal\absl\types\internal\variant.h:426
.
I've compiled the OpenTelemetry with STL + cpp17 support, and the issue is gone. So it seems the variant
in Abseil is bogus. Can someone tell me which abseil version was used in OpenTelemetry 1.8.3?
I've compiled the OpenTelemetry with STL + cpp17 support, and the issue is gone. So it seems the
variant
in Abseil is bogus. Can someone tell me which abseil version was used in OpenTelemetry 1.8.3?
@elnoir the versions used can be found in https://github.com/open-telemetry/opentelemetry-cpp/blob/main/third_party_release
I am having exactly the same kind of issues described by @elnoir. Except not using the metrics SDK, but trying to get propagation to working on a GRPC server.
There seems to be some kind of issue with the abseil version in GRPC and opentelemetry conflicting?
I am using vcpkg to install both grpc and opentelemetry-cpp. ("builtin-baseline": "c8696863d371ab7f46e213d8f5ca923c4aef2a00") I tried customizing the abseil build used in vcpkg, in order to customize the cmake flags, but to no avail.
Any help would be greatly appreciated.
@StefanVanDyck
I also had the same issue recently.
The problem was that some code is compiled with HAVE_ABSEIL, other code is not, causing the build to be incoherent.
When building the opentelemetry-cpp library, HAVE_ABSEIL will only be defined if the CMake option WITH_ABSEIL is set.
I think we may need to cleanup this area, and use something like OTEL_HAVE_ABSEIL to avoid side effects with other codes, like grpc.
@marcalff
Thank you so much for the response, I was slowly loosing my sanity trying to figure out what I was doing wrong. (Not an experienced C++ developer)
The vcpkg version of opentelemetry should be compiled using the -DWITH_ABSEIL=ON
cmake flag.
I tried using -DWITH_ABSEIL=OFF -DWITH_STL=ON
in the opentelemetry-cpp build, but then I end up getting weirdness with the shared_ptr destructor causing a SIGSEGV.
So I guess what you are saying, is that the otel GRPC exporter (which I am also using) is compiled using GRPC with abseil. Whereas the actual GRPC library is not? So I might try overriding the GRPC build to make sure it is actually using abseil?
Again, thanks for the help!
@StefanVanDyck
I also had the same issue recently.
The problem was that some code is compiled with HAVE_ABSEIL, other code is not, causing the build to be incoherent.
When building the opentelemetry-cpp library, HAVE_ABSEIL will only be defined if the CMake option WITH_ABSEIL is set.
I think we may need to cleanup this area, and use something like OTEL_HAVE_ABSEIL to avoid side effects with other codes, like grpc.
Does HAVE_ABSEIL
will be automatically added when link opentelemetry_api
when using cmake and it's compiled with WITH_ABSEIL
? And all components depends opentelemetry_api
and this macro should be auto declared when link any component of otel-cpp. I think this problem will only happen when users use cmake to compile otel-cpp and do not use cmake CONFIG package to import it. Just like my comments before , some project use configure_file to generate a header file to put all environemnt detection options in it, but I'm not familar with bazel and don't know how to do it with bazel. Or can we add a generated file just for cmake to avoid this problem?
BTW: I think gRPC do not solve this problem, because when we compile and use it with different versions of abseil-cpp, it won't work.
@owent
I am using these lines in my CmakeLists.txt
to link to the opentelemetry libraries.
find_package(opentelemetry-cpp CONFIG REQUIRED)
target_link_libraries(${PROJECT_NAME} PRIVATE ${OPENTELEMETRY_CPP_LIBRARIES})
Is that what you mean when you mention the "cmake CONFIG package"? Should I do it in some other way? Thanks!
Success! I think?
Finally go things to work using the standard vcpkg libs after much trail and error. Not entirely sure what the magic ingredients was, though adding
#define OPENTELEMETRY_STL_VERSION 2017
before including any opentelemetry headers might have been important?
Edit 1: This just caused other issues, my gTest test suite suddenly started working again, but then my normal executable target got weird... (i.e. SIGSEGV on creating a span context) Sorry for polluting this issue, pretty clear I am in a bit over my head here.
Edit 2: Actual Success! 🤞 Modified the vcpkg build using an overlay to have:
-DCMAKE_CXX_STANDARD=20
-DWITH_ABSEIL=OFF
-DWITH_STL=CXX20
And included #define OPENTELEMETRY_STL_VERSION 2020
before including any opentel headers.
Thank you for the help.
@owent
I am using these lines in my
CmakeLists.txt
to link to the opentelemetry libraries.find_package(opentelemetry-cpp CONFIG REQUIRED) target_link_libraries(${PROJECT_NAME} PRIVATE ${OPENTELEMETRY_CPP_LIBRARIES})
Is that what you mean when you mention the "cmake CONFIG package"? Should I do it in some other way? Thanks!
Yes, it is. But why do you target_link_libraries(${PROJECT_NAME} PRIVATE ${OPENTELEMETRY_CPP_LIBRARIES})
instead of target_link_libraries(${PROJECT_NAME} PUBLIC ${OPENTELEMETRY_CPP_LIBRARIES})
?Only macros, compile options and link options of public dependencies will be auto added, if there is any target depends this ${PROJECT_NAME}
and ${PROJECT_NAME}
is also exported as CONFIG package.
I ran into the same issue with C++ 17, opentelemetry-cpp-1.13.0, with Bazel 5.4.1.
main: bazel-out/k8-fastbuild/bin/external/io_opentelemetry_cpp/api/_virtual_includes/api/opentelemetry/nostd/./internal/absl/types/../types/internal/variant.h:426: absl::otel_v1::variant_internal::VisitIndicesSwitch<EndIndex>::Run(Op&&, std::size_t) [with Op = absl::otel_v1::variant_internal::VariantCoreAccess::MoveAssignVisitor<absl::otel_v1::variant_internal::VariantMoveAssignBaseNontrivial<bool, int, unsigned int, long int, double, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<bool, std::allocator<bool> >, std::vector<int, std::allocator<int> >, std::vector<unsigned int, std::allocator<unsigned int> >, std::vector<long int, std::allocator<long int> >, std::vector<double, std::allocator<double> >, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, long unsigned int, std::vector<long unsigned int, std::allocator<long unsigned int> >, std::vector<unsigned char, std::allocator<unsigned char> > > >; long unsigned int EndIndex = 15; absl::otel_v1::variant_internal::VisitIndicesResultT<Op, long unsigned int> = void; std::size_t = long unsigned int]::<lambda()>: Assertion `false && "i == variant_npos"' failed.
I tried the above suggestion, adding or removing HAVE_ABSEIL, HAVE_CPP_STDLIB, but it makes no difference. Problem solved (at least for me) by upgrading to the latest client version, which at the time 1.14.1. I have the HAVE_ABSEIL macro on with the latest version and it succeeded.