
Error while exporting Metrics

Open Veeraraghavans opened this issue 2 years ago • 15 comments

Hello team,

I'm trying to use opentelemetry-cpp version 1.8.1 to export metrics from an Ubuntu 22.04 machine. My plugin code creates the agents and the provider that export the metrics. When I try to create the metrics provider, I get an allocation error. I'm not sure what's causing this error.

terminate called after throwing an instance of '
std::bad_alloc'
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc  what():  std::bad_alloc

I did some analysis with gdbgui to narrow down the problem and found that when MetaDataValidator is called, it triggers the regex construction, and the failure happens in the allocation inside it.

image

It would be great if anyone has an idea about this. I have been stuck on it for a while, so any input would be welcome. Happy to provide more details if needed.

Veeraraghavans avatar Nov 03 '23 13:11 Veeraraghavans

@Veeraraghavans Which compiler? Also, do you have the sample code which is failing?

lalitb avatar Nov 03 '23 16:11 lalitb

Hi @lalitb

I use gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0. Here is the snippet of code I use to create the Meter:

nostd::shared_ptr<metrics_api::Meter> MetricAgent::GetMeter()
{
    auto provider = metrics_api::Provider::GetMeterProvider();
    return provider->GetMeter(this->serviceName, OPENTELEMETRY_SDK_VERSION);
}

More information:

When I call GetMeter, it calls GetMeter on the MeterProvider. While the Meter is being created in OpenTelemetry, InstrumentDataValidator is called, and that is where the regex error is thrown.

std::__cxx11::basic_regex<char, std::__cxx11::regex_traits<char> >::basic_regex<std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::regex_constants::syntax_option_type)
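The frame above is the `std::regex` constructor from the standard library. One way to narrow this down is to check whether that constructor also fails on its own when built with the same flags as the plugin. The following is a minimal sketch only: the pattern and the names `kAssumedNamePattern` and `ProbeRegex` are hypothetical and are not taken from the opentelemetry-cpp sources.

```cpp
#include <new>
#include <regex>
#include <string>

// Hypothetical pattern, for illustration only; not copied from the SDK.
static const char *kAssumedNamePattern = "[a-zA-Z][a-zA-Z0-9_.]{0,62}";

enum class RegexProbe
{
  kMatch,       // regex was built and the value matched
  kNoMatch,     // regex was built and the value did not match
  kRegexError,  // the pattern itself was rejected (std::regex_error)
  kBadAlloc     // an allocation failed while building or matching
};

// Builds a std::regex from `pattern`, matches `value` against it, and
// reports which of the outcomes above occurred instead of terminating.
RegexProbe ProbeRegex(const std::string &pattern, const std::string &value)
{
  try
  {
    const std::regex re(pattern);
    return std::regex_match(value, re) ? RegexProbe::kMatch : RegexProbe::kNoMatch;
  }
  catch (const std::regex_error &)
  {
    return RegexProbe::kRegexError;
  }
  catch (const std::bad_alloc &)
  {
    return RegexProbe::kBadAlloc;
  }
}
```

Calling `ProbeRegex(kAssumedNamePattern, "abc_plugin")` from inside the plugin, just before the failing `GetMeter` call, would show whether plain `std::regex` construction already misbehaves in that process or whether the failure is specific to the call through the SDK.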

image

Please let me know whether the information shared is enough or whether you need more.

Veeraraghavans avatar Nov 06 '23 08:11 Veeraraghavans

@Veeraraghavans - Do you get a similar crash while running - https://github.com/open-telemetry/opentelemetry-cpp/tree/main/examples/metrics_simple? Also, what is the otel-cpp version you are using? If it is from the main branch, do you also see the crash with v1.12.0?

lalitb avatar Nov 07 '23 19:11 lalitb

No, I am not getting a crash there. I could run the metrics_simple example you shared. The OpenTelemetry version I use is 1.8.1, built from the 1.8.1 branch.

Veeraraghavans avatar Nov 08 '23 08:11 Veeraraghavans

Sorry, the stack trace is not enough for me to debug further. I can't see why allocation should fail in regex init. Leaving this open in case someone else wants to comment or debug. Otherwise, it would be helpful if you could provide sample code (not just the snippet) that fails consistently.

lalitb avatar Nov 09 '23 06:11 lalitb

@lalitb thanks for your reply. Do you have any idea about common reasons for an allocation failure at regex init? I can share the part of the code that fails, as it is proprietary code. I will check on giving access.

Veeraraghavans avatar Nov 09 '23 08:11 Veeraraghavans

@Veeraraghavans It would be more helpful if you could share an example (along similar lines to https://github.com/open-telemetry/opentelemetry-cpp/tree/main/examples/metrics_simple ) that crashes on regex init. Something that compiles easily and reproduces the problem, so we can debug further.

lalitb avatar Nov 13 '23 19:11 lalitb

@lalitb please find below the code that crashes during execution. The code has 2 parts. The first is

plugin.cpp - is the main code which creates the resources, Metric Agent.

#include "agents/MetricAgent.h"
void main()
{
    int processID = GetProcessID();
    //Create opentelemetry-cpp Resource to attach it to the telemetry data
    resource::ResourceAttributes attributes = {{"service.name", "ABC_PLUGIN"}, {"version", "latest"}, {"process_id", GetProcessID()}};   
    auto resource = resource::Resource::Create(attributes);
    std::string endpoint = "localhost:4317/v1/metrics";
    static ObservabilityPlugin::MetricAgent metricAgent( ABC_PLUGIN, GRPC, endpoint, resource);
    metricAgent.ActivateMetricType(ObservabilityPlugin::DefaultMetrics::All);  // Code calls ActivateMetricType function in MetricAgent.cpp
}

// Get process Id:

int GetProcessID(){
  C_Communicator* com = C_Communicator::Instance();
  if(com == nullptr) return 0;
  if(com && com->size() > 1)  
    return com->cpuNum();
  else                        
    return 0;
}

agents/MetricAgent.cpp code - Creates Metric Exporter and Provider.

MetricAgent::MetricAgent(const std::string& serviceName, const std::string& protocol, const std::string& endpoint, resource::Resource resource, unsigned int frequency)
{
    this->serviceName = serviceName;
    auto attr = resource.GetAttributes();
    auto it = attr.find("process_id");
    if(it != attr.end()){
        this->processID = nostd::get<int>(it->second);
    }
    std::unique_ptr<metric_sdk::PushMetricExporter> exporter;
    this->metricGRPCExporterOptions.aggregation_temporality = metric_sdk::AggregationTemporality::kCumulative;
    this->metricGRPCExporterOptions.endpoint = endpoint;
    exporter = otlp::OtlpGrpcMetricExporterFactory::Create(metricGRPCExporterOptions);
    metric_sdk::PeriodicExportingMetricReaderOptions metricReaderOptions;
    metricReaderOptions.export_interval_millis = std::chrono::milliseconds(frequency);
    metricReaderOptions.export_timeout_millis  = std::chrono::milliseconds(frequency/2);
    std::unique_ptr<metric_sdk::MetricReader> reader{new metric_sdk::PeriodicExportingMetricReader(std::move(exporter), metricReaderOptions)};
    auto provider = std::shared_ptr<metrics_api::MeterProvider>(new metric_sdk::MeterProvider(std::unique_ptr<metric_sdk::ViewRegistry>(new metric_sdk::ViewRegistry()), resource));
    auto p        = std::static_pointer_cast<metric_sdk::MeterProvider>(provider);
    p->AddMetricReader(std::move(reader));
    metrics_api::Provider::SetMeterProvider(provider);
}

// Function calls Metrics Meter Provider for adding Metrics counters
void MetricAgent::ActivateMetricType(DefaultMetrics type)
{
    auto meter = this->GetMeter();
    // This is where the error is thrown: the GetMeter call above, reached via ActivateMetricType from plugin.cpp
    switch (type)
    {
        //......
    }
}

nostd::shared_ptr<metrics_api::Meter> MetricAgent::GetMeter()
{
    auto provider = metrics_api::Provider::GetMeterProvider();
    return provider->GetMeter(this->serviceName, OPENTELEMETRY_SDK_VERSION);
}

Veeraraghavans avatar Nov 14 '23 15:11 Veeraraghavans

Given that the regexp crashes on the name given to GetMeter(), what is the actual value of serviceName?

Does it look properly initialized?
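One way to answer this at the call site is to inspect the string just before it is passed to GetMeter(). The following is a minimal sketch, assuming a hypothetical helper `InspectName` and an arbitrary 256-byte size limit; neither is part of opentelemetry-cpp. Note that reading a genuinely corrupted `std::string` is itself undefined behavior, so an implausible report is a hint, not proof.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Summary of a name as seen at the call site.
struct NameReport
{
  std::size_t size;  // reported length of the string
  bool printable;    // true if every byte is a printable character
  bool plausible;    // non-empty, within the size limit, and printable
};

// Hypothetical helper: checks that `name` has a sane length and contains
// only printable characters. The size is checked first so that a wildly
// wrong length is reported without walking the whole buffer.
NameReport InspectName(const std::string &name, std::size_t max_size = 256)
{
  NameReport report{name.size(), true, false};
  if (report.size > max_size)
  {
    report.printable = false;
    return report;
  }
  for (unsigned char c : name)
  {
    if (!std::isprint(c))
    {
      report.printable = false;
      break;
    }
  }
  report.plausible = report.size > 0 && report.printable;
  return report;
}
```

Logging `InspectName(this->serviceName)` inside `MetricAgent::GetMeter()` would show whether the member still holds the expected value at the moment of the call, as opposed to when the constructor ran.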

marcalff avatar Nov 15 '23 16:11 marcalff

It gets the following values: serviceName="abc_plugin" in the example and OPENTELEMETRY_SDK_VERSION=1.8.1. I think it is initialized correctly, since MetricAgent::MetricAgent(const std::string& serviceName, const std::string& protocol, const std::string& endpoint, resource::Resource resource, unsigned int frequency) executed fine, but I run into the issue when I call GetMeter.

Is there a way to enable logs, or some other way to check what is happening?

Veeraraghavans avatar Nov 15 '23 21:11 Veeraraghavans

Hey @lalitb @marcalff,

Do you think using the -D_GLIBCXX_USE_CXX11_ABI flag could cause this issue? Or have you found any other possible reason? Any input will be helpful.
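For context on this question: with libstdc++, `_GLIBCXX_USE_CXX11_ABI` selects between two incompatible `std::string` layouts, so code built with different values cannot safely pass `std::string` objects to each other. The `std::__cxx11::basic_string` in the trace earlier in this thread indicates that frame uses the new ABI. Whether a mismatch is the cause here is not established; the sketch below only shows how each side could report the value it was compiled with. The function names are hypothetical.

```cpp
#include <string>  // any libstdc++ header defines _GLIBCXX_USE_CXX11_ABI

// Returns 1 or 0 for the libstdc++ dual-ABI setting this translation unit
// was compiled with, or -1 if the macro is not defined (for example, when
// a different standard library is in use).
int CompiledCxx11AbiSetting()
{
#if defined(_GLIBCXX_USE_CXX11_ABI)
  return _GLIBCXX_USE_CXX11_ABI;
#else
  return -1;
#endif
}

// Human-readable form of the value above, suitable for a log line.
std::string DescribeCxx11AbiSetting()
{
  switch (CompiledCxx11AbiSetting())
  {
    case 1:
      return "_GLIBCXX_USE_CXX11_ABI=1 (new std::string ABI)";
    case 0:
      return "_GLIBCXX_USE_CXX11_ABI=0 (old std::string ABI)";
    default:
      return "_GLIBCXX_USE_CXX11_ABI is not defined";
  }
}
```

Compiling this once with the plugin's flags and once with the flags used to build the opentelemetry-cpp libraries, then comparing the two results, would confirm or rule out a mismatch between the two builds.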

Veeraraghavans avatar Nov 20 '23 10:11 Veeraraghavans

Hi @marcalff @lalitb

Did you get any ideas on this? I tried debugging inside the SDK. The error occurs at the regex validation; the values passed are exactly "mapdl_plugin" and "1.8.1". When I disable the validation, the code proceeds but then fails when creating the counter on the Meter.

[Error] File: /home/vsekar/observability-plugins/source/opentelemetry-cpp-v1.8/sdk/src/metrics/meter.cc:46Meter::CreateUInt64Counter - failed. Invalid parameters.mapdl_plugin_counter_nb_of_processes Number of processes . Measurements won't be recorded.

The entire code works fine in the other example but fails when I call it from my plugin code.

Veeraraghavans avatar Nov 30 '23 16:11 Veeraraghavans

This issue was marked as stale due to lack of activity.

github-actions[bot] avatar Jan 30 '24 01:01 github-actions[bot]