
Blocker: SDK infrequently fails to create transactions, not enough information to debug

Open jayv opened this issue 4 years ago • 0 comments

Describe the bug

In one of our applications we infrequently see failures when calling newrelic_start_web_transaction(): it returns a NULL pointer. It would be good to provide more context so that the application can take appropriate action, and also to provide more debug logs in the SDK so we can understand the root cause of the failure. Currently we have no way of figuring out what happened, nor how the application logic should deal with it.

Expected Behavior

Creating transactions should never fail after establishing a connection.

Troubleshooting

This is what we see in our logs, with debug logging turned on for the SDK, after successfully sending thousands of transactions:

[...]
2020-09-03 00:04:59.999 +0000 (5547 5560) verbosedebug: sending transaction message, len=1540
2020-09-03 00:05:00.009 +0000 (5547 5560) error: unable to start transaction
2020-09-03 00:05:00.009 +0000 (5547 5561) verbosedebug: querying app='XXXX_REDACTED' from parent=5
[2020-09-03 00:05:00.009] [thread-5560] [instrumentation] [error] Newrelic: failed to create Transaction (YYYY_REDACTED)
2020-09-03 00:05:00.010 +0000 (5547 5561) verbosedebug: sending appinfo message, len=432

What I notice is that we first see error: unable to start transaction and then see verbosedebug: querying app='XXXX_REDACTED' from parent=5 in a new thread, which is something we typically only see once at the start of the application. This suggests it lost the connection to the agent, but it's running on localhost...

Steps to Reproduce

Send a bunch of metrics from various threads; eventually, creating a web transaction fails.

Your Environment

SDK v 1.3.0

Feature Request

Instead of just returning a pointer from newrelic_start_web_transaction(), take in a pointer to a structure that will be filled in with either the pointer to the new transaction or an error message and code, so that the application can judge what to do next. E.g. crash hard if it's a fatal error, or buffer metrics for some time if it's only temporary, etc.
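A rough sketch of the shape this could take. None of this exists in the SDK today: every `example_*` name, the status codes, and the "connected" flag are hypothetical stand-ins made up for illustration, and the stub below only simulates a lost connection rather than calling the real newrelic_start_web_transaction().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes; the SDK does not define these. */
typedef enum {
  EXAMPLE_TXN_OK = 0,
  EXAMPLE_TXN_ERR_TEMPORARY, /* e.g. connection lost; retrying later may work */
  EXAMPLE_TXN_ERR_FATAL      /* e.g. invalid arguments; retrying will not help */
} example_txn_status_t;

/* Hypothetical stand-ins for the SDK's app and transaction handles. */
typedef struct { int connected; } example_app_t;
typedef struct { char name[64]; } example_txn_t;

/* Hypothetical result structure filled in by the start call. */
typedef struct {
  example_txn_t* txn;          /* non-NULL only when status is EXAMPLE_TXN_OK */
  example_txn_status_t status;
  char message[128];           /* human-readable reason on failure */
} example_txn_result_t;

/* Fill in `out` instead of returning a bare pointer. */
example_txn_status_t example_start_web_transaction(const example_app_t* app,
                                                   const char* name,
                                                   example_txn_result_t* out) {
  assert(out != NULL);
  out->txn = NULL;
  out->message[0] = '\0';

  if (app == NULL || name == NULL) {
    out->status = EXAMPLE_TXN_ERR_FATAL;
    snprintf(out->message, sizeof out->message, "%s", "invalid argument");
  } else if (!app->connected) {
    out->status = EXAMPLE_TXN_ERR_TEMPORARY;
    snprintf(out->message, sizeof out->message, "%s", "no connection to daemon");
  } else if ((out->txn = malloc(sizeof *out->txn)) == NULL) {
    out->status = EXAMPLE_TXN_ERR_TEMPORARY;
    snprintf(out->message, sizeof out->message, "%s", "out of memory");
  } else {
    out->status = EXAMPLE_TXN_OK;
    snprintf(out->txn->name, sizeof out->txn->name, "%s", name);
  }
  return out->status;
}

/* Release a transaction created above and clear the caller's pointer. */
void example_end_transaction(example_txn_t** txn) {
  if (txn != NULL && *txn != NULL) {
    free(*txn);
    *txn = NULL;
  }
}

/* What the application could then decide, based on the status code. */
typedef enum {
  EXAMPLE_ACTION_PROCEED, /* transaction started, carry on */
  EXAMPLE_ACTION_BUFFER,  /* temporary failure: buffer metrics and retry */
  EXAMPLE_ACTION_ABORT    /* fatal failure: crash hard or disable reporting */
} example_action_t;

example_action_t example_decide(const example_txn_result_t* result) {
  switch (result->status) {
    case EXAMPLE_TXN_OK:            return EXAMPLE_ACTION_PROCEED;
    case EXAMPLE_TXN_ERR_TEMPORARY: return EXAMPLE_ACTION_BUFFER;
    default:                        return EXAMPLE_ACTION_ABORT;
  }
}
```

With something like this, our instrumentation layer could log `message` next to the failed transaction name and branch on `status`, instead of only seeing a NULL pointer.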

jayv, Sep 03 '20 01:09