apm-agent-go icon indicating copy to clipboard operation
apm-agent-go copied to clipboard

fix: middleware panics on ended transaction

Open eugene-trifonov opened this issue 3 years ago • 7 comments
trafficstars

When transaction is ended already, apmechov4 middleware panics, because tx.Result is tx.TransactionData.Result, and TransactionData == nil when transaction is ended. This commit fixes the problem.

eugene-trifonov avatar Aug 12 '22 19:08 eugene-trifonov

💚 CLA has been signed

:grey_exclamation: Build Aborted

The PR is not allowed to run in the CI yet

the below badges are clickable and redirect to their specific view in the CI or DOCS Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Start Time: 2023-03-31T17:20:34.510+0000

  • Duration: 4 min 56 sec

Steps errors 2

Expand to view the steps failures

Load a resource file from a library
  • Took 0 min 0 sec . View more details here
  • Description: approval-list/elastic/apm-agent-go.yml
Error signal
  • Took 0 min 0 sec . View more details here
  • Description: githubApiCall: The REST API call https://api.github.com/orgs/elastic/members/eugene-trifonov return the message : java.lang.Exception: httpRequest: Failure connecting to the service https://api.github.com/orgs/elastic/members/eugene-trifonov : httpRequest: Failure connecting to the service https://api.github.com/orgs/elastic/members/eugene-trifonov : Code: 404Error: {"message":"User does not exist or is not a member of the organization","documentation_url":"https://docs.github.com/rest/reference/orgs#check-organization-membership-for-a-user"}

:robot: GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • run benchmark tests : Run the benchmark test.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

apmmachine avatar Aug 12 '22 19:08 apmmachine

@eugene-trifonov thanks for opening the PR!

What is the use case for ending the transaction in the handler, and not in the middleware? It's expected that the middleware ends the transaction.

axw avatar Aug 14 '22 09:08 axw

What is the use case for ending the transaction in the handler, and not in the middleware? It's expected that the middleware ends the transaction.

@axw I found it in case when http connection is patched to be websocket connection. And as a result such transaction longs too much time (hours). I wanted to split such huge transaction to smaller ones to make it more convenient for reviewing, and first transaction is closing right before I patch http connection to websocket connection. Once the websocket connection is ended, I receive panic from this middleware.

eugene-trifonov avatar Aug 15 '22 12:08 eugene-trifonov

@eugene-trifonov thanks for the info.

I wanted to split such huge transaction to smaller ones to make it more convenient for reviewing, and first transaction is closing right before I patch http connection to websocket connection.

Do you still want to capture the transaction created in the middleware, or would it be an option to not create a transaction in the first place for that route?

If that is an option, you could use https://pkg.go.dev/go.elastic.co/apm/module/apmechov4/v2#WithRequestIgnorer to not create these transactions. Then you can create transactions manually in your websocket handler.

axw avatar Aug 16 '22 05:08 axw

@eugene-trifonov thanks for the info.

I wanted to split such huge transaction to smaller ones to make it more convenient for reviewing, and first transaction is closing right before I patch http connection to websocket connection.

Do you still want to capture the transaction created in the middleware, or would it be an option to not create a transaction in the first place for that route?

If that is an option, you could use https://pkg.go.dev/go.elastic.co/apm/module/apmechov4/v2#WithRequestIgnorer to not create these transactions. Then you can create transactions manually in your websocket handler.

@axw Yeah, I could, actually I've tried already, but such solution has some duplications I would want to have. Currently I use apm middleware before logging middleware, so any log has trace.id/transaction.id, and logger writes log about call to endpoint. So either I need to copy-paste logging, checking error for different logging levels and tracing ending with/without error, either I fix apm middleware.

I have chosen second option, because I was thinking it's not cool that middleware can panic and have memory leak (on body.Discard() isn't calling when panics) just because transaction is ended already. This fix just adds flexibility to the middleware I think.

eugene-trifonov avatar Aug 16 '22 06:08 eugene-trifonov

Currently I use apm middleware before logging middleware, so any log has trace.id/transaction.id, and logger writes log about call to endpoint. So either I need to copy-paste logging, checking error for different logging levels and tracing ending with/without error, either I fix apm middleware.

Won't that transaction.id be wrong/irrelevant though? i.e. it would refer to the automatically-generated transaction, which I assume you're discarding?

I have chosen second option, because I was thinking it's not cool that middleware can panic and have memory leak (on body.Discard() isn't calling when panics) just because transaction is ended already. This fix just adds flexibility to the middleware I think.

Yes it adds flexibility, but it's also adding a bit of complexity and sets a precedent for other middleware - hence why I'm a bit wary of the change. I don't think it's unreasonable for code to panic if used in a way that is not intended. I understand it doesn't work for your use case so let's try and explore different options.

Would you be able to share a small program which exhibits the problem? Ideally with the logging you're expecting as well.

axw avatar Aug 24 '22 07:08 axw