dd-trace-js icon indicating copy to clipboard operation
dd-trace-js copied to clipboard

Fix unhandled rejection in kakfa commit tracking

Open jbertran opened this issue 1 year ago • 5 comments

What does this PR do?

Fix unhandled promise rejection in kafkajs producer instrumentation.

Taken from #4089 (thanks @Gilbert142!), removed the additional, unnecessary branch - we can do all the work inside the async hook resolution, since we don't have anything to do with commits in case of error.

I've added a test which, for some reason, always passes, despite reproducing the upstream error described in #4039 - an easy way to check is to apply the following patch:

diff --git a/packages/datadog-instrumentations/src/kafkajs.js b/packages/datadog-instrumentations/src/kafkajs.js
index 4e66f4cf0..9b063fb00 100644
--- a/packages/datadog-instrumentations/src/kafkajs.js
+++ b/packages/datadog-instrumentations/src/kafkajs.js
@@ -81,6 +81,8 @@ addHook({ name: 'kafkajs', file: 'src/index.js', versions: ['>=1.4'] }, (BaseKaf
             })
           )

+          result.then(() => {})
+
           return result
         } catch (e) {
           producerErrorCh.publish(e)

Whether or not this is there, the test succeeds, and looking at output yields a caught Kafka error, and no unhandled rejection. I'm probably being thick or missing something - second pair of 👀 welcome

Plugin Checklist

  • [x] Unit tests.

Security

Datadog employees:

  • [ ] If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • [x] This PR doesn't touch any of that.

Closes #4089 Fixes #4039

jbertran avatar Feb 27 '24 12:02 jbertran

Overall package size

Self size: 6.26 MB Deduped: 60.76 MB No deduping: 61.04 MB

Dependency sizes

name version self size total size
@datadog/native-iast-taint-tracking 1.7.0 16.71 MB 16.72 MB
@datadog/native-appsec 7.1.1 14.39 MB 14.4 MB
@datadog/pprof 5.2.0 8.84 MB 9.21 MB
protobufjs 7.2.5 2.77 MB 6.56 MB
@datadog/native-iast-rewriter 2.3.0 2.15 MB 2.24 MB
@opentelemetry/core 1.14.0 872.87 kB 1.47 MB
@datadog/native-metrics 2.0.0 898.77 kB 1.3 MB
@opentelemetry/api 1.4.1 780.32 kB 780.32 kB
import-in-the-middle 1.7.3 67.62 kB 731.01 kB
msgpack-lite 0.1.26 201.16 kB 281.59 kB
opentracing 0.14.7 194.81 kB 194.81 kB
semver 7.5.4 93.4 kB 123.8 kB
pprof-format 2.1.0 111.69 kB 111.69 kB
@datadog/sketches-js 2.1.0 109.9 kB 109.9 kB
lodash.sortby 4.7.0 75.76 kB 75.76 kB
lru-cache 7.14.0 74.95 kB 74.95 kB
ipaddr.js 2.1.0 60.23 kB 60.23 kB
ignore 5.2.4 51.22 kB 51.22 kB
int64-buffer 0.1.10 49.18 kB 49.18 kB
shell-quote 1.8.1 44.96 kB 44.96 kB
istanbul-lib-coverage 3.2.0 29.34 kB 29.34 kB
tlhunter-sorted-set 0.1.0 24.94 kB 24.94 kB
limiter 1.1.5 23.17 kB 23.17 kB
dc-polyfill 0.1.4 23.1 kB 23.1 kB
retry 0.13.1 18.85 kB 18.85 kB
node-abort-controller 3.1.1 16.89 kB 16.89 kB
jest-docblock 29.7.0 8.99 kB 12.76 kB
crypto-randomuuid 1.0.0 11.18 kB 11.18 kB
path-to-regexp 0.1.7 6.78 kB 6.78 kB
koalas 1.0.2 6.47 kB 6.47 kB
methods 1.1.2 5.29 kB 5.29 kB
module-details-from-path 1.0.3 4.47 kB 4.47 kB

🤖 This report was automatically generated by heaviest-objects-in-the-universe

github-actions[bot] avatar Feb 27 '24 12:02 github-actions[bot]

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 85.23%. Comparing base (bf9f356) to head (45676fc).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4112      +/-   ##
==========================================
- Coverage   85.37%   85.23%   -0.14%     
==========================================
  Files         247      247              
  Lines       10961    10961              
  Branches       33       33              
==========================================
- Hits         9358     9343      -15     
- Misses       1603     1618      +15     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Feb 27 '24 12:02 codecov[bot]

Benchmarks

Benchmark execution time: 2024-04-04 14:20:54

Comparing candidate commit 45676fc0556a257ae266b9c5cdff6cbf7c2dc5e3 in PR branch jbertran/fix-kafka-commit-unhandled with baseline commit bf9f35642e87b44bcb9048ff31f588359f09c7ad in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 260 metrics, 6 unstable metrics.

pr-commenter[bot] avatar Feb 27 '24 13:02 pr-commenter[bot]

@jbertran Thanks for moving on this so quickly! 🎉

On the new unit test, I think the .catch will always fire with or without the patch because we do return the original promise, we just don't return the second .then branch. So the original promise is returned and properly caught, but that second .then branch creates a new promise which is never returned, and never caught. To test it we'd probably need to bind a listener to the unhandledRejection event, and fail the test if we see anything in there. (Unless mocha already fails on those by default; I'm not sure.)

Gilbert142 avatar Feb 28 '24 12:02 Gilbert142

On the new unit test, I think the .catch will always fire with or without the patch because we do return the original promise, we just don't return the second .then branch. So the original promise is returned and properly caught, but that second .then branch creates a new promise which is never returned, and never caught. To test it we'd probably need to bind a listener to the unhandledRejection event, and fail the test if we see anything in there. (Unless mocha already fails on those by default; I'm not sure.)

Makes sense. A quick look reveals that mocha sets its own black-hole handler for unhandledRejection - adding a rethrowing handler makes the issue (and the fix) apparent. I'm removing said test though, since it's artificial now that my initial oversight is fixed.

Thanks!

jbertran avatar Feb 28 '24 13:02 jbertran

Hey all, is there anything I can do to help land this fix?

Gilbert142 avatar Apr 03 '24 13:04 Gilbert142