dd-trace-js
dd-trace-js copied to clipboard
Cache external git call results
What does this PR do?
It's expensive to call programs. And with the previous implementation, git would be called a few times for each test, which causes dramatic slowdowns when there are thousands of tests.
We can safely memoize these values. It would be incredibly unusual to have the current version of the code changed out from under us, and in that circumstance, we already cannot provide any sort of reasonable behavior.
Motivation
Adding the dd-trace agent to our tests in CI caused massive slowdowns:

With this PR in place, there's no longer any obvious hot spot:

Plugin Checklist
- [ ] Unit tests.
- [ ] TypeScript definitions.
- [ ] TypeScript tests.
- [ ] API documentation.
- [ ] CircleCI jobs/workflows.
- [ ] Plugin is exported.
Additional Notes
Not sure if we get any special treatment, but I'm submitting this on behalf of ONE.
And with the previous implementation, git would be called a few times for each test, which causes dramatic slowdowns when there are thousands of tests
@juan-fernandez Doesn't this actually happen only once per plugin?
Generally speaking I agree with the idea of memoizing the result, although we should also make sure that it doesn't happen more often than it should regardless of any optimizations. It would also make sense I think to make sure that the function runs only when the plugin is enabled and not just when it's initialized, to avoid calling it for CI integrations that are not even used. This is especially true since it's highly unlikely that multiple test runners are used at the same time.
@juan-fernandez Do we have any existing benchmarks that would help quantify the difference and avoid future regressions?
And with the previous implementation, git would be called a few times for each test, which causes dramatic slowdowns when there are thousands of tests
@juan-fernandez Doesn't this actually happen only once per plugin?
Generally speaking I agree with the idea of memoizing the result, although we should also make sure that it doesn't happen more often than it should regardless of any optimizations. It would also make sense I think to make sure that the function runs only when the plugin is enabled and not just when it's initialized, to avoid calling it for CI integrations that are not even used. This is especially true since it's highly unlikely that multiple test runners are used at the same time.
yeah it should happen only once per plugin. I'll check that.
Also, agreed with the point: we should only run git commands if the plugin is being used, not just initialized. I'll look into that.
I figured it out--I was using it wrong.
We have a .js file that runs dd.init, since we want to customize things a bit. I put this file in jest's setupFiles field. I thought that meant that it only gets run once, but those files actually get executed once per test.
Functional alternatives:
Leaving this open in case you want to use it to track something else, but things are all resolved on my side.
I figured it out--I was using it wrong.
We have a .js file that runs
dd.init, since we want to customize things a bit. I put this file in jest'ssetupFilesfield. I thought that meant that it only gets run once, but those files actually get executed once per test.Functional alternatives:
Leaving this open in case you want to use it to track something else, but things are all resolved on my side.
Thanks for the update! I think this change still makes sense regardless, so we're working on defining how to land this or something similar to this.
This is not the first time that init is called in a hot path from misuse, so we should probably add a benchmark looping through the method to ensure that it doesn't have too big of a performance impact to avoid causing issues. Basically, misuse of dd-trace should only cause issues with dd-trace, not with the application.
@juan-fernandez are you tracking this change request somewhere else? And if so is it okay if I close this PR?
let's close it