added the ttrpc stress utility

Open rawahars opened this issue 1 year ago • 3 comments

Summary

This pull request introduces a new utility, ttrpc-stress, designed to stress-test TTRPC connections. The tool implements a simple client-server interaction: the client sends requests continuously, and the server responds with the same data. This exercises concurrent request handling and lets the client verify each response.

The stress utility was developed primarily to reproduce deadlock issues between the client and server, particularly when the two are built against different TTRPC versions. It is adapted from https://github.com/kevpar/ttrpcstress, which was written by Kevin Parsons (https://github.com/kevpar).

Version Compatibility Testing

Known Issues in TTRPC Versions

Please refer to https://github.com/containerd/ttrpc/issues/72 for more information about the deadlock bug.

| Version Range     | Description                       | Comments                           |
|-------------------|-----------------------------------|------------------------------------|
| v1.0.2 and before | Original deadlock bug             | #94 for fixing deadlock in v1.1.0  |
| v1.1.0 - v1.2.0   | No known deadlock bugs            |                                    |
| v1.2.0 - v1.2.4   | Streaming with a new deadlock bug | #107 introduced deadlock in v1.2.0 |
| After v1.2.4      | No known deadlock bugs            | #168 for fixing deadlock in v1.2.4 |

The goal is to test the current version of TTRPC, which is used to build ttrpc-stress, against the following older versions in both server and client scenarios:

  • v1.0.2
  • v1.1.0
  • v1.2.0
  • v1.2.4
  • latest

Future Work

  • This PR is the initial cut of the tool, introduced in the latest version.
  • We will backport the tool to the older versions, i.e. v1.0.2, v1.1.0, v1.2.0, and v1.2.4.
  • A test will be added on the main branch that pulls the older tags containing the backported tool and then runs the stress test matrix for the server and client scenarios.

Note that we are on step 1 of the overall plan.

rawahars, Jan 14 '25 10:01

@dmcgowan @kzys Can you please take a look at this addition?

rawahars, Jan 15 '25 15:01

Hi @rawahars, thanks for working on this.

I see right now you are checking in a stress utility that can be run to stress-test a TTRPC server/client connection.

While this is a good starting point for the work, I think we need to have a plan for how this will be integrated into tests (and ultimately, run in CI) in this repo. This may change the approach taken in this PR. For instance, it may be better to check in tests using the stress code, rather than a binary that must be built.

I would recommend you outline first, here or in an issue, the overall approach to introducing CI testing for TTRPC deadlock issues. Then review can proceed from there.

kevpar, Jan 15 '25 20:01

Thanks @kevpar. I have filed the issue https://github.com/containerd/ttrpc/issues/184 for discussions related to the introduction of regression testing for identifying potential deadlocks.

rawahars, Jan 18 '25 13:01