[BUG] Cosmos hangs forever with CosmosEndToEndOperationLatencyPolicyConfig set

Open lnist opened this issue 8 months ago • 4 comments

Describe the bug Some operations cause the Cosmos SDK to hang forever, and others do not respect the timeout set by CosmosEndToEndOperationLatencyPolicyConfig.

It seems the hangs occur for operations that span partitions.
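For readers unfamiliar with the setting, here is a minimal sketch of how an end-to-end latency policy is typically attached to a client with azure-cosmos 4.x. It is not the code from the linked repository; the class name and the environment variable names `COSMOS_ENDPOINT` and `COSMOS_KEY` are assumptions made for illustration, and the 1 second value matches the timeout described in this report.

```java
import com.azure.cosmos.CosmosAsyncClient;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.CosmosEndToEndOperationLatencyPolicyConfig;
import com.azure.cosmos.CosmosEndToEndOperationLatencyPolicyConfigBuilder;

import java.time.Duration;

// Illustrative only: requires the com.azure:azure-cosmos dependency and a reachable account.
public class EndToEndTimeoutConfigSketch {
    public static void main(String[] args) {
        // Hypothetical environment variable names; substitute your own configuration source.
        String endpoint = System.getenv("COSMOS_ENDPOINT");
        String key = System.getenv("COSMOS_KEY");

        // Ask the SDK to give up on any single operation after 1 second end to end.
        CosmosEndToEndOperationLatencyPolicyConfig policy =
            new CosmosEndToEndOperationLatencyPolicyConfigBuilder(Duration.ofSeconds(1)).build();

        // Apply the policy at client level so every operation inherits it.
        try (CosmosAsyncClient client = new CosmosClientBuilder()
                .endpoint(endpoint)
                .key(key)
                .endToEndOperationLatencyPolicyConfig(policy)
                .buildAsyncClient()) {
            System.out.println("Client built with a 1 second end-to-end timeout policy");
        }
    }
}
```

With this in place, the expectation in this report is that any operation issued through `client` fails once the 1 second budget is exceeded.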

To Reproduce See this example repository and test: https://github.com/lnist/cosmos-sdk-hang/blob/main/src/test/java/cosmosTimeouts.java

Before running the test, you need to fill in the connection string and master key for Cosmos.

The test uses WireMock to simulate a delay in accessing the Cosmos backend. A self-signed certificate is used for this, since the Cosmos SDK insists on using HTTPS.

When you run the tests, all of them are expected to fail with a timeout from the Cosmos SDK. That does not happen.

The readAllContainers and properties tests both return the desired data, but they take longer than the configured timeout of 1 second. They should fail instead.

The readNonDefaultPartitionKey, count, readAll, and writeBulk tests all respect the timeout of 1 second if the DELAY parameter is set to 2_000, but they hang forever (until the test timeout of 1 minute) if the DELAY parameter is set to 10_000.

Note: The code includes a couple of configurations that I think are redundant, but they were used during extensive testing, so I did not want to change them. A quick test without them seems to indicate that the issues are also present with default parameters (except, of course, for the CosmosEndToEndOperationLatencyPolicyConfig).

Code Snippet Add the code snippet that causes the issue.

Expected behavior The API uses the configured timeout.
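To make the expectation concrete, here is a small stand-alone sketch using only the JDK (Java 9+). It does not use the Cosmos SDK and says nothing about how the SDK implements its policy; `slowCall` and `timesOut` are hypothetical helpers that only model the semantics expected here: once the delay exceeds the timeout, the caller gets a timeout error no matter how long the backend takes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ExpectedTimeoutSketch {

    /** Simulates a backend call that answers only after delayMillis. */
    static CompletableFuture<String> slowCall(long delayMillis) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "response";
        });
    }

    /** Returns true if the call failed with a timeout, false if it completed in time. */
    static boolean timesOut(long delayMillis, long timeoutMillis) {
        try {
            slowCall(delayMillis)
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                .join();
            return false;
        } catch (CompletionException e) {
            // join() wraps the TimeoutException raised by orTimeout.
            return e.getCause() instanceof TimeoutException;
        }
    }

    public static void main(String[] args) {
        // Same values as in the report: 1 second timeout, DELAY of 2_000 and 10_000.
        System.out.println("DELAY=2_000 times out:  " + timesOut(2_000, 1_000));
        System.out.println("DELAY=10_000 times out: " + timesOut(10_000, 1_000));
    }
}
```

In this model both delays produce a timeout after roughly 1 second, which is the behavior the report expects from the SDK for DELAY values of 2_000 and 10_000 alike.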

Setup (please complete the following information):

  • OS: Windows 11
  • IDE: IntelliJ
  • Library/Libraries: com.azure:azure-cosmos:4.61.1
  • Java version: 21
  • App Server/Environment: jupiter test runner
  • Frameworks: N/A

Information Checklist Kindly make sure that you have added all the following information above and check off the required fields; otherwise we will treat the issue as an incomplete report

  • [x] Bug Description Added
  • [x] Repro Steps Added
  • [x] Setup information Added

lnist, Jun 24 '24 14:06