
Core REST API integration tests can time out if run with the default test order

Open GCHQDeveloper314 opened this issue 3 years ago • 1 comment

The core-rest ITs have a problem: if OperationServiceV1IT is run after the V2 tests (potentially just OperationServiceV2IT), it becomes stuck and never finishes, causing a CI timeout.

A TODO comment added to this class under #2299 looks to be related:

https://github.com/gchq/Gaffer/blob/1f8b7431870e5c276d2c8aa9f90eb9ab61eeecb3/rest-api/core-rest/src/test/java/uk/gov/gchq/gaffer/rest/service/v1/OperationServiceV1IT.java#L23

The problem occurs only when the V2 test class is executed first. In #2698 the test execution order was set manually so that the V2 tests are executed second, which avoids the problem.

However, these ITs should be able to run and pass in any order, so the fix above is not a complete solution. Because the test class order is intrinsic to this problem, debugging one class alone may not reveal the cause (which is presumably an interaction between the two classes). A debugger can be attached to the Maven failsafe plugin to help with this. The problem can be seen with shouldReturnChunkedOperationChainElements (in OperationServiceIT.java), which appears to become stuck on a call to readChunkedElements.
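While the root cause is unknown, one option for making the hang show up as an ordinary test failure rather than a CI timeout is to bound the blocking call with a deadline. The following is only a sketch under assumptions: `BoundedCall` and `callWithTimeout` are hypothetical names that do not exist in Gaffer, and it does not fix the underlying interaction between the test classes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not part of Gaffer): runs a possibly blocking call
// on a separate thread and fails fast if it does not finish in time.
final class BoundedCall {
    private BoundedCall() {
    }

    static <T> T callWithTimeout(Callable<T> call, long timeout, TimeUnit unit) throws Exception {
        // Daemon thread so a call that ignores interruption cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "bounded-call");
            thread.setDaemon(true);
            return thread;
        });
        try {
            Future<T> future = executor.submit(call);
            try {
                return future.get(timeout, unit);
            } catch (TimeoutException e) {
                // Interrupt the worker and report the hang as an assertion failure.
                future.cancel(true);
                throw new AssertionError("Call did not finish within " + timeout + " " + unit, e);
            } catch (ExecutionException e) {
                // Rethrow the original exception from the call where possible.
                Throwable cause = e.getCause();
                if (cause instanceof Exception) {
                    throw (Exception) cause;
                }
                throw e;
            }
        } finally {
            executor.shutdownNow();
        }
    }
}
```

In the affected test, the call to readChunkedElements could be wrapped in such a helper with a generous limit, so that a stuck read fails with a stack trace pointing at the hang. JUnit 5's `assertTimeoutPreemptively` offers similar behaviour if the tests already use it.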

GCHQDeveloper314, Jul 04 '22 13:07

More recently, this has started happening in CI again, despite the changes made above.

GCHQDeveloper314, Feb 08 '24 17:02