Error with TransferTerminationMessage when terminated by Provider
Description - What happened?
When the Provider aborts a transfer because of an internal error, sending the TransferTerminationMessage fails with a follow-up error. As a result, the Consumer's EDC-UI keeps showing the transfer process with rotating bars indefinitely.
Expected Behavior
A TransferTerminationMessage sent by the Provider after it cancels a transfer because of an internal error should also terminate the transfer process at the Consumer, including in the UI.
Observed Behavior
If the Provider immediately cancels a data transfer requested by a Consumer because it encounters an internal error in its setup, such as an unreachable data plane or a failure to build the HttpDataSource, the Provider notifies the Consumer of the termination with a TransferTerminationMessage. However, while sending this message, the Provider also logs a failure: "404 - Transferprocess with corellationId ... not found". As a consequence, the transfer remains in the Requested state at the Consumer, while it has long been in the Terminated state at the Provider.
It is unclear whether the 404 response comes from the Consumer in reply to the termination message from the Provider, or whether it is another internal error of the Provider itself.
Steps to Reproduce
There are no dedicated steps to reproduce this, because it only occurs after an internal error on the Provider's side while initiating a transfer requested by a Consumer. It is therefore a follow-up error.
Context Information
Additional Findings
Occurred in two independent transfer terminations on the provider side:
- Provider: Internal error because Dataplane could not be found -> Provider termination -> Consumer stays in Requested
- Provider: Internal error when processing the HttpDataSource -> Provider termination -> Consumer stays in Requested
Hypothesis / Possible Root Cause
- SO hints that the bug could be caused by an issue in the state machine.
- The Requested state has ID 500, which has no relation to the HTTP status code.
- An "Error 500" could therefore also come from the state machine.
- It is one of our EDCs - we can start debugging
Timeline / Priority
We want to fix this issue before the launch of MDS 2.1 so that on-premise customers of MDS do not have to do a second update at a later stage; this will ensure a smoother customer experience.
Screenshots
In the Consumer's EDC-UI, the transfer process keeps showing rotating bars indefinitely, as here for a transfer that was started 20 days ago and has since been terminated by the Provider.
Workaround idea
As a workaround, the Consumer could treat a transfer as terminated if the transfer process shows no state change for a certain period of time x.
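The timeout idea could look roughly like the sketch below. It is only an illustration in plain Python and does not use the EDC API: `TransferView`, `effective_state`, the state strings, and the one-hour timeout are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical state names; not taken from the EDC code base.
REQUESTED = "REQUESTED"
TERMINATED = "TERMINATED"


@dataclass
class TransferView:
    """Consumer-side view of a transfer process (hypothetical structure)."""
    transfer_id: str
    state: str
    state_changed_at: datetime


def effective_state(transfer: TransferView, now: datetime, timeout: timedelta) -> str:
    """Return the state to display.

    A transfer that has stayed in REQUESTED for longer than `timeout`
    is treated as terminated; every other transfer keeps its own state.
    """
    stuck = transfer.state == REQUESTED and now - transfer.state_changed_at > timeout
    return TERMINATED if stuck else transfer.state


# Arbitrary fixed timestamp so the example is deterministic.
now = datetime(2024, 1, 21, tzinfo=timezone.utc)
stale = TransferView("t-1", REQUESTED, now - timedelta(days=20))
fresh = TransferView("t-2", REQUESTED, now - timedelta(minutes=5))

print(effective_state(stale, now, timedelta(hours=1)))  # TERMINATED
print(effective_state(fresh, now, timedelta(hours=1)))  # REQUESTED
```

This only changes what the Consumer displays or assumes. It does not fix the failed delivery of the TransferTerminationMessage, and the value of x would need to be long enough not to cut off transfers that are merely slow to start.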
Stakeholders
@ip312 @jkbquabeck
@SebastianOpriel @AbdullahMuk Who can work on the workaround and how much time will the workaround take?
@ununhexium ?
@ununhexium any status update here?
@SebastianOpriel Nothing new, I haven't touched this issue for months.