
Error with TransferTerminationMessage when terminated by Provider

Open tmberthold opened this issue 2 years ago • 4 comments

Description - What happened?

When the Provider aborts a transfer because of an internal error, sending the TransferTerminationMessage fails with a follow-up error. As a result, the EDC-UI of the Consumer keeps showing rotating loading bars for the transfer process indefinitely.

Expected Behavior

A TransferTerminationMessage sent by the Provider should also terminate the transfer process on the Consumer side (including graphically in the UI) after the Provider has canceled the transfer because of an internal error.

Observed Behavior

If the Provider immediately cancels a data transfer requested by a Consumer because it encounters an internal error in its setup, such as an unreachable data plane or a failure to build the HttpDataSource, the Provider notifies the Consumer of the termination with a TransferTerminationMessage. However, while sending the message, the Provider also logs a failure: "404 - Transferprocess with corellationId ... not found". As a consequence, the Consumer remains in the Requested state for the transfer, even though the Provider has long been in the Terminated state.

It is unclear whether the 404 response comes from the Consumer in reply to the termination message from the Provider, or whether it is another internal error on the Provider itself.
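To help narrow down the first possibility, here is a toy model of how a Consumer-side 404 could leave the transfer stuck. It is only a sketch and not EDC code: `ConsumerStore`, `handle_termination`, and the IDs are hypothetical names, and it assumes the Consumer looks up the transfer by the correlation ID carried in the termination message.

```python
from dataclasses import dataclass, field

REQUESTED = "REQUESTED"
TERMINATED = "TERMINATED"


@dataclass
class ConsumerStore:
    """Hypothetical in-memory store of Consumer-side transfer processes."""
    states: dict = field(default_factory=dict)  # correlation id -> state

    def handle_termination(self, correlation_id: str) -> int:
        """Return an HTTP-like status code for an incoming termination message."""
        if correlation_id not in self.states:
            # Unknown id: answer 404 and leave every stored state untouched.
            return 404
        self.states[correlation_id] = TERMINATED
        return 200


store = ConsumerStore({"consumer-tp-1": REQUESTED})

# The Provider sends an id the Consumer does not know (assumed mismatch).
status_unknown = store.handle_termination("provider-tp-9")
state_after_unknown = store.states["consumer-tp-1"]

# With the matching id, the termination is applied.
status_known = store.handle_termination("consumer-tp-1")
state_after_known = store.states["consumer-tp-1"]
```

If the real behavior matches this model, the Consumer logs or the raw HTTP exchange should show the 404 being returned by the Consumer; if they do not, the error is more likely raised inside the Provider.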

Steps to Reproduce

There are no real steps to reproduce this, because it must be preceded by an internal error on the Provider's side when initiating a transfer that was requested by a Consumer. It is therefore a follow-up error.

Context Information

Additional Findings

Occurred in two independent transfer terminations on the provider side:

  1. Provider: Internal error because Dataplane could not be found -> Provider termination -> Consumer stays in Requested
  2. Provider: Internal error when processing the HttpDataSource -> Provider termination -> Consumer stays in Requested

Hypothesis / Possible Root Cause

  • SO hints that the bug could be caused by an issue in the state machine.
    • The Requested state has ID 500, which is unrelated to HTTP status codes
    • An error 500 could therefore also come from the state machine
    • It is one of our EDCs - we can start debugging

Timeline / Priority

We want to fix this issue before the launch of MDS 2.1 so that on-premise customers of MDS do not have to do a second update at a later stage; this will ensure a smoother customer experience.

Screenshots

The EDC-UI of the Consumer keeps showing the rotating bars indefinitely, as in this example of a transfer that was started 20 days ago and has since been terminated by the Provider. image

Workaround idea

As a workaround, the Consumer could consider a transfer terminated if its Transferprocess has not changed status for a certain amount of time x.
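A minimal sketch of that timeout check, assuming the Consumer knows when the transfer last changed state. The function name `effective_state`, the state strings, and the one-hour timeout are illustrative assumptions, not part of EDC.

```python
from datetime import datetime, timedelta, timezone


def effective_state(state, last_change, now, timeout):
    """Treat a transfer stuck in REQUESTED for longer than `timeout` as terminated."""
    if state == "REQUESTED" and now - last_change > timeout:
        return "TERMINATED"
    return state


# Fixed reference time so the example is deterministic.
now = datetime(2024, 1, 29, tzinfo=timezone.utc)
timeout = timedelta(hours=1)  # the "x" from the workaround, chosen arbitrarily

# A transfer requested 20 days ago with no state change is considered terminated.
stale = effective_state("REQUESTED", now - timedelta(days=20), now, timeout)

# A transfer requested 5 minutes ago is left alone.
fresh = effective_state("REQUESTED", now - timedelta(minutes=5), now, timeout)
```

The check only reinterprets the state for display or cleanup. It does not resolve the underlying mismatch between Provider and Consumer, and x has to be long enough that slow but healthy transfers are not cut off.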

Stakeholders

@ip312 @jkbquabeck

tmberthold avatar Jan 09 '24 07:01 tmberthold

@SebastianOpriel @AbdullahMuk Who can work on the workaround and how much time will the workaround take?

jkbquabeck avatar Apr 11 '24 11:04 jkbquabeck

@ununhexium ?

SebastianOpriel avatar Apr 11 '24 11:04 SebastianOpriel

@ununhexium any status update here?

SebastianOpriel avatar Oct 06 '24 08:10 SebastianOpriel

@SebastianOpriel Nothing new, I haven't touched this issue for months.

ununhexium avatar Oct 07 '24 11:10 ununhexium