Close transaction dropped from cardano-node

Open ch1bo opened this issue 2 years ago • 6 comments

Context & versions

At least 0.12.0

Steps to reproduce

  1. Open a head on preprod
  2. Submit the Close websocket command
  3. See a transaction added to the cardano-node mempool
  4. Sometimes the transaction gets removed (upon seeing the next block) without it being included.

Actual behavior

The head is not getting closed and the Cardano network just dropped our transaction for this. No user feedback is given.

Expected behavior

The Cardano network should not drop our transaction, or the hydra client should at least be made aware of it (after some time).

Hypothesis

The transaction is dropped because the invalidAfter bound of the closeTx validity range has been exceeded.

ch1bo, Aug 23 '23 15:08

Some grooming notes:

  • Explore if there is a way of detecting a tx dropped from a mempool (or current tx state)
    • When the txs get dropped is there any log output?
    • Is there any endpoint that we can query to know the status of a tx?
    • Is there any configuration for cardano-node that controls when items get dropped from a mempool? Mempool size? Number of elements?

Note: There is:

 `cardano-cli query tx-mempool --mainnet info/next-tx/tx-exists myTxId` 
  • Potentially we could query to detect when a tx is in the mempool and when it gets dropped, so we can re-send it.
  • How would we test this?
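
The polling idea in the notes above could be sketched roughly as follows. This is only an illustration in Python, not Hydra code: `tx_in_mempool`, `tx_on_chain` and `resubmit` are hypothetical callables (for example, thin wrappers around the mempool query mentioned above and some chain lookup). Keeping the decision in a pure function is one possible answer to the testing question, since it can be driven with scripted observations.

```python
from enum import Enum
from typing import Callable


class TxStatus(Enum):
    UNSEEN = "unseen"        # never observed in the mempool or on chain
    PENDING = "pending"      # currently in the local mempool
    CONFIRMED = "confirmed"  # currently observed in a block
    DROPPED = "dropped"      # seen earlier, now in neither place


def next_status(previous: TxStatus, in_mempool: bool, on_chain: bool) -> TxStatus:
    """Derive the new status from the previous one and a fresh observation."""
    if on_chain:
        return TxStatus.CONFIRMED
    if in_mempool:
        return TxStatus.PENDING
    # Visible nowhere: only call it dropped if it had been seen before.
    return TxStatus.UNSEEN if previous is TxStatus.UNSEEN else TxStatus.DROPPED


def watch(
    tx_id: str,
    tx_in_mempool: Callable[[str], bool],   # hypothetical mempool query
    tx_on_chain: Callable[[str], bool],     # hypothetical chain query
    resubmit: Callable[[str], None],        # hypothetical re-submission hook
    rounds: int,
) -> TxStatus:
    """Poll a fixed number of rounds, re-submitting whenever the tx is dropped."""
    status = TxStatus.UNSEEN
    for _ in range(rounds):
        status = next_status(status, tx_in_mempool(tx_id), tx_on_chain(tx_id))
        if status is TxStatus.DROPPED:
            resubmit(tx_id)
    return status
```

A real loop would sleep between rounds and stop once the transaction is confirmed deeply enough; both are left out here to keep the sketch short.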

v0d1ch, Sep 12 '23 12:09

We can observe the mempool using a specialised mini-protocol we could implement a client for, but this is somewhat involved. We could do something simpler using timeouts in the HeadLogic: when you request a Close, use a Wait to put an upper bound on how long you wait to observe the OnCloseTx from the chain?
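
To make the timeout idea concrete, here is a minimal sketch in Python. It does not mirror the actual HeadLogic types: `PendingClose`, `observe_close` and `on_tick` are hypothetical names, and the clock value is simply passed in by the caller.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PendingClose:
    requested_at: float      # seconds on any monotonic clock
    timeout: float           # upper bound on waiting for the close observation
    observed: bool = False   # set once the close tx is seen on chain


def observe_close(pending: PendingClose) -> PendingClose:
    """Record that the close transaction was observed on chain."""
    return replace(pending, observed=True)


def on_tick(pending: PendingClose, now: float) -> str:
    """Decide what to do on a clock tick while a close is outstanding."""
    if pending.observed:
        return "done"
    if now - pending.requested_at >= pending.timeout:
        return "timed-out"   # e.g. notify the client or post the tx again
    return "wait"
```

What happens on "timed-out" (retrying, or only informing the client) is left open here, as it is in the discussion.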

ghost, Sep 19 '23 13:09

> When you request a Close, use a Wait to have an upper bound on how long you're waiting for observing the OnCloseTx from the chain?

You mean some retrying logic in the HeadLogic?


I think we should just adjust the upper validity bound to be something more compatible with the network. This is the code which determines that upper bound: https://github.com/input-output-hk/hydra/blob/11dc6aeb68b909e070fc6eb366b4734e090b62c1/hydra-node/src/Hydra/Chain/Direct/Handlers.hs#L354-L358

Looking at that, I wonder whether this is really the issue we encounter. 200 seconds is long enough for mainnet. But then again, the contestation period is configurable and the default of 60 seconds might not be long enough (if no block is produced within 1 minute?): https://github.com/input-output-hk/hydra/blob/11dc6aeb68b909e070fc6eb366b4734e090b62c1/hydra-node/src/Hydra/Options.hs#L605-L606

We should validate whether it's due to upper validity bounds first.
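
As a rough plausibility check of the "no block within 1 minute" concern, one can estimate how often a window contains no block at all. The sketch below assumes 1-second slots and an active slot coefficient of 0.05 (the mainnet Shelley parameters, assumed here to apply to the network in question as well) and treats slots as independent; `prob_no_block` is a name made up for this example.

```python
def prob_no_block(
    window_seconds: int,
    active_slot_coeff: float = 0.05,   # assumed chance that a slot has a leader
    slot_length_seconds: int = 1,      # assumed slot length
) -> float:
    """Probability that every slot in the window is empty."""
    slots = window_seconds // slot_length_seconds
    return (1.0 - active_slot_coeff) ** slots


for window in (60, 200):
    print(f"{window:>3} s window: P(no block) = {prob_no_block(window):.6f}")
```

Under these assumptions a 60-second window is empty in roughly 4.6% of cases, while a 200-second window is empty in about 0.0035% of cases. This is only a lower bound on the chance of missing the window: a block being produced does not guarantee that the transaction reached that block producer in time.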

ch1bo, Sep 26 '23 15:09

It would be great to keep track of the block hash/height when this happens in order to validate the invalid upper bound hypothesis. In the meantime, I would like to suggest we address this issue by documenting this possible behaviour and letting client applications (e.g. hydra-tui or any frontend apps interacting with a hydra-node) decide what to do based on their perception of time.

ghost, Nov 17 '23 08:11

I think this appeared in this smoke test: https://github.com/input-output-hk/hydra/actions/runs/7088575057

ch1bo, Dec 04 '23 16:12

Happened to me today while closing a head with the team: the tx was successfully posted to the cardano-node but was later removed from the mempool. There's no trace in the node explaining why this happened.

abailly, May 08 '24 08:05

The Cardano After Dark team encountered this issue this week, FYI

rjharmon, Feb 19 '25 21:02

I think I have understood that slot battles can lead to mempool transactions being included in a block that gets reverted, and then are not automatically re-added. I'm curious if the same condition could occur on other head transactions.

It makes me think the node could benefit from a generational mempool, where transactions it believes are in a block are moved to the 'tentatively confirmed' pool {tx, blockId} or similar, and then if the block gets undone, any of those txs could be moved back to the regular mempool. With a tidying task to keep the old generations size-constrained. Of course that would be a node thing, not a Hydra thing...
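
A toy model of that generational idea, purely for illustration: this is not how cardano-node's mempool is implemented, and `GenerationalMempool` and its methods are invented names.

```python
from collections import OrderedDict
from typing import Iterable, Set


class GenerationalMempool:
    """Toy model: txs believed to be in a block are parked, not forgotten."""

    def __init__(self, max_generations: int = 3) -> None:
        self.pending: Set[str] = set()
        # block id -> tx ids tentatively confirmed by that block, oldest first
        self.tentative: "OrderedDict[str, Set[str]]" = OrderedDict()
        self.max_generations = max_generations

    def add(self, tx_id: str) -> None:
        self.pending.add(tx_id)

    def block_applied(self, block_id: str, tx_ids: Iterable[str]) -> None:
        included = self.pending & set(tx_ids)
        self.pending -= included
        self.tentative[block_id] = included
        # Tidying: generations beyond the limit are treated as settled.
        while len(self.tentative) > self.max_generations:
            self.tentative.popitem(last=False)

    def block_rolled_back(self, block_id: str) -> None:
        # Move the txs of the undone block back into the regular mempool.
        self.pending |= self.tentative.pop(block_id, set())
```

A real node would also have to re-validate the restored transactions against the ledger state after the rollback, which this sketch ignores.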

As for the Hydra agent, it could benefit by keeping track of the expected Closing utxo and continue polling every ~10-15s for its presence for perhaps 2-3 minutes (even if it gets an early indication of presence).
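
The point of polling past the first sighting is that an early "present" result can still be undone by a rollback. A small sketch, in which `utxo_present` is a hypothetical callable that checks for the expected closing UTxO and the defaults are just values picked from the ranges above:

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PollResult:
    present_at_end: bool   # was the UTxO there on the final poll?
    disappearances: int    # how often it went from present to absent


def poll_close_utxo(
    utxo_present: Callable[[], bool],   # hypothetical chain query
    interval_s: float = 12.0,
    window_s: float = 150.0,
    sleep: Callable[[float], None] = time.sleep,
) -> PollResult:
    """Poll for the whole window, even after the UTxO was first seen."""
    polls = max(1, int(window_s // interval_s))
    present = False
    disappearances = 0
    for i in range(polls):
        now_present = utxo_present()
        if present and not now_present:
            disappearances += 1
        present = now_present
        if i < polls - 1:
            sleep(interval_s)
    return PollResult(present, disappearances)
```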

Would there be a mechanism available for a client to get the Close tx CBOR from Hydra server? This way a third-party agent with its own poll/retry capabilities ✋ could take on the retries.

rjharmon, Feb 19 '25 21:02

> Would there be a mechanism available for a client to get the Close tx CBOR from Hydra server?

Not right now. While that would not be too hard to add, a third-party agent could already poll and retry with the current API. Just send Close commands until the head is closed - nothing bad can happen from it, you would just see errors if a close is already submitted and pending inclusion on the blockchain.
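
Such a retrying agent could look roughly like this. It is a sketch only: `send_close` stands for whatever sends the Close command over the websocket and `head_is_closed` for however the agent learns that the head was closed; both are hypothetical callables, and errors from an already pending close are assumed to be ignorable, as described above.

```python
from typing import Callable


def close_until_closed(
    send_close: Callable[[], None],       # hypothetical: sends the Close command
    head_is_closed: Callable[[], bool],   # hypothetical: has the close been observed?
    max_attempts: int = 10,
    wait: Callable[[], None] = lambda: None,   # e.g. sleep between attempts
) -> bool:
    """Re-send Close until the head is closed or attempts run out."""
    for _ in range(max_attempts):
        if head_is_closed():
            return True
        send_close()
        wait()
    return head_is_closed()
```

In practice `wait` would pause long enough for a block or two to be produced before the next attempt.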

ch1bo, Feb 24 '25 15:02