atomicDEX-API icon indicating copy to clipboard operation
atomicDEX-API copied to clipboard

feat(utxo): prioritize electrum connections

Open rozhkovdmitrii opened this issue 1 year ago • 5 comments

This PR refactors the electrum part of the RPC clients. And also introduces a new feature related to connection management decisions.

Within the electrum enable request, one could now specify two parameters (min_connected & max_connected).

  • min_connected: KDF needs to keep at least min_connected working active (maintained) connections at all times.
  • max_connected: KDF needs to keep at most max_connected working active (maintained) connections at any time.
  • To achieve the selective/single-server effect set these parameters to (1, 1)
  • To achieve the multiple/all-servers effect, set these parameters to (1, len(servers)) (or don't set at all, as this is used as the default)
    • The connection manager will initially try to connect to a maximum of len(servers) connections and keep them maintained. As some connections get lost for whatever reason, they will not be retried right away. Retries only take place:
      • every 10 minutes (should we decrease?)
      • if we fall below min_connected, which is 1 in this case.
    • So one could use (10, 68) for instance to make sure we are connected to at least 10 servers at all times, and if we ever fall below 10 we fire another round of reconnections. After this round we are guaranteed to have <= 68 connections.
      • e.g. we have 10 connections now, we lose 1 and have 9 now. A round of reconnections start and we end up with some number of active connections between 9 (assuming only our 9 connected servers were the non-faulty ones) and 68. If after the reconnections round we still fall below 10=min_connected connections, we keep retrying.
  • These parameters must satisfy: 0 < min_connected <= max_connected ~<= len(servers)~

The connection manager maintains these maintained (working active) connections and these are the ones used for any electrum request (they are all used concurrently and whichever returns the success result first is considered the response for this request).

For requests directed for specific servers (like server_version, get_block_count_from, etc...), the connections to these servers are established first if they are not in the maintained servers set and then the request is performed. So even if the server isn't maintained/active, it could still be queried.

The servers order in the electrum enable request encode an implicit priority (previously in this PR, was encoded as primary & secondary). Servers that appear first in the list are assumed to be more robust and thus have higher priority compared to servers after.

Since we do an altruistic round of retries every 10 minutes, if we ever had the case were a high priority server was down and thus wasn't maintained by us and now is back online, we will prefer maintaining it over another already-maintained low priority server (this is assuming the maintained servers set is full, i.e. len(maintained server) == min_connected). Simply put, high priority servers will kick out low priority ones when they come back online if we have no room to fit both.

Addresses #791

rozhkovdmitrii avatar Sep 11 '23 11:09 rozhkovdmitrii

related https://github.com/KomodoPlatform/komodo-defi-framework/pull/2013#discussion_r1415489660 The below is from a DM with @onur-ozkan

The problem is with resubscription where we do it if any electrum reconnects, so if an electrum at the end of the list reconnects we send subscription request again to the first electrum. We might have subscriptions to multiple electrums unintentionally and get multiple notifications for the same balance change.

shamardy avatar Dec 05 '23 12:12 shamardy

related #2013 (comment) The below is from a DM with @onur-ozkan

The problem is with resubscription where we do it if any electrum reconnects, so if an electrum at the end of the list reconnects we send subscription request again to the first electrum. We might have subscriptions to multiple electrums unintentionally and get multiple notifications for the same balance change.

The problem is that when one Electrum server reconnects, it triggers the subscription process again, even if it's not necessary or there's already an active subscription. This can lead to unintentionally subscribing to multiple servers and getting the same notifications multiple times for the same change in balance. To fix this, I've added logics to keep track of subscriptions more carefully, so we only subscribe to what we need. This way, we can avoid getting too many notifications for the same thing.

cc. @shamardy @onur-ozkan

borngraced avatar Feb 06 '24 15:02 borngraced

@borngraced Could you please review my last commit. It's a straight refactor of the two manager: basically moving the methods around and putting methods in the structs they access.

I still think the selective manager will need a logical refactor though. This refactor hasn't touched on this.

mariocynicys avatar Feb 19 '24 08:02 mariocynicys

@borngraced Could you please review my last commit. It's a straight refactor of the two manager: basically moving the methods around and putting methods in the structs they access.

I still think the selective manager will need a logical refactor though. This refactor hasn't touched on this.

Thank you for the great work! I've careful reviewed the refactoring and it looks good and better! please can you move the consts in common to the top just after import statements @mariocynicys

borngraced avatar Feb 20 '24 06:02 borngraced

@mariocynicys there seems to be a problem in electrum connections in wasm where the wasm tests that use electrums are failing https://github.com/KomodoPlatform/komodo-defi-framework/actions/runs/9521530304/job/26525040016#step:8:706

shamardy avatar Jun 21 '24 16:06 shamardy

@cipig I think you should start testing changes in this PR for the electrum changes. Without changing the requests / responses for activating coins it should work as before but a bit optimized :) @mariocynicys please correct me if I am wrong related to the multiple connections mode. It should work as before without changing anything in coin activations, right?

shamardy avatar Oct 03 '24 11:10 shamardy

i used this on a market maker node at home... internet got disconnected and reconnected after 10 minutes or so... all electrum connections were reestablished and my_balance worked (it didn't before this change)

also built Desktop version with this... and found something... some coins couldn't be enabled (electrum server down or ssl certs expired), but mm2 still tries to talk to them every 5 seconds, "on getting balance" and "on scripthash_get_history"

· 2024-10-05 13:58:21 +0000 [tx_history coin=BLOCX] utxo_common:3238] Error utxo_common:218] Transport("JsonRpcError { client_info: \"coin: BLOCX\", request: JsonRpcBatchRequest([JsonRpcRequest { jsonrpc: \"2.0\", id: 137, method: \"blockchain.scripthash.get_balance\", params: [String(\"225b1c0df4720e3a80d7515a22dd0e1d7072502739f179867656f7411b1285cb\")] }, JsonRpcRequest { jsonrpc: \"2.0\", id: 138, method: \"blockchain.scripthash.get_balance\", params: [String(\"282d79e745d6372db531c9a111837887a42cefc72f3ff2b137506932c4f7eb94\")] }]), error: Internal(\"All servers errored: [(electrum1.blocx.live:50002, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\\\\\\\")\\\")), (electrum2.blocx.live:50002, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\\\\\\\")\\\")), (electrum3.blocx.live:50002, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\\\\\\\")\\\"))]\") }") on getting balance
· 2024-10-05 13:58:22 +0000 [tx_history coin=BLOCX] utxo_common:3263] utxo_common:3499] Error All servers errored: [(electrum1.blocx.live:50002, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\")")), (electrum2.blocx.live:50002, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\")")), (electrum3.blocx.live:50002, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\")"))] on scripthash_get_history, retrying
· 2024-10-05 13:58:23 +0000 [tx_history coin=AIPG] utxo_common:3238] Error utxo_common:218] Transport("JsonRpcError { client_info: \"coin: AIPG\", request: JsonRpcBatchRequest([JsonRpcRequest { jsonrpc: \"2.0\", id: 137, method: \"blockchain.scripthash.get_balance\", params: [String(\"225b1c0df4720e3a80d7515a22dd0e1d7072502739f179867656f7411b1285cb\")] }, JsonRpcRequest { jsonrpc: \"2.0\", id: 138, method: \"blockchain.scripthash.get_balance\", params: [String(\"282d79e745d6372db531c9a111837887a42cefc72f3ff2b137506932c4f7eb94\")] }]), error: Internal(\"All servers errored: [(electrumx1.aipowergrid.io:50002, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(UnknownIssuer) }\\\\\\\")\\\")), (electrumx2.aipowergrid.io:50002, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(UnknownIssuer) }\\\\\\\")\\\"))]\") }") on getting balance
· 2024-10-05 13:58:24 +0000 [tx_history coin=AIPG] utxo_common:3263] utxo_common:3499] Error All servers errored: [(electrumx1.aipowergrid.io:50002, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(UnknownIssuer) }\")")), (electrumx2.aipowergrid.io:50002, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(UnknownIssuer) }\")"))] on scripthash_get_history, retrying
· 2024-10-05 13:58:27 +0000 [tx_history coin=VPRM] utxo_common:3238] Error utxo_common:218] Transport("JsonRpcError { client_info: \"coin: VPRM\", request: JsonRpcBatchRequest([JsonRpcRequest { jsonrpc: \"2.0\", id: 150, method: \"blockchain.scripthash.get_balance\", params: [String(\"225b1c0df4720e3a80d7515a22dd0e1d7072502739f179867656f7411b1285cb\")] }, JsonRpcRequest { jsonrpc: \"2.0\", id: 151, method: \"blockchain.scripthash.get_balance\", params: [String(\"282d79e745d6372db531c9a111837887a42cefc72f3ff2b137506932c4f7eb94\")] }]), error: Internal(\"All servers errored: [(electrumx1.vaporumcoin.us:50001, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Os { code: 111, kind: ConnectionRefused, message: \\\\\\\\\\\\\\\"Connection refused\\\\\\\\\\\\\\\" }\\\\\\\")\\\"))]\") }") on getting balance
· 2024-10-05 13:58:27 +0000 [tx_history coin=VPRM] utxo_common:3263] utxo_common:3499] Error All servers errored: [(electrumx1.vaporumcoin.us:50001, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Os { code: 111, kind: ConnectionRefused, message: \\\"Connection refused\\\" }\")"))] on scripthash_get_history, retrying
· 2024-10-05 13:58:32 +0000 [tx_history coin=BLOCX] utxo_common:3238] Error utxo_common:218] Transport("JsonRpcError { client_info: \"coin: BLOCX\", request: JsonRpcBatchRequest([JsonRpcRequest { jsonrpc: \"2.0\", id: 141, method: \"blockchain.scripthash.get_balance\", params: [String(\"225b1c0df4720e3a80d7515a22dd0e1d7072502739f179867656f7411b1285cb\")] }, JsonRpcRequest { jsonrpc: \"2.0\", id: 142, method: \"blockchain.scripthash.get_balance\", params: [String(\"282d79e745d6372db531c9a111837887a42cefc72f3ff2b137506932c4f7eb94\")] }]), error: Internal(\"All servers errored: [(electrum1.blocx.live:50002, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\\\\\\\")\\\")), (electrum2.blocx.live:50002, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\\\\\\\")\\\")), (electrum3.blocx.live:50002, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\\\\\\\")\\\"))]\") }") on getting balance
· 2024-10-05 13:58:33 +0000 [tx_history coin=BLOCX] utxo_common:3263] utxo_common:3499] Error All servers errored: [(electrum1.blocx.live:50002, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\")")), (electrum2.blocx.live:50002, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\")")), (electrum3.blocx.live:50002, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(Expired) }\")"))] on scripthash_get_history, retrying
· 2024-10-05 13:58:34 +0000 [tx_history coin=AIPG] utxo_common:3238] Error utxo_common:218] Transport("JsonRpcError { client_info: \"coin: AIPG\", request: JsonRpcBatchRequest([JsonRpcRequest { jsonrpc: \"2.0\", id: 141, method: \"blockchain.scripthash.get_balance\", params: [String(\"225b1c0df4720e3a80d7515a22dd0e1d7072502739f179867656f7411b1285cb\")] }, JsonRpcRequest { jsonrpc: \"2.0\", id: 142, method: \"blockchain.scripthash.get_balance\", params: [String(\"282d79e745d6372db531c9a111837887a42cefc72f3ff2b137506932c4f7eb94\")] }]), error: Internal(\"All servers errored: [(electrumx1.aipowergrid.io:50002, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(UnknownIssuer) }\\\\\\\")\\\")), (electrumx2.aipowergrid.io:50002, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(UnknownIssuer) }\\\\\\\")\\\"))]\") }") on getting balance
· 2024-10-05 13:58:35 +0000 [tx_history coin=AIPG] utxo_common:3263] utxo_common:3499] Error All servers errored: [(electrumx1.aipowergrid.io:50002, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(UnknownIssuer) }\")")), (electrumx2.aipowergrid.io:50002, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Custom { kind: InvalidData, error: InvalidCertificate(UnknownIssuer) }\")"))] on scripthash_get_history, retrying
· 2024-10-05 13:58:37 +0000 [tx_history coin=VPRM] utxo_common:3238] Error utxo_common:218] Transport("JsonRpcError { client_info: \"coin: VPRM\", request: JsonRpcBatchRequest([JsonRpcRequest { jsonrpc: \"2.0\", id: 153, method: \"blockchain.scripthash.get_balance\", params: [String(\"225b1c0df4720e3a80d7515a22dd0e1d7072502739f179867656f7411b1285cb\")] }, JsonRpcRequest { jsonrpc: \"2.0\", id: 154, method: \"blockchain.scripthash.get_balance\", params: [String(\"282d79e745d6372db531c9a111837887a42cefc72f3ff2b137506932c4f7eb94\")] }]), error: Internal(\"All servers errored: [(electrumx1.vaporumcoin.us:50001, Transport(\\\"Failed to establish connection: Temporary(\\\\\\\"Couldn't connect to the electrum server: Os { code: 111, kind: ConnectionRefused, message: \\\\\\\\\\\\\\\"Connection refused\\\\\\\\\\\\\\\" }\\\\\\\")\\\"))]\") }") on getting balance
· 2024-10-05 13:58:37 +0000 [tx_history coin=VPRM] utxo_common:3263] utxo_common:3499] Error All servers errored: [(electrumx1.vaporumcoin.us:50001, Transport("Failed to establish connection: Temporary(\"Couldn't connect to the electrum server: Os { code: 111, kind: ConnectionRefused, message: \\\"Connection refused\\\" }\")"))] on scripthash_get_history, retrying

this 3 coins (VPRM, AIPG, BLOCX) don't show up on portfolio page, but mm2 still tries to talk to their electrums is this done intentionally (waiting for the electrums to come back/certs renewed), or is it a bug? if intentionally, then we should maybe increase the gap between requests to something > 5 seconds, else it could fill up the logs quickly

EDIT i see the requests are all from tx_history

cipig avatar Oct 05 '24 14:10 cipig

found something else in Desktop: on start, some coins are not enabled automatically, eg LBC and VRSC when i enable them manually later, i see this log:

[14:51:11] [debug] [kdf.service.cpp:1650] [151946]: error: bad answer json for enable/electrum details: {
    "error": "rpc:183] dispatcher_legacy:141] lp_commands_legacy:140] lp_coins:4587] Coin VRSC already initialized"
}

but the good thing is that the coin is then visible in Desktop and can be used

retried... it's always the same coins that are not enabled, VRSC and LBC... idk what the difference to other coins is... maybe timing related, maybe something else but more coins are affected, maybe 10 or so, i just don't know which those are since i have 280 coins enabled and need to find out which ones are missing :-)

cipig avatar Oct 05 '24 15:10 cipig

seeing this in logs on a mm node:

06 02:40:33, coins::utxo::rpc_clients::electrum_rpc::client:314] WARN Failed to send the request using active connections, trying all connections.
06 02:40:33, coins::utxo::rpc_clients::electrum_rpc::client:314] WARN Failed to send the request using active connections, trying all connections.
06 02:40:33, coins::utxo::rpc_clients::electrum_rpc::client:314] WARN Failed to send the request using active connections, trying all connections.

can you please add the name of the affected coin in this log? looks like it's VIA, since i get this error on my_balance

curl --insecure --url "http://127.0.0.1:7783" --data "{\"userpass\":\"$userpass\",\"method\":\"my_balance\",\"coin\":\"VIA\"}" && echo ""
{"error":"rpc:183] dispatcher_legacy:141] lp_commands_legacy:233] utxo_common:218] Transport: JsonRpcError { client_info: \"coin: VIA\", request: JsonRpcBatchRequest([JsonRpcRequest { jsonrpc: \"2.0\", id: 6956, method: \"blockchain.scripthash.get_balance\", params: [String(\"c52556248f479dc03f2ade37cbfbd61b0f40fd59bb0c8eede899bc8833ba30fb\")] }, JsonRpcRequest { jsonrpc: \"2.0\", id: 6957, method: \"blockchain.scripthash.get_balance\", params: [String(\"a66de336dcbb06d57247552cb75aca20caf38ec1ef8fca9b894f24c477637e37\")] }]), error: Internal(\"All servers errored: []\") }"}

even though the electrumx is fine... it's just one though (curl --url "http://127.0.0.1:7783" --data "{\"method\":\"electrum\",\"coin\":\"VIA\",\"servers\":[{\"url\":\"electrum3.cipig.net:10073\"}],\"userpass\":\"$userpass\"}") when i disable VIA and enable it back with same electrum, it works fine

same happened with PIVX... also just one electrum and my_balance shows error... disable and enable fixes the problem

cipig avatar Oct 06 '24 02:10 cipig

are the changes in adex_cli required?

im adding the min & max connected there just in case if we re-use it again the thing works as expected.

mariocynicys avatar Oct 18 '24 13:10 mariocynicys