bdk
bdk copied to clipboard
`wallet_esplora_async` returns error 429 while scanning
Describe the bug
Running wallet_esplora_async we panic due to blockstream.info replying with error code 429 to us (too many requests)
To Reproduce
cargo run in bdk/example-crates/wallet_esplora_async produces the following:
Scanning keychain [External] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 Error: Reqwest(reqwest::Error { kind: Status(429), url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("blockstream.info")), port: None, path: "/testnet/api/scripthash/aaaf78825de8d6cce8fc0865688238e8169f64967f102c10db8112a806ff9713/txs", query: None, fragment: None } })
Build environment
- BDK tag/commit: fda8c754538eb078419a1f924fe908fe3532e8a3
Last month, Aug 2023, I was playing with BDK alpha-1 and I had this problem with blockstream, see https://github.com/realeinherjar/sweepr/issues/11.
A safe solution is to add a 250ms delay between as suggested in https://github.com/Blockstream/esplora/issues/449.
Additionally, a candidate solution for the async clients could be tower::Service
From: https://github.com/seanmonstar/reqwest/issues/491#issuecomment-1475753864
You can use tower::ServiceBuilder to add rate-limiting to any request-like functionality and returns a Future.
Why do we panic? I'd expect it to return an error.
@realeinherjar I guess this solution won't work for the blocking client which can also trigger this. How long does it take after getting 429 before you can make another request? It seems like not long from testing so I think we can safely just handle the 429 by slowing down when we receive it. Note it seems like that the async esplora code is somehow much faster than the blocking one so it hits the limit with a lower number of parallel requests. I'm not sure why, maybe reqwest is better at maintaining idle connections. I've checked that it isn't because of http2 using tracing-subscriber (it isn't using it). I wonder why we aren't using http2 here does esplora not support it?
But in conclusion I think we should just slow down on 429. Sleep for a little and decrement the number of parallel requests we're doing. This should be done on both the blocking and non-blocking versions.
But in conclusion I think we should just slow down on
429
I don't like to hardcode wait time, if I have a personal Esplora server that I can DoS with API calls I should be able to do so.
Hence, we should expose the API with an extra argument delay that takes a usize in miliseconds to wait between [async] API calls.
We could even add a "default value" of 250ms with Option<usize> and unwrap_or(250).
Here's where it happens in blocking:
https://github.com/bitcoindevkit/bdk/blob/59fc1b341ba94212b2ea3e888e51fb6fcd00589f/crates/esplora/src/blocking_ext.rs#L213-L239
and async
https://github.com/bitcoindevkit/bdk/blob/59fc1b341ba94212b2ea3e888e51fb6fcd00589f/crates/esplora/src/async_ext.rs#L225-L249
The specs for 429 HTTP code suggests including a Retry-After header. If esplora has this, we can use it, otherwise we can have a decaying delay.
Im seeing a 429 as well using the swift-bdk library.
Esplora(message: "Esplora client error: HttpResponse(429)")
I assume this is the same issue as above?
Im simply doing this:
let esploraConfig = EsploraConfig(baseUrl: "https://blockstream.info/testnet/api", proxy: nil, concurrency: 1, stopGap: 10, timeout: nil)
let blockchainConfig = BlockchainConfig.esplora(config: esploraConfig)
do {
let blockchain = try Blockchain(config: blockchainConfig)
let desc = try Descriptor(descriptor: descriptor, network: Network.testnet)
let wallet = try Wallet(descriptor: desc, changeDescriptor: nil, network: Network.testnet, databaseConfig: db)
self.syncWallet = SyncWallet(db: db, blockchain: blockchain, wallet: wallet)
Yes, especially when using blockstream....
Will https://github.com/bitcoindevkit/rust-esplora-client/pull/58 make it easier to handle these in a more graceful way or do we need full parsing of the returned messages as proposed in https://github.com/bitcoindevkit/rust-esplora-client/issues/47?
I plan to work on this as well. I like @evanlinjin's idea of:
Retry-Afterheader <...> otherwise we can have a decaying delay.
The specs for
429HTTP code suggests including aRetry-Afterheader. If esplora has this, we can use it, otherwise we can have a decaying delay.
No Retry-After header:
Error: HttpResponse { status: 429, message: "Test", headers: ["server", "date", "content-type", "content-length", "access-control-allow-origin", "via", "alt-svc"] }
Ignore the "Test".
It was a hack
Yep, this is preventing me from launching on mainnet, let me know how I can assist :)
Yep, this is preventing me from launching on mainnet, let me know how I can assist :)
One thing that I was struggling was to get the time to retry-after in the 429 response.
I tried reading all headers both in ureq and reqwest but I couldn't find it.
Another thing was to implement a timeout that does not add a bunch of deps (like tokio) and does not break WASM.
Finally, exit conditions. I was thinking on having a max_tries with 5 as the default value.
Have we tried just reducing the number of parallel requests on 429? I think this could fix it while still approaching the optimal speed.
I tested reducing concurrency to 1 thread a few months ago and still got the 429. Thinking an exponential backoff might be better?
FYI
LWK is just waiting 2^(#times_it_returns_429-1) and it reliably scan on rating limited publics server such as blockstream.info and mempool.space https://github.com/Blockstream/lwk/blob/0533774f8d9b3fef2d579ad5b9bdd497ffa74165/lwk_wollet/src/clients/esplora_wasm_client.rs#L511
fixed by bitcoindevkit/rust-esplora-client#98
Still need to publish 0.10.0 version of rust-esplora-client and then update bdk_esplora to use it.