bdk icon indicating copy to clipboard operation
bdk copied to clipboard

`wallet_esplora_async` returns error 429 while scanning

Open danielabrozzoni opened this issue 2 years ago • 13 comments
trafficstars

Describe the bug
Running wallet_esplora_async we panic due to blockstream.info replying with error code 429 to us (too many requests)

To Reproduce
cargo run in bdk/example-crates/wallet_esplora_async produces the following:

Scanning keychain [External] 1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20  21  22  23  24 Error: Reqwest(reqwest::Error { kind: Status(429), url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("blockstream.info")), port: None, path: "/testnet/api/scripthash/aaaf78825de8d6cce8fc0865688238e8169f64967f102c10db8112a806ff9713/txs", query: None, fragment: None } })

Build environment

  • BDK tag/commit: fda8c754538eb078419a1f924fe908fe3532e8a3

danielabrozzoni avatar Sep 12 '23 14:09 danielabrozzoni

Last month, Aug 2023, I was playing with BDK alpha-1 and I had this problem with blockstream, see https://github.com/realeinherjar/sweepr/issues/11.

A safe solution is to add a 250ms delay between as suggested in https://github.com/Blockstream/esplora/issues/449.

Additionally, a candidate solution for the async clients could be tower::Service

From: https://github.com/seanmonstar/reqwest/issues/491#issuecomment-1475753864

You can use tower::ServiceBuilder to add rate-limiting to any request-like functionality and returns a Future.

realeinherjar avatar Sep 12 '23 14:09 realeinherjar

Why do we panic? I'd expect it to return an error.

@realeinherjar I guess this solution won't work for the blocking client which can also trigger this. How long does it take after getting 429 before you can make another request? It seems like not long from testing so I think we can safely just handle the 429 by slowing down when we receive it. Note it seems like that the async esplora code is somehow much faster than the blocking one so it hits the limit with a lower number of parallel requests. I'm not sure why, maybe reqwest is better at maintaining idle connections. I've checked that it isn't because of http2 using tracing-subscriber (it isn't using it). I wonder why we aren't using http2 here does esplora not support it?

But in conclusion I think we should just slow down on 429. Sleep for a little and decrement the number of parallel requests we're doing. This should be done on both the blocking and non-blocking versions.

LLFourn avatar Sep 13 '23 00:09 LLFourn

But in conclusion I think we should just slow down on 429

I don't like to hardcode wait time, if I have a personal Esplora server that I can DoS with API calls I should be able to do so.

Hence, we should expose the API with an extra argument delay that takes a usize in miliseconds to wait between [async] API calls. We could even add a "default value" of 250ms with Option<usize> and unwrap_or(250).

Here's where it happens in blocking: https://github.com/bitcoindevkit/bdk/blob/59fc1b341ba94212b2ea3e888e51fb6fcd00589f/crates/esplora/src/blocking_ext.rs#L213-L239

and async https://github.com/bitcoindevkit/bdk/blob/59fc1b341ba94212b2ea3e888e51fb6fcd00589f/crates/esplora/src/async_ext.rs#L225-L249

realeinherjar avatar Sep 13 '23 16:09 realeinherjar

The specs for 429 HTTP code suggests including a Retry-After header. If esplora has this, we can use it, otherwise we can have a decaying delay.

evanlinjin avatar Sep 15 '23 02:09 evanlinjin

Im seeing a 429 as well using the swift-bdk library.

Esplora(message: "Esplora client error: HttpResponse(429)")

I assume this is the same issue as above?

Im simply doing this:

        let esploraConfig = EsploraConfig(baseUrl: "https://blockstream.info/testnet/api", proxy: nil, concurrency: 1, stopGap: 10, timeout: nil)
        
        let blockchainConfig = BlockchainConfig.esplora(config: esploraConfig)
        
        do {
            
            let blockchain = try Blockchain(config: blockchainConfig)
            let desc = try Descriptor(descriptor: descriptor, network: Network.testnet)
            let wallet = try Wallet(descriptor: desc, changeDescriptor: nil, network: Network.testnet, databaseConfig: db)
            
            self.syncWallet = SyncWallet(db: db, blockchain: blockchain, wallet: wallet)

ismyhc avatar Sep 28 '23 20:09 ismyhc

Yes, especially when using blockstream....

realeinherjar avatar Sep 29 '23 20:09 realeinherjar

Will https://github.com/bitcoindevkit/rust-esplora-client/pull/58 make it easier to handle these in a more graceful way or do we need full parsing of the returned messages as proposed in https://github.com/bitcoindevkit/rust-esplora-client/issues/47?

notmandatory avatar Oct 01 '23 22:10 notmandatory

I plan to work on this as well. I like @evanlinjin's idea of:

Retry-After header <...> otherwise we can have a decaying delay.

realeinherjar avatar Oct 01 '23 22:10 realeinherjar

The specs for 429 HTTP code suggests including a Retry-After header. If esplora has this, we can use it, otherwise we can have a decaying delay.

No Retry-After header:

Error: HttpResponse { status: 429, message: "Test", headers: ["server", "date", "content-type", "content-length", "access-control-allow-origin", "via", "alt-svc"] }

Ignore the "Test". It was a hack

realeinherjar avatar Oct 08 '23 10:10 realeinherjar

Yep, this is preventing me from launching on mainnet, let me know how I can assist :)

chrisguida avatar Nov 20 '23 23:11 chrisguida

Yep, this is preventing me from launching on mainnet, let me know how I can assist :)

One thing that I was struggling was to get the time to retry-after in the 429 response. I tried reading all headers both in ureq and reqwest but I couldn't find it. Another thing was to implement a timeout that does not add a bunch of deps (like tokio) and does not break WASM. Finally, exit conditions. I was thinking on having a max_tries with 5 as the default value.

realeinherjar avatar Nov 21 '23 08:11 realeinherjar

Have we tried just reducing the number of parallel requests on 429? I think this could fix it while still approaching the optimal speed.

LLFourn avatar Nov 22 '23 01:11 LLFourn

I tested reducing concurrency to 1 thread a few months ago and still got the 429. Thinking an exponential backoff might be better?

chrisguida avatar Nov 30 '23 22:11 chrisguida

FYI

LWK is just waiting 2^(#times_it_returns_429-1) and it reliably scan on rating limited publics server such as blockstream.info and mempool.space https://github.com/Blockstream/lwk/blob/0533774f8d9b3fef2d579ad5b9bdd497ffa74165/lwk_wollet/src/clients/esplora_wasm_client.rs#L511

RCasatta avatar Sep 05 '24 14:09 RCasatta

fixed by bitcoindevkit/rust-esplora-client#98

notmandatory avatar Sep 24 '24 23:09 notmandatory

Still need to publish 0.10.0 version of rust-esplora-client and then update bdk_esplora to use it.

notmandatory avatar Sep 25 '24 02:09 notmandatory