Timeout from setupNetworks is not propagated
I am setting up a polkadot network and overriding the ws timeout with the following script:
```ts
import { test } from 'bun:test'
import { setupNetworks } from '@acala-network/chopsticks-testing'

test('increased time-out', async () => {
  const { polkadot } = await setupNetworks({
    polkadot: {
      endpoint: 'ws://localhost:9944',
      port: 8000,
      timeout: 300_000_000,
    },
  })

  const sysEntries = await polkadot.api.query.system.account.entries()
  for (const [k, v] of sysEntries) {
    console.log('key: ', k)
  }
})
```
When I then try to fetch the `System.Account` entries, I get:

```
2025-03-25 17:57:46 RPC-CORE: queryStorageAt(keys: Vec<StorageKey>, at?: BlockHash): Vec<StorageChangeSet>:: -32603: Internal Error: No response received from RPC endpoint in 60s
```
Running chopsticks with `LOG_LEVEL=trace` shows additional information:

```
[17:57:46.723] ERROR (ws): Error handling request: 'Error: No response received from RPC endpoint in 60s
    at __internal__timeoutHandlers (/Users/ndk/parity/ahm-dryrun/node_modules/@polkadot/rpc-provider/ws/index.js:503:42)'
    app: "chopsticks"
```
I checked the `@polkadot/rpc-provider/ws` implementation, and 60s is the default value.
I double-checked, and the timeout that I overrode was actually applied:

```ts
console.log(polkadot.ws)
```

output:

```
WsProvider {
  ...
  __internal__timeout: 300000000,
  ...
}
```
I started digging deeper and got to the point where the provider used for fetching storage is created - link. It does not use the value that I set in the `setupNetworks` config.
How can I increase the RPC timeout for fetching storage? It fails systematically, so maybe it would be better to implement exponential back-off or a similar retry technique? @xlc @ermalkaleci
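For reference, the kind of retry wrapper I have in mind would look roughly like this. This is just a sketch to illustrate the idea; `withRetry` and its parameters are hypothetical names, not anything that exists in chopsticks today:

```typescript
// Hedged sketch of an exponential back-off retry helper.
// `withRetry` is a hypothetical name; chopsticks has nothing like it built in.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 1_000,
): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      // Delay doubles on each failed attempt: 1s, 2s, 4s, 8s, ...
      const delay = baseDelayMs * 2 ** attempt
      await new Promise((resolve) => setTimeout(resolve, delay))
    }
  }
  throw lastError
}
```

Usage would be something like `await withRetry(() => polkadot.api.query.system.account.entries())`, so a transient RPC timeout gets retried instead of failing the whole run.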
Firstly, that timeout applies to the chopsticks instance, not to the upstream RPC provider.
Secondly, I will say this is an XY problem.
Increasing the timeout is unlikely to help. The root cause is most likely that Chopsticks is making more RPC requests than the node can handle, and it ends up being unresponsive. The real root cause is that the Substrate node shouldn't become unresponsive at all: if it is truly overloaded and cannot handle the request, it should respond with an error code or reject the connection. I opened an issue here: https://github.com/paritytech/polkadot-sdk/issues/8035
On the Chopsticks side, we could try to reduce the batch size, and maybe that will help a bit: https://github.com/AcalaNetwork/chopsticks/blob/462fdc458f424e4cb68aeeb42825cc79299f9ca8/packages/chopsticks/src/utils/fetch-storages.ts#L17
Can you reduce the batch size and see if it can avoid the RPC timeout issue?
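To illustrate what `BATCH_SIZE` changes: the storage fetcher splits the full key list into chunks and issues one RPC request per chunk, so a smaller batch size means more, smaller requests. A simplified sketch of the chunking (not the actual chopsticks implementation, which lives in `packages/chopsticks/src/utils/fetch-storages.ts`):

```typescript
// Simplified sketch: how a batch size bounds the per-request payload.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size))
  }
  return out
}

// With BATCH_SIZE = 1000, a prefix with 10_000 keys becomes 10 requests;
// dropping it to 100 yields 100 smaller requests instead.
```

The trade-off is request count versus per-response size: smaller batches put less pressure on the node per request, at the cost of more round trips.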
@xlc I think it is exceeding response size
I forked the chopsticks repo and manually linked it to my script inside `package.json`. However, when I try to run my script with the new chopsticks, the compiler keeps throwing errors or being unable to find modules, e.g.

```
error: error: Cannot find module '@acala-network/chopsticks-executor' from '/Users/ndk/parity/x3c41a/chopsticks/packages/core/src/wasm-executor/node-wasm-executor.js'
    at emitError (node:worker_threads:205:13)
```
Note that I have pretty limited experience with js/ts. @ermalkaleci, have you checked batch sizes, or could you please help me with that if you have done this before?
you need to run `yarn build-wasm` and maybe `yarn build`
you can also just add (or modify an existing) unit test in this repo to run some testing code
I tried reducing `BATCH_SIZE` and fetching the keys - same problem.
I tried tweaking the local node and adding flags from the most recent changes, see https://github.com/paritytech/polkadot-sdk/pull/7994. Still getting the same error; none of these helped:

- adding `--rpc-rate-limit 12345678` and `--rpc-max-response-size 100000000` for both the omni-node and the regular polkadot node. Note: if you don't specify `--rpc-rate-limit X`, the node doesn't enable any rate limiting at all. Commands for reference below:

  ```
  ./target/release/polkadot-omni-node --chain /Users/ndk/parity/polkadot-sdk/cumulus/polkadot-parachain/chain-specs/asset-hub-polkadot.json --sync "warp" --database rocksdb --blocks-pruning 600 --state-pruning 600 --no-hardware-benchmarks --rpc-max-request-size 100000000 --rpc-max-response-size 100000000 --rpc-port 9945 --rpc-rate-limit 12345678 -- --sync "warp" --database rocksdb --blocks-pruning 600 --state-pruning 600 --no-hardware-benchmarks --rpc-max-request-size 100000000 --rpc-max-response-size 100000000 --rpc-port 9944 --rpc-rate-limit 12345678

  polkadot --sync warp --state-pruning 1000 --blocks-pruning 1000 --rpc-rate-limit 12345678 --tmp --rpc-port 9944 --rpc-cors all
  ```

- adding `--rpc-message-buffer-capacity-per-connection 4294967295` to the polkadot node

I also ran `BATCH_SIZE` values of 1000 (the current one), 400, 200 and 100 against all the cases mentioned above, and none of them worked.
@xlc do you have any other ideas on how the RPC connection issue might be fixed? I was still seeing responses of length 1000 even when I changed `BATCH_SIZE`, but I believe the unchanged response size was due to `pageSize`.
https://github.com/AcalaNetwork/chopsticks/pull/898 this should help
will try to reproduce the issue locally and see what exactly is causing the problem
@xlc did you have time to reproduce it?
this is what I did, and everything works without any issues:

run the polkadot node in the polkadot-sdk repo:

```
cargo run --release -p polkadot -- --chain=polkadot --sync warp --no-hardware-benchmarks --rpc-port 9944 --rpc-max-response-size 100000000
```

fetch storage:

```
yarn start fetch-storages '0x' --db polkadot.sqlite --endpoint ws://localhost:9944 --block 25323705
```

I am getting a 2GB db.
`await polkadot.api.query.system.account.entries()` - this code reads all the accounts into memory, and that is going to take forever regardless.
I am running this code and it is working fine:

```ts
const { polkadot } = await setupNetworks({
  polkadot: {
    endpoint: 'ws://localhost:9944',
    block: 25323705,
    db: 'polkadot.sqlite',
    port: 8000,
    'build-block-mode': BuildBlockMode.Manual,
  },
})
console.log('setup')

let startKey = '0x'
while (true) {
  const sysEntries = await polkadot.api.query.system.account.entriesPaged({ pageSize: 1000, args: [], startKey })
  for (const [k] of sysEntries) {
    console.log('key: ', k.toHuman())
    startKey = k.toHex()
  }
  if (sysEntries.length < 1000) {
    break
  }
}
```
I cannot reproduce any timeout issue
dumb question:
why is this going to take forever?

> `await polkadot.api.query.system.account.entries()` this code is reading all the accounts into memory and that is going to take forever regardless.

I did the same with the smaller dataset of rcAccounts and it worked perfectly fine
just because there are a lot of accounts on polkadot
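A rough back-of-envelope shows why the dataset size matters. The numbers below are assumptions purely for illustration (the real account count and entry sizes on Polkadot will differ), but they convey the order of magnitude:

```typescript
// Back-of-envelope: why a full entries() scan over System.Account is heavy.
// All numbers here are illustrative assumptions, not measured values.
const accounts = 1_500_000      // assumed order of magnitude of Polkadot accounts
const bytesPerEntry = 48 + 80   // assumed ~48-byte storage key + ~80-byte AccountInfo value
const totalMb = (accounts * bytesPerEntry) / 1024 / 1024
// Well over a hundred MB of raw storage, all fetched over RPC and decoded
// into JS objects at once - versus entriesPaged, which holds only one
// pageSize-sized batch in memory at a time.
```

With a small dataset like rcAccounts, the same `entries()` call is cheap because the total payload fits comfortably in one pass.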