near-api-js
RPC Redundancy/Failover Configuration
Is your feature request related to a problem? Please describe. Current RPC providers can have downtime, temporary connectivity issues, or rate limits that make clients' transactions fail. Over the past year we have observed several windows of RPC failure, which could have been mitigated if near-api-js had configuration for multiple RPC providers.
Describe the solution you'd like Similar to issues #703 and #717, the core JSON RPC provider should be refactored to not only have retries, but allow for retries against multiple Providers. The default provider list can configure both near foundation and openshards RPC services. Provider list can be a single node string for backward compatibility OR an array of node strings. Up for discussion but needed: Create a failover threshold of retries for each provider, and a threshold for provider failures before defaulting to a different priority.
Configuration Example:
RPC_MAINNET_PROVIDERS="https://rpc.mainnet.near.org,https://mainnet-rpc.openshards.io"
RPC_GUILDNET_PROVIDERS="https://guildnet-rpc.openshards.io,https://rpc.guildnet.near.org"
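To make the "single node string OR an array of node strings" idea concrete, here is a minimal sketch of how such a value could be normalized into a prioritized list. The helper name `normalizeProviders` and the `ProviderList` type are hypothetical and are not part of near-api-js.

```typescript
// Hypothetical helper, not part of near-api-js.
// Accepts a single URL, a comma-separated string (as in the config example),
// or an array of URLs, and returns a prioritized list of URLs.
type ProviderList = string | string[];

function normalizeProviders(input: ProviderList): string[] {
  const raw = Array.isArray(input) ? input : input.split(",");
  // Trim whitespace and drop empty entries such as a trailing comma.
  const urls = raw.map((url) => url.trim()).filter((url) => url.length > 0);
  if (urls.length === 0) {
    throw new Error("At least one RPC provider URL is required");
  }
  return urls;
}
```

A single URL string passes through as a one-element list, so existing single-node configurations would keep working under this assumption.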
NOTE: Because near-api-js is used in many dapps and repos, this functionality is key to giving clients the easiest way to decentralize their RPC access. This is critical when public resources come under attack.
Describe alternatives you've considered Users must create multiple instances of the Near module with different providers configured and detect TXN failures. Not ideal at all.
Additional context There is an ongoing effort to create a decentralized RPC for mainnet & guildnet using many of the openshards.io nodes with a redundant load-balancer.
I believe our strategy for the decentralization of RPC Servers is a bit different, but some of these ideas can be implemented. On the near-api-js level we can provide support for multiple servers and fallback logic, but it will add an additional level of complexity and a source of potential bugs.
@frol @MaximusHaximus , any thoughts?
Similar suggestion from @artob: https://github.com/near/near-api-js/issues/735
@volovyk-s What is on your mind in terms of a different strategy? I feel near-api-js is the right abstraction layer to deal with the pool of RPC servers to enable true decentralization.
@frol there are two separate problems. The first one is the stability of a single RPC Server. The second one is the ability to switch to another server when the first one is down (decentralization). As far as I know, our current strategy was to work on stability first (API Keys). For the second one, I agree, using multiple RPC Servers with fallbacks on the client side is the best option. But we will need to design it carefully, because a simple fallback on each call can be slow. And we will need to support API Keys for each such server. I will prioritize this issue.
@volovyk-s Ah, well, those are indeed two completely different efforts, but they came somewhat together, and we need both solutions: (1) extended RPC connection configuration, (2) failover configuration. This issue is about the second point.
(and @volovyk-s)
I apologize for the side discussion on decentralization here... It was just to mention the additional context and reasons.
The goal of this issue is to add support for multiple RPC configurations, allowing retries against a prioritized list of RPC nodes. This at least mitigates single RPC provider failures/outages. The decentralization should be handled by a very different setup than SDK. :)
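The retry behaviour described here can be sketched as follows. This is an illustration under stated assumptions, not the near-api-js implementation: `callWithFailover`, `RpcCall`, and `retriesPerProvider` are hypothetical names, and the actual network request is passed in by the caller so the sketch stays independent of any transport.

```typescript
// Hypothetical sketch, not part of near-api-js.
// `call` performs one RPC request against the given URL and rejects on failure.
type RpcCall<T> = (url: string) => Promise<T>;

interface FailoverOptions {
  // Number of attempts against one provider before moving to the next.
  retriesPerProvider: number;
}

async function callWithFailover<T>(
  urls: string[],
  call: RpcCall<T>,
  options: FailoverOptions = { retriesPerProvider: 2 }
): Promise<T> {
  let lastError: unknown = new Error("No RPC providers configured");
  // Providers are tried in list order, so the list doubles as a priority order.
  for (const url of urls) {
    for (let attempt = 0; attempt < options.retriesPerProvider; attempt++) {
      try {
        return await call(url);
      } catch (err) {
        lastError = err; // remember the failure and keep going
      }
    }
  }
  // Every provider exhausted its retries: surface the most recent error.
  throw lastError;
}
```

A production version would likely add a delay between attempts and distinguish retryable errors (timeouts, rate limits) from permanent ones; both are left out to keep the sketch short.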
@frol @volovyk-s any movement on this? There was another downtime/major latency issue on mainnet, with many apps unusable because of their dependency on a single RPC provider.
Seems like the ideal solution here will be the creation of FailoverJsonRpcProvider, which will make several simultaneous calls to all provided RPC URLs and return the result if, let's say, 50+% return the same value. Or the first successful result if we want it to be snappy.
The problem here is the increased load on RPC Servers, something that we are trying to avoid.
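For illustration, the "majority of simultaneous calls" variant could look roughly like the sketch below. `callWithQuorum` and `RpcCall` are hypothetical names, the transport is supplied by the caller, and "majority" is assumed to mean strictly more than half of the configured URLs. Responses are compared by their `JSON.stringify` output, which is an assumption that only holds if nodes serialize fields in the same order.

```typescript
// Hypothetical sketch, not part of near-api-js.
type RpcCall<T> = (url: string) => Promise<T>;

async function callWithQuorum<T>(urls: string[], call: RpcCall<T>): Promise<T> {
  // Fire all requests at once; map failures to a marker so that one
  // unreachable node cannot reject the whole batch.
  const settled = await Promise.all(
    urls.map((url) =>
      call(url).then(
        (value) => ({ ok: true as const, value }),
        () => ({ ok: false as const })
      )
    )
  );

  // Count identical responses, keyed by their JSON serialization.
  const counts = new Map<string, { value: T; count: number }>();
  for (const outcome of settled) {
    if (!outcome.ok) continue;
    const key = JSON.stringify(outcome.value);
    const entry = counts.get(key) ?? { value: outcome.value, count: 0 };
    entry.count += 1;
    counts.set(key, entry);
    // Strict majority of all configured URLs, not just of the successful ones.
    if (entry.count * 2 > urls.length) return entry.value;
  }
  throw new Error("No majority among RPC responses");
}
```

Note that this sketch waits for every call to settle before counting, and that each logical request is sent to every configured server, which is exactly the extra load mentioned above.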
Also, the code of near-api-js is heavily coupled and relies on JsonRpcProvider instead of the Provider interface. Usage of the new Provider will lead to a ton of breaking changes.
@MaximusHaximus I think we should move the provider to a separate library in the future. People should be able to create their own implementations and use them in near-api-js-x.
In our case, we will need to refactor the existing JsonRpcProvider. And probably these calls and checks will be sequential. It will increase response time when the main RPC Server is down.
Also, we can refactor utils/web.ts to achieve the same result (with similar downsides).
Enabling configuration of multiple RPC nodes would also help resolve the feature request for switching RPC URLs in NEAR wallet: https://github.com/near/near-wallet-roadmap/issues/36, if the wallet could set fallback RPC URLs by default.
I tried to add a new property node_urls for Near.ts. I tried polling different Connections and found that none of the Connection functions could return the status of the node. Invoking the status() function still retries the request 12 times rather than reporting a failed status.
Modifying only the implementation of Near.ts is the wrong approach; modifying json-rpc-provider.ts and utils/web.ts is the appropriate one.
When executing fetchJson, try polling through the list of all RPCs. Find a connectable RPC and treat it as the reliable default. This avoids running exponentialBackoff against nodes that can never connect.
@frol @volovyk-s Yet another downtime/major latency issue, this time primarily on testnet, because of the dependency on a single RPC provider.
Dear Team,
We have a library named fallback-falooda which helps us pick a reliable node from a list of nodes in our Cosmos environment. We also use it in our node selector for our NEAR-specific use cases.
I have made a few changes in the code to make it work with fallback-falooda here:
https://github.com/near/near-api-js/compare/master...leapsamvel:near-api-js:master
We can either:
- Take the approach of using fallback-falooda in the library, or
- Provide a getter method param and allow the user to provide the URL dynamically when the RPC call is made.
Could you let me know, and I will raise the PR accordingly with the test cases and documentation?
TIA.
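As a rough illustration of the second option (a getter that supplies the URL at call time), a sketch could look like this. `UrlGetter`, `RpcCall`, and `callWithDynamicUrl` are hypothetical names; this is not the API of near-api-js or of fallback-falooda.

```typescript
// Hypothetical sketch, not part of near-api-js or fallback-falooda.
// The getter may be synchronous or asynchronous, e.g. backed by a node selector.
type UrlGetter = () => string | Promise<string>;
type RpcCall<T> = (url: string) => Promise<T>;

async function callWithDynamicUrl<T>(getUrl: UrlGetter, call: RpcCall<T>): Promise<T> {
  // Resolve the URL for every request so the caller can swap nodes at any time.
  const url = await getUrl();
  return call(url);
}
```

With this shape, the node selection policy stays entirely in user code, and the library only asks for a URL before each request.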
Resolved by #1334