status-go

Initial synchronization in wallet takes lots of rpc requests (~ 5,000 for a wallet with ~200 transfers)

Open dshulyak opened this issue 5 years ago • 12 comments

Problem

We have two distinct synchronization methods in a wallet.

  1. eth_getLogs query for erc20 transfers
  2. balance checks for regular eth transfers

Erc20 synchronization is relatively efficient. It makes 2 getLogs requests (one for inbound and one for outbound erc20 transfers) per range of blocks (currently 100,000 blocks), and then a couple of requests for each log found (to get the tx and receipt data for that log).

Synchronization for eth transfers requires us to scan the whole block space. We do it by comparing balances: we compare the balances at the first and last blocks, and if they are not equal we split the block space in half and run another round of comparisons on each half. Once the balances are equal we consider that all transfers in that sub-space have been found and terminate the process for that sub-space.

The problem is that this approach requires lots of balance rpc requests (~4,000 of the ~5,000). Infura recently enabled rate limits (https://infura.io/pricing). The wallet will be unusable if there is a spike of users importing existing wallets (e.g. on a release day).

Implementation

There are several options:

  1. ethereum rpc allows batching requests; if it works on infura, it will cut the number of balance check requests at least in half
  2. use several infura accounts in the app and select one randomly (or deterministically for a given account)
  3. negotiate custom rate limits with infura for our account
  4. fall back to etherscan for getting the initial transfers list
  5. suggest that users provide their own remote rpc url

Infura and etherscan are both centralized services, so option 4 might not seem like a terrible idea (and this is what we had previously), but it might be worse than infura in terms of reliability and from a code perspective.

Expecting users to configure their own rpc is probably not a viable option, so I will dismiss option 5 as a complete solution.

Option 1 is a good improvement anyway and should be made, but we need to verify that it works on infura.

Using several infura accounts will give us the required throughput, whatever we expect, but it will cost $ or we will have to create multiple free tier services. For example, to reach a target of 10,000 synced users a day we will need 50,000,000 requests a day (10,000 users x ~5,000 requests each); that's 10 Growth infura tiers ($1,000 a month).
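If several infura accounts are used, the deterministic selection from option 2 could be as simple as hashing the wallet address. A small sketch, assuming a hypothetical `pickEndpoint` helper and placeholder project ids in the URLs:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// pickEndpoint (hypothetical helper) maps a wallet address to one of the
// configured endpoints, so the same wallet always uses the same account.
func pickEndpoint(address string, endpoints []string) string {
	if len(endpoints) == 0 {
		return ""
	}
	h := fnv.New32a()
	// Lowercase first so checksummed and plain addresses map identically.
	h.Write([]byte(strings.ToLower(address)))
	return endpoints[h.Sum32()%uint32(len(endpoints))]
}

func main() {
	// Placeholder project ids, not real credentials.
	endpoints := []string{
		"https://mainnet.infura.io/v3/PROJECT_ID_A",
		"https://mainnet.infura.io/v3/PROJECT_ID_B",
		"https://mainnet.infura.io/v3/PROJECT_ID_C",
	}
	fmt.Println(pickEndpoint("0x00000000000000000000000000000000000000aa", endpoints))
}
```

Hashing spreads wallets roughly evenly across accounts without any shared state; random selection would work too, but then a single wallet could hit a different account on each start.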

Acceptance Criteria

Wallet should stay functional while ~10,000 users are trying to import existing accounts with an infura backend.

cc @yenda @flexsurfer

dshulyak avatar Aug 08 '19 11:08 dshulyak

Can we maybe resolve this from the UX perspective? Why do we need to have the full history for all transactions immediately? Maybe we can have like the transfers from the last 7 days or last 10?

adambabik avatar Aug 12 '19 09:08 adambabik

We will have to give an option to download all of them, right? So if such an option exists, then we're kinda betting that only part of the users will sync their transactions

dshulyak avatar Aug 12 '19 11:08 dshulyak

We will have to give an option to download all of them, right?

I think it should be provided but does not have to be a default option. If we do provide it, maybe we can rate limit it on our side as well and inform about the time needed to complete and current progress. I personally would not be interested in my transactions a year back so I would skip it.

The idea number 4 seems nice + showing a prompt asking if user is fine with that. If not we can suggest providing a custom infura RPC endpoint. If answer is negative again, use our own infura access and inform about the progress.

(2) seems like cheating but maybe they are ok with that :) (3) is similar, in both cases we should talk to them directly.

What I would definitely recommend is having at least two infura accounts. One for mission critical stuff like sending transactions or getting new blocks. Another one for less important stuff like transactions history.

adambabik avatar Aug 12 '19 19:08 adambabik

I would agree with Adam. For now, as a quick solution, we could just not load transaction history and only show transactions that were made in status after account creation, and maybe we could add options in the UI: "retrieve history for 7d", "14d", "1m", or smth

flexsurfer avatar Aug 13 '19 06:08 flexsurfer

Sorry, had to refresh the page. What @andrey said:)

Can we maybe resolve this from the UX perspective? Why do we need to have the full history for all transactions immediately? Maybe we can have like the transfers from the last 7 days or last 10?

If it helps, I definitely see that working from a design perspective. We can reuse the 'fetch messages' pattern. It doesn't resolve the issue as a whole, as I understand DApps also make requests we don't control.

hesterbruikman avatar Aug 13 '19 08:08 hesterbruikman

from this designer's perspective, loading the entire history at once is kinda nuts, it's missing the perspective that you might not care about those at all as a user. it's better to divide it into more digestible pieces of say 20 items and add a Fetch more button (or load as you scroll if the delay isn't that long) at the very end to load the next piece.

errorists avatar Aug 13 '19 08:08 errorists

@adambabik @flexsurfer is it possible that you guys could map out the places where we need to make Infura calls, and which are mission critical vs. not? This can be done next week or in Istanbul, but I want to have it on your radar as a pre-v1 task.

Before we negotiate rates with them, I'd like to have an understanding of which specific actions require calls, and a rough estimate of how many calls we need per day for each.

rachelhamlin avatar Sep 03 '19 09:09 rachelhamlin

Two other options for pulling info are thegraph.com and aleth.io. I'm currently building go-bindings for aleth.io (https://github.com/corpetty/go-alethio-api)

corpetty avatar Sep 03 '19 15:09 corpetty

Me and @jakubgs will have a call today at 3 PM EST with them to discuss how this affects us. I'd assume that because I don't have hard estimates on how much we use, I'll just figure out their stance and how we can work through things with them, and report back.

corpetty avatar Sep 03 '19 15:09 corpetty

Ok, so the plan looks like this:

  • identify all Infura calls and group them into mission critical and others
  • figure out whether we want to have a second Infura account for non-critical methods vs use another provider

adambabik avatar Sep 03 '19 15:09 adambabik

I agree with @flexsurfer, the simplest fix is to not download history at all and just keep what the user does through the app themselves. We can improve on that in the future. For now we could just have a button that opens etherscan or something like that if someone needs to browse that.

Also, we should probably create a separate Infura project and use that for our PR builds so the metrics generated mostly by manual testing are separate from ones from normal usage.

jakubgs avatar Sep 03 '19 19:09 jakubgs

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

status-github-bot[bot] avatar Aug 05 '21 16:08 status-github-bot[bot]