ETA calculation update
Background
We need to significantly improve the accuracy of our estimated times so that users can be shown a clear countdown. The goal is an estimate that is 99% accurate, to within 20-30 seconds, whether the path is slow or fast.
Current data on the accuracy of our estimated times: [NEED TO ADD]
The new estimation time should be calculated based on:
- [As is] The current liquidity level of the router on the directional chain.
- Past 20 transfer statistics (#5966) for actual time durations.
- Router activity calls, including:
  a. A list of recently initiated transfers and their associated router addresses.
  b. The duration of inactivity for the router.
Other ideas:
- Transfer size
- Transfers initiated in the past 3-4 days on currently active routers only
- Available router liquidity, adjusted for the volume of transfers currently being processed
- Difference between estimated and actual time on this route (past 3-4 days)
- Dummy variable - actual errors
Product Spec: https://www.notion.so/connext/1-ETA-f434c48e3054455da73dc3a560875c32?pvs=4#b1f5b5132d2b41c3aa6b54706dda9924
Linked Issues & Documentation
Connected to #5966 and #5736
Current
This is what our estimates currently include.
1) Query idle router liquidity
The idle liquidity of the asset across all routers on the destination chain is the liquidity that should be available for the fast path. Idle liquidity is the amount deposited - amount removed - amount in flight. Note that this already subtracts in-flight liquidity (liquidity that is currently unavailable because it was used to boost a previous transfer and the funds have yet to reconcile for the router).
Multiple routers can provide liquidity for a single transfer - this is limited to 3 by the network right now. So the max available router liquidity at any time is the idle liquidity of the top 3 routers.
We query our DB for this idle liquidity; the DB refreshes its view every 15 seconds. This can be improved by reading from another indexing layer like a subgraph, or by reading from a node directly via RPC call. These options introduce other tradeoffs, such as cost, uptime, and how quickly the estimate is returned.
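A minimal sketch of the idle liquidity calculation described above, for illustration only. The `RouterBalance` shape, its field names, and the function names are hypothetical and do not reflect the actual DB schema. Amounts are plain numbers to keep the example short; real on-chain amounts would likely need `bigint`. Comparing the transfer amount against the top-3 total is an assumption about how the fast path check uses this value.

```typescript
// Hypothetical row shape: one router's balance for one asset on the destination chain.
interface RouterBalance {
  router: string;
  deposited: number;
  removed: number;
  inFlight: number;
}

// Network limit on routers per transfer, as stated above.
const MAX_ROUTERS_PER_TRANSFER = 3;

// Idle liquidity = amount deposited - amount removed - amount in flight.
function idleLiquidity(b: RouterBalance): number {
  return b.deposited - b.removed - b.inFlight;
}

// Sum of the idle liquidity of the top N routers (N = 3 by default).
function maxAvailableLiquidity(
  balances: RouterBalance[],
  maxRouters: number = MAX_ROUTERS_PER_TRANSFER
): number {
  return balances
    .map(idleLiquidity)
    .filter((idle) => idle > 0)
    .sort((a, b) => b - a) // largest first
    .slice(0, maxRouters)
    .reduce((sum, idle) => sum + idle, 0);
}

// Assumption: a transfer is a fast path candidate if it fits in the top-N idle liquidity.
function isFastPathCandidate(amount: number, balances: RouterBalance[]): boolean {
  return amount <= maxAvailableLiquidity(balances);
}
```

Note that a fourth router's idle liquidity never counts toward the total here, however many routers hold funds.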
2) Factor in router availability
Routers can provide idle liquidity but still experience downtime. When routers are down, we have an incomplete view of the usable idle liquidity for the fast path. We currently use the following logic as a proxy for router availability, overriding a "fast path" estimate if a majority of the recent transfers have actually been slow.
- If estimated latency is "fast path", check the status of the last N=20 transfers within the last 3 hours
- If >50% of these transfers were completedSlow, then display a "slow path" estimate instead
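The override rule above could be sketched roughly as follows. The `RecentTransfer` shape, the `completedAt` timestamp field, and the function name are hypothetical; only the `completedSlow` status, N=20, the 3 hour window, and the >50% threshold come from the description. Leaving the estimate unchanged when there are no recent transfers is an assumption.

```typescript
type PathEstimate = "fast" | "slow";

// Hypothetical shape of a recent transfer record.
interface RecentTransfer {
  status: string; // e.g. "completedSlow"
  completedAt: number; // epoch milliseconds
}

const THREE_HOURS_MS = 3 * 60 * 60 * 1000;

function applyAvailabilityOverride(
  estimate: PathEstimate,
  transfers: RecentTransfer[],
  now: number,
  n: number = 20,
  windowMs: number = THREE_HOURS_MS
): PathEstimate {
  // Only a "fast path" estimate can be overridden.
  if (estimate !== "fast") return estimate;

  // Last N transfers within the time window, newest first.
  const recent = transfers
    .filter((t) => now - t.completedAt <= windowMs)
    .sort((a, b) => b.completedAt - a.completedAt)
    .slice(0, n);

  // Assumption: with no recent data, keep the liquidity-based estimate.
  if (recent.length === 0) return estimate;

  const slowCount = recent.filter((t) => t.status === "completedSlow").length;
  return slowCount / recent.length > 0.5 ? "slow" : "fast";
}
```

Because the threshold is strictly greater than 50%, an even split between fast and slow transfers keeps the "fast path" estimate.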
3) Display time ranges
Along with the fast/slow determination, we also provide a median and a range for fast/slow paths for the last N=20 transfers of each type.
- If estimated latency is "fast path", then display the median times for the last N fast transfers rounded up to minute precision
- If estimated latency is "slow path", then display the median times for the last N slow transfers rounded up to hour and minute precision
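As a rough illustration of the display logic above, the sketch below takes the median of past durations (in seconds) and rounds it up to the next whole minute. The function names and the output string format are assumptions made for this example, and the range is not covered.

```typescript
// Median of a non-empty list of durations.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Round up to the minute; fast path shows minutes, slow path shows hours and minutes.
// The string format is illustrative only.
function formatEta(seconds: number, path: "fast" | "slow"): string {
  const totalMinutes = Math.ceil(seconds / 60);
  if (path === "fast") return `${totalMinutes} min`;
  const hours = Math.floor(totalMinutes / 60);
  const minutes = totalMinutes % 60;
  return `${hours} h ${minutes} min`;
}

// Median of the last N transfers of the estimated type, formatted for display.
function displayEstimate(durationsSeconds: number[], path: "fast" | "slow"): string {
  return formatEta(median(durationsSeconds), path);
}
```

For example, fast durations of 95, 130 and 110 seconds have a median of 110 seconds, which rounds up to "2 min".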
Improvements to consider
- Exclude idle liquidity of routers that haven't boosted any transfers in the last X days from 1)
- For Chimera, there should be router telemetry endpoints where down routers can more easily be identified
- Specify a tighter range of historical transfers to look at for 2) and 3), e.g. exclude transfers that are more than 20% above/below the amount being estimated for (right now we look at the last N transfers of any amount)
- Express estimates with confidence intervals based on historical data, e.g. there is a 95% chance this transfer will take M minutes
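One possible way to prototype the amount filter and the "95% chance" statement is sketched below. It uses an empirical nearest-rank percentile over comparable past transfers, which is a simple stand-in and not a formal confidence interval. The `PastTransfer` shape and all function names are hypothetical.

```typescript
// Hypothetical shape of a historical transfer.
interface PastTransfer {
  amount: number;
  durationSeconds: number;
}

// Keep only transfers whose amount is within +/- tolerance of the amount being estimated.
function filterByAmount(
  transfers: PastTransfer[],
  amount: number,
  tolerance: number = 0.2
): PastTransfer[] {
  return transfers.filter((t) => Math.abs(t.amount - amount) <= tolerance * amount);
}

// Nearest-rank percentile: smallest value such that at least p% of values are <= it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("percentile of empty list");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// "p% of comparable past transfers completed within this many seconds."
function durationBound(transfers: PastTransfer[], amount: number, p: number = 95): number {
  const comparable = filterByAmount(transfers, amount).map((t) => t.durationSeconds);
  return percentile(comparable, p);
}
```

With only N=20 samples, the 95th percentile is effectively the second-slowest comparable transfer, so a larger history window may be needed for this number to be stable.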