chainlink
chainlink copied to clipboard
Upkeep States: optimize calls to DB to scan for workIDs
AUTO-7399
Description
As part of load testing, it was found that while scanning the DB for state of workIDs we hit the DB too much which is causing pressure and multiple failures, which leads to a worst case where we lookup a massive amount of workIDs multiple times.
Changes
- Added rate limiting for workIDs, where we avoid checking a single workID more than a rate of 3 times / 1 min
- specialized implementation of token buckets with low (as possible) memory footprint
Testing
- unit tests for ensuring functionality of the token buckets
I see that you haven't updated any README files. Would it make sense to do so?
Not really coding related, but what do the DB queries that we're executing look like? Do we have indexes created against workID, or whatever we're querying by? Or would that be too expensive in terms of storage?
@ferglor There are 2 tables that gets queried:
- evm logs (log poller) where we call
poller.IndexedLogs
ofDedupKeyAdded
event with the given workIDs (should be indexed). called fromscanner.go
- evm upkeep states where we query by workID (indexed). called from
orm.go
@amirylm looking at the migrations, I can only see two indexes for evm.logs, one for tx_hash
, and then another for evm_chain_id, block_timestamp
. SelectIndexedLogs
seems to query based on evm_chain_id, address, event_sig, topics, and block_number. So it looks like we could improve query performance but I'm not sure at what cost/what the process is for introducing new indexes like this 🤷
@amirylm looking at the migrations, I can only see two indexes for evm.logs, one for
tx_hash
, and then another forevm_chain_id, block_timestamp
.SelectIndexedLogs
seems to query based on evm_chain_id, address, event_sig, topics, and block_number. So it looks like we could improve query performance but I'm not sure at what cost/what the process is for introducing new indexes like this 🤷
We have a lot of indexes on that table, maybe you're not seeing them because it was recently renamed from evm_logs
to evm.logs
?
evm | evm_logs_idx | index | dominovaldano | logs
evm | evm_logs_idx_created_at | index | dominovaldano | logs
evm | evm_logs_idx_data_word_four | index | dominovaldano | logs
evm | evm_logs_idx_data_word_one | index | dominovaldano | logs
evm | evm_logs_idx_data_word_three | index | dominovaldano | logs
evm | evm_logs_idx_data_word_two | index | dominovaldano | logs
evm | evm_logs_idx_topic_four | index | dominovaldano | logs
evm | evm_logs_idx_topic_three | index | dominovaldano | logs
evm | evm_logs_idx_topic_two | index | dominovaldano | logs
evm | evm_logs_idx_tx_hash | index | dominovaldano | logs
evm | idx_evm_log_poller_blocks_order_by_block | index | dominovaldano | log_poller_blocks
evm | idx_evm_logs_ordered_by_block_and_created_at | index | dominovaldano | logs
This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.