`eth_getFilterChanges` returns `"filter not found"`
Checklist
- [X] This is not a security-related bug/issue. If it is, please follow the security policy.
- [X] I have searched on the issue tracker and the lotus forum, and there is no existing related issue or discussion.
- [X] I am running the latest release, the most recent RC (release candidate) for the upcoming release, or the dev branch (master), or have an issue updating to any of these.
- [X] I did not make any code changes to lotus.
Lotus component
- [ ] lotus daemon - chain sync
- [ ] lotus fvm/fevm - Lotus FVM and FEVM interactions
- [ ] lotus miner/worker - sealing
- [ ] lotus miner - proving(WindowPoSt/WinningPoSt)
- [X] lotus JSON-RPC API
- [ ] lotus message management (mpool)
- [ ] Other
Lotus Version
lotus deployed to glif
Repro Steps
$ curl https://api.node.glif.io/rpc/v0 -d'{"jsonrpc":"2.0","id":1,"method":"eth_newFilter","params":[{"topics":["0x2e84339036b9caef6da03565dd37a42d041d8af759ccfddc01625856146ce473"],"addresses":["0x811765acce724cd5582984cb35f5de02d587ca12"]}]}'
{"jsonrpc":"2.0","result":"0x43baae26e5514378adc824ca03b261c100000000000000000000000000000000","id":1}
$ sleep 10 # `sleep 0` and `sleep 5` also don't work
$ curl https://api.node.glif.io/rpc/v0 -d'{"jsonrpc":"2.0","id":1,"method":"eth_getFilterChanges","params":["0x43baae26e5514378adc824ca03b261c100000000000000000000000000000000"]}'
{"jsonrpc":"2.0","id":1,"error":{"code":1,"message":"filter not found"}}
Describe the Bug
After upgrading to ethers@6, subscribing to events fails. See repro steps above. The gateway responds with "filter not found" even though the id returned from eth_newFilter was used.
Logging Information
This was on glif. Same results on chain.love.
I tried reproducing locally, but failed on this:
{"jsonrpc":"2.0","id":1,"error":{"code":-32601,"message":"method 'eth_newFilter' not found"}}
I did already set EnableEthRPC = true
For anyone else having this issue, https://github.com/filecoin-station/on-contract-event/tree/main is a temporary workaround
Thanks to @dumikau for finding this code path in lotus-gateway, which is most likely the problem. When connecting to lotus directly, everything works as expected.
/* FILTERS: Those are stateful.. figure out how to properly either bind them to users, or time out? */
func (gw *Node) EthGetFilterChanges(ctx context.Context, id ethtypes.EthFilterID) (*ethtypes.EthFilterResult, error) {
	if err := gw.limit(ctx, stateRateLimitTokens); err != nil {
		return nil, err
	}
	ft := statefulCallFromContext(ctx)
	ft.lk.Lock()
	_, ok := ft.userFilters[id]
	ft.lk.Unlock()
	if !ok {
		return nil, filter.ErrFilterNotFound
	}
	return gw.target.EthGetFilterChanges(ctx, id)
}
https://github.com/filecoin-project/lotus/blob/1b2dde1e65b030975714e06fd792161e7b55a979/gateway/handler.go#L89-L96
Every HTTP request gets its own new statefulCallTracker, which is apparently by design and intended only for websocket connections: https://github.com/filecoin-project/lotus/blob/1b2dde1e65b030975714e06fd792161e7b55a979/gateway/proxy_eth.go#L647-L648
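To make the failure mode concrete, here is a minimal Go sketch of the behaviour described above. It is not the gateway code: `callTracker`, `install`, `getChanges`, and `simulate` are hypothetical names, and the tracker is reduced to a single map. It only shows that a filter ID recorded in one tracker is invisible to a second, freshly created tracker, which is what happens when each HTTP request gets its own.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errFilterNotFound mimics the error message seen in the repro steps.
var errFilterNotFound = errors.New("filter not found")

// callTracker is a simplified, hypothetical stand-in for the gateway's
// per-connection tracker. It only remembers which filter IDs it has seen.
type callTracker struct {
	lk          sync.Mutex
	userFilters map[string]struct{}
}

func newCallTracker() *callTracker {
	return &callTracker{userFilters: make(map[string]struct{})}
}

// install records a filter ID as belonging to this tracker.
func (t *callTracker) install(id string) {
	t.lk.Lock()
	defer t.lk.Unlock()
	t.userFilters[id] = struct{}{}
}

// getChanges succeeds only if the ID was installed through the same tracker.
func (t *callTracker) getChanges(id string) error {
	t.lk.Lock()
	defer t.lk.Unlock()
	if _, ok := t.userFilters[id]; !ok {
		return errFilterNotFound
	}
	return nil
}

// simulate polls a filter once through a long-lived tracker (websocket-like)
// and once through a brand-new tracker (a second, separate HTTP request).
func simulate() (wsErr, httpErr error) {
	const id = "0x43ba" // shortened placeholder ID

	ws := newCallTracker() // one tracker for the whole connection
	ws.install(id)
	wsErr = ws.getChanges(id)

	req1 := newCallTracker() // tracker for the eth_newFilter request
	req1.install(id)
	req2 := newCallTracker() // new tracker for the eth_getFilterChanges request
	httpErr = req2.getChanges(id)
	return wsErr, httpErr
}

func main() {
	wsErr, httpErr := simulate()
	fmt.Println("websocket-like:", wsErr)
	fmt.Println("http-like:", httpErr)
}
```

In this sketch the websocket-like path returns no error while the HTTP-like path returns "filter not found", matching the repro.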
The problem is that filters are long-lived inside a Lotus node and it's perfectly valid to do this via non-websocket requests.
It seems to me that the desire here is to partition the filter and subscription space per-user, but that's not really possible to achieve with the way this all works.
However, filter IDs are generated via UUIDv4, so we have some guarantees about uniqueness and guess-ability already. I'm not sure what other leakage we would try and protect against in a public gateway. So, we could either share a statefulCallTracker across all requests, or just do away with it entirely since it just proxies to the original calls which do essentially the same map look-up operation.
@magik6k am I missing something from 22231dc34f and 1286d76988? Is there a reason I'm missing that we can't just pass these through without checking?
FWIW, it's easy to configure Ethers v6 ethers.JsonRpcProvider to use the old polling-based approach that uses the well-supported RPC method eth_getLogs:
const provider = new ethers.JsonRpcProvider(fetchRequest, undefined, {
  polling: true
})
IMO the action item here is to remove the stateful call tracker from this call path and just pass it through to the node; I don't see a good reason it's gated.
Looking at this again; the tracking was originally introduced in https://github.com/filecoin-project/lotus/pull/9863, and then extended in https://github.com/filecoin-project/lotus/pull/10027 to cover subscribe.
- `userFilters` is only used to track the number of filters applied per connection.
- `EthMaxFiltersPerConn` is fixed to `16`, and when the number of filters reaches this number for a particular connection, further filters are rejected.
- `userSubscriptions` is only used to track the number of `Subscribe` calls and also check it against `EthMaxFiltersPerConn`.
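As a rough illustration of that counting behaviour, here is a small Go sketch. The limit of 16 is the value quoted above; `connTracker`, `addFilter`, `removeFilter`, `installMany`, and the error text are all hypothetical and not taken from lotus.

```go
package main

import (
	"errors"
	"fmt"
)

// maxFiltersPerConn uses the value quoted in the thread for EthMaxFiltersPerConn.
const maxFiltersPerConn = 16

// errTooManyFilters is a hypothetical error; the real message may differ.
var errTooManyFilters = errors.New("too many filters installed on this connection")

// connTracker is a hypothetical per-connection record of installed filters.
type connTracker struct {
	userFilters map[string]struct{}
}

func newConnTracker() *connTracker {
	return &connTracker{userFilters: make(map[string]struct{})}
}

// addFilter rejects the filter once the connection has reached the limit.
func (c *connTracker) addFilter(id string) error {
	if len(c.userFilters) >= maxFiltersPerConn {
		return errTooManyFilters
	}
	c.userFilters[id] = struct{}{}
	return nil
}

// removeFilter frees a slot, for example after an uninstall call.
func (c *connTracker) removeFilter(id string) {
	delete(c.userFilters, id)
}

// installMany tries to install n filters on one connection and reports
// how many were accepted before the limit was hit.
func installMany(n int) (accepted int, err error) {
	c := newConnTracker()
	for i := 0; i < n; i++ {
		if err = c.addFilter(fmt.Sprintf("filter-%d", i)); err != nil {
			return accepted, err
		}
		accepted++
	}
	return accepted, nil
}

func main() {
	accepted, err := installMany(20)
	fmt.Println("accepted:", accepted, "err:", err)
}
```

The point is that the map exists for counting per connection, so it only makes sense when something long-lived (such as a websocket) identifies the "user".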
> It seems to me that the desire here is to partition the filter and subscription space per-user
My original comment from above is wrong. The purpose of these checks is to limit the number of filters installed on a lotus node for each "user", which is an appropriate thing for a gateway to do because of the cost of having active filters.
This works fine when using websockets, but we currently don't have any per-IP tracking, and even if we did we'd have to deal with people using reverse proxies in front of lotus-gateway (like glif does). We're then in the realm of deciding whether to accept X-Forwarded-For or not (fine if you have a reverse proxy, dangerous if you don't). We can't give cookies because people are using this from curl or libraries that don't support cookies (making an assumption here about ethers).
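For what it's worth, the X-Forwarded-For decision could look roughly like the Go sketch below. Nothing here exists in lotus-gateway: `clientKey`, `demoKeys`, and the `trustProxy` switch are hypothetical, and the sketch assumes a single trusted reverse proxy that appends the real client address as the last entry of the header.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// clientKey returns the identifier that per-client accounting would use.
// trustProxy is a hypothetical operator setting: enable it only when a
// reverse proxy you control sits in front of the gateway, because clients
// can otherwise set X-Forwarded-For to anything they like.
func clientKey(r *http.Request, trustProxy bool) string {
	if trustProxy {
		if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
			// Assumption: the last entry was appended by our own proxy.
			parts := strings.Split(xff, ",")
			return strings.TrimSpace(parts[len(parts)-1])
		}
	}
	// Fall back to the address of the direct TCP peer.
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		return r.RemoteAddr
	}
	return host
}

// demoKeys builds one request and resolves it with and without proxy trust.
func demoKeys() (trusted, untrusted string) {
	r := &http.Request{Header: http.Header{}, RemoteAddr: "10.0.0.1:5555"}
	r.Header.Set("X-Forwarded-For", "198.51.100.9, 203.0.113.7")
	return clientKey(r, true), clientKey(r, false)
}

func main() {
	trusted, untrusted := demoKeys()
	fmt.Println("with proxy trust:", trusted)
	fmt.Println("without proxy trust:", untrusted)
}
```

Without proxy trust, every client behind the same reverse proxy collapses into one key, which is the problem described above for deployments like glif.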
It seems like glif doesn't expose websockets, but api.chain.love does, so this ~works (at least it doesn't error, I don't know an address to use to get something more active):
import { ethers } from 'ethers'
const provider = new ethers.WebSocketProvider('wss://api.chain.love/rpc/v1')
console.log('provider:', provider)
const filterId = await provider.send('eth_newFilter', [{
  address: ['0x811765acce724cd5582984cb35f5de02d587ca12'],
  topics: []
}])
console.log('filterId:', filterId)
provider.on('block', async() => {
  const logs = await provider.send('eth_getFilterChanges', [filterId])
  console.log('logs:', logs)
})
I think that we might be forced to just block these stateful API endpoints over HTTP, as suggested in https://github.com/filecoin-project/lotus/issues/11153, unless we want to go down the rabbit hole of per-IP tracking. We could also encourage public API providers to offer a websocket option.
I'd really like to know how this is handled in Ethereum-land. How do public providers offer this normally?
- https://docs.infura.io/api/networks/ethereum/json-rpc-methods/filter-methods/eth_newblockfilter - Infura gates via API key and only allows a filter to live for 15 minutes
- https://www.quicknode.com/docs/ethereum/eth_newFilter QuickNode (probably) gates by API, has a per-call credit cost thing and limits the block range depending on your plan
- https://docs.chainstack.com/reference/arbitrum-newfilter Arbitrum deletes the filter if not polled after a certain period of time
I was thinking that something like the Arbitrum option gets us around the limit problems with this. We get rid of the per-connection limit entirely but set up a liveness check in the gateway that will automatically remove the filter from the lotus node if it's not polled after a certain period of time.
I wouldn't mind offering more options for public API providers, but this is something we could evolve over time. And already now they have the option of excluding these APIs from what they offer with a reverse proxy and they could even do API key gating too.
> liveness check in the gateway that will automatically remove the filter from the lotus node if it's not polled after a certain period of time.
Alas we already have that with FilterTTL in the lotus node itself, which defaults to 24 hours. We probably want to document that this should be reduced dramatically for multi-tenant nodes.
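For readers unfamiliar with the idea, idle expiry can be sketched in Go as below. This is not how lotus implements `FilterTTL`; `filterStore`, `install`, `poll`, `sweep`, and `demo` are hypothetical names, the 15 minute TTL is only borrowed from the Infura example above, and the sketch is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// filterStore is a hypothetical store that forgets filters which have not
// been polled within ttl.
type filterStore struct {
	ttl      time.Duration
	lastPoll map[string]time.Time
}

func newFilterStore(ttl time.Duration) *filterStore {
	return &filterStore{ttl: ttl, lastPoll: make(map[string]time.Time)}
}

// install registers a filter and starts its idle timer at now.
func (s *filterStore) install(id string, now time.Time) {
	s.lastPoll[id] = now
}

// poll reports whether the filter is still alive and, if so, resets its
// idle timer. An expired filter is removed and reported as missing.
func (s *filterStore) poll(id string, now time.Time) bool {
	last, ok := s.lastPoll[id]
	if !ok {
		return false
	}
	if now.Sub(last) > s.ttl {
		delete(s.lastPoll, id)
		return false
	}
	s.lastPoll[id] = now
	return true
}

// sweep removes every expired filter and returns how many were dropped.
// A real service would run this periodically.
func (s *filterStore) sweep(now time.Time) int {
	removed := 0
	for id, last := range s.lastPoll {
		if now.Sub(last) > s.ttl {
			delete(s.lastPoll, id)
			removed++
		}
	}
	return removed
}

// demo polls one filter within the TTL and then again after a long gap.
func demo() (fresh, stale bool) {
	s := newFilterStore(15 * time.Minute)
	start := time.Unix(0, 0)
	s.install("a", start)
	fresh = s.poll("a", start.Add(10*time.Minute)) // 10 min idle: still alive
	stale = s.poll("a", start.Add(40*time.Minute)) // 30 min idle: expired
	return fresh, stale
}

func main() {
	fresh, stale := demo()
	fmt.Println("polled within TTL:", fresh)
	fmt.Println("polled after TTL:", stale)
}
```

With a short TTL, abandoned filters from one-off HTTP clients stop accumulating, which is why reducing the 24 hour default matters for multi-tenant nodes.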
After some discussion on Slack I think that the way forward here is to:
- Block http as per https://github.com/filecoin-project/lotus/issues/11153
- See if we can make the stateful tracker a little more intelligent - have it call to lotus to remove state when the client ends
- Maybe do some documentation work for public node providers on `FilterTTL` and `MaxFilters`
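The first item could be sketched in Go roughly as follows. The method names are standard Ethereum JSON-RPC methods, but which of them the gateway actually blocks is an assumption here, and `statefulMethods` and `allowed` are hypothetical names rather than lotus code.

```go
package main

import "fmt"

// statefulMethods lists JSON-RPC methods that depend on state held for the
// caller. Assumption: this exact set is illustrative, not what lotus uses.
var statefulMethods = map[string]bool{
	"eth_newFilter":                   true,
	"eth_newBlockFilter":              true,
	"eth_newPendingTransactionFilter": true,
	"eth_getFilterChanges":            true,
	"eth_getFilterLogs":               true,
	"eth_uninstallFilter":             true,
	"eth_subscribe":                   true,
	"eth_unsubscribe":                 true,
}

// allowed reports whether a method may be served on this transport.
// Stateful methods are only served over websocket connections, where the
// connection itself identifies the caller.
func allowed(method string, isWebsocket bool) bool {
	return isWebsocket || !statefulMethods[method]
}

func main() {
	fmt.Println(allowed("eth_getFilterChanges", false))
	fmt.Println(allowed("eth_getFilterChanges", true))
	fmt.Println(allowed("eth_getLogs", false))
}
```

Stateless calls such as `eth_getLogs` stay available over HTTP, so polling-based clients keep working.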
This should be resolved in https://github.com/filecoin-project/lotus/pull/12327
Closing as completed, as this should be resolved in https://github.com/filecoin-project/lotus/pull/12327, which shipped in Lotus v1.29.0, a version most RPC providers have now updated to. Please reopen if you still encounter this issue @juliangruber