ethers.js icon indicating copy to clipboard operation
ethers.js copied to clipboard

Using provider.on with v6 errors as filters 'expire'

Open omnus opened this issue 1 year ago • 21 comments

Ethers Version

6.4.0

Search Terms

rpc provider.on

Describe the Problem

The implementation of event subscription with provider.on seems to have changed between v5 and v6. In v5 the requests to providers were for logs within a given range, in v6 we are calling eth_getFilterChanges for a saved filter definition.

This works well. For a while, but it looks like saved filters expire, and eventually you get this:

@TODO Error: could not coalesce error (error={ "code": -32000, "message": "filter not found" }, code=UNKNOWN_ERROR, version=6.4.0)
    at makeError (file:///Users/foo/Documents/GitHub/crowsnest/node_modules/ethers/lib.esm/utils/errors.js:116:21)
    at AlchemyProvider.getRpcError (file:///Users/foo/Documents/GitHub/crowsnest/node_modules/ethers/lib.esm/providers/provider-jsonrpc.js:628:16)
    at file:///Users/foo/Documents/GitHub/crowsnest/node_modules/ethers/lib.esm/providers/provider-jsonrpc.js:247:52
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5) {
  code: 'UNKNOWN_ERROR',
  error: { code: -32000, message: 'filter not found' }
}

An example of what this looks like on the provider side: Screenshot 2023-05-31 at 1 58 59 PM

Code Snippet

No response

Contract ABI

No response

Errors

No response

Environment

node.js (v12 or newer)

Environment (Other)

No response

omnus avatar May 31 '23 02:05 omnus

Note - restarting loads a new filter which sorts it.

Note 2 - I wonder if I am polling too infrequently / had comms issues to keep it alive...

omnus avatar May 31 '23 02:05 omnus

Interesting. It would be nice if on expiration it uses getLogs to backfill the missing time and resubscribe the filter.

ricmoo avatar May 31 '23 02:05 ricmoo

Interesting. It would be nice if on expiration it uses getLogs to backfill the missing time and resubscribe the filter.

That would be awesome! I am handling this scenario 'manually' with catch-up processing wrapped around the provider.on. If provider.on did it all itself!... 😃.

Update on the issue: I am now reasonably sure this was a comms issue on my side, which meant I dropped outside the five minute window to keep the filter 'alive'. That will teach me to go get lunch with it running locally...

omnus avatar May 31 '23 02:05 omnus

@ricmoo i think a lot of folks to that manually, so I definitely see value in baking that into the packaging and being able to configure it to one's use case :+1:

lostadam avatar Jun 13 '23 03:06 lostadam

Yeah, I'd rather just have events work forever until the provider is destroyed. One other note, I don't know if many have noticed in v6 is the ability to pause a provider (provider.pause = true or provider.setPaused(dropWhilePaused)), which can be used to pause the provider when a tab is in the background, or such. You can optionally decide whether to drop any events that happen while paused or have them play out once resumed. ;)

The recovery-after-expiry will exploit this logic. ;)

ricmoo avatar Jun 13 '23 03:06 ricmoo

Ah nice! I hadn't noticed the pause functionality, sounds useful.

My workaround for auto-recovery of an expired filter recently had a test in production (during the OP upgrade to bedrock) and worked, though it's probably not the most elegant solution... 😆.

omnus avatar Jun 13 '23 06:06 omnus

@ricmoo hey so not sure I followed your reply exactly. What's the proposed change and implementation plan? I personally thought the auto-recovery so that clients don't have to do whacky filter management was a wonderful idea - we have some very sticky code meant to handle that exact case, and I imagine it's quite common :sweat_smile:

lostadam avatar Jun 15 '23 12:06 lostadam

@lostarchitect So, for you, there is no change. :)

It was more of a note of how it is implemented internally. Since you can pause (which can disconnect) and then resume events, the logic can be reused for recovery, which is effectively like a forced pause/resume. :)

ricmoo avatar Jun 15 '23 15:06 ricmoo

I also got this error when I'm trying to listen event from my smart contract by this code:

const provider = new ethers.JsonRpcProvider(rpcUrl);
const contract = new Contract(contractAddress, abi, provider);
const filter = contract.filters.Deposited();
contract.on(filter, (evt) => console.log('evt', evt));

ethers v6.4.2 Do you guys have any suggestion to fix it?

hhphuc avatar Jun 16 '23 06:06 hhphuc

@ricmoo understood. So, to be clear, the autorecovery mechanism is not planned to be implemented? some node providers throw out filterIds quite often, and, as aforementioned, we have custom logic there, but would be nice to know if this were gonna be supported by the library itself :+1:

lostadam avatar Jun 19 '23 18:06 lostadam

Auto-recovery is planned and being worked on now.

It just needs to be tested on various backends which have different ways of “hanging up”.

But it is 100% planned and coming soon.

ricmoo avatar Jun 19 '23 18:06 ricmoo

@ricmoo you're the hero ts blockchain developers need but don't deserve

lostadam avatar Jun 19 '23 18:06 lostadam

@ricmoo any update here? :smile:

lostadam avatar Jul 14 '23 11:07 lostadam

I had the same error and it disappeared after I set the polling: true as per below this.Provider = new ethers.JsonRpcProvider(fr, undefined, { staticNetwork: ethers.Network.from(connInfo.ChainId), polling: true });

cdincoglu avatar Aug 26 '23 19:08 cdincoglu

I had the same error and it disappeared after I set the polling: true as per below this.Provider = new ethers.JsonRpcProvider(fr, undefined, { staticNetwork: ethers.Network.from(connInfo.ChainId), polling: true });

But I got a duplicate event problem ):

pangpangstudy avatar Jan 22 '24 07:01 pangpangstudy

@omnus, how did you manage to workaround the issue? I got no luck with provider.on() nor with try/catching around contract.on() and provider.on() My idea was to indeed catch for this error and resubscribe to the event, but I can't seem to catch it. If I can't get it to work that way I will probably implement a keep alive that would reset the filter if no event were received after a while. The other solution would be to manually re-implement the event subscription, consuming the eth_newFilter method and doing the polling myself.

AndreMiras avatar Mar 20 '24 09:03 AndreMiras

@AndreMiras dealing with the exact same thing right now as well. Also tried contract.on.catch(), which doesn't seem to be catching the error either.

kevinday avatar Mar 21 '24 20:03 kevinday

@ricmoo is auto-recovery still being worked on?

kevinday avatar Mar 21 '24 20:03 kevinday

Hi @AndreMiras and @kevinday,

⚠️ Warning ⚠️ this isn't the most elegant solution... 🤣

To handle this I override all logging, then look for filter not found errors and maintain a count of those. When I get over a threshold I exit the process. I am running this using pm2, which catches the process exit and spins me up a new process. That creates a new instance of the listener and I'm up and running again. You could use the same approach, but without the exit, instead attaching a new listener.

This is the code:

// Save the original console.log
const originalConsoleLog = console.log;
// Override console.log
console.log = function(...args) {
  // Call the original console.log with the arguments
  originalConsoleLog.apply(console, args);
  const logMessage = args.join(' ');
  if (logMessage.includes('filter not found')) {
    filterNotFoundErrorCount += 1;
    console.error("Filter not found: " + filterNotFoundErrorCount);
    if (filterNotFoundErrorCount > maxFilterErrorCount) {
      console.error("Filter not found count exceeded. Restarting....");
      process.exit(1); // This is fatal, we need to exit and restart
    }
  }
};

Like I said, I'm not sure this is the right answer (pretty confident it isn't lol), but I have been running with this for months and it's worked perfectly for my use case.

omnus avatar Mar 21 '24 22:03 omnus

Smart approach and indeed we could adapt to our flow and restart the filter rather than exiting. I didn't think about monkey patching, thanks for sharing!

On my side I took the keep alive approach. It basically checks if an event was received within keepAliveCheckTimeSeconds and if not it reset the listener.

// used for the keep alive, to make sure our event filter is still working
let lastEventTime = Date.now();
const keepAliveCheckTimeSeconds = 60;
const eventReceived = () => {
  lastEventTime = Date.now();
};
const initializeEventWatcher = async () => {
  await contract.on(eventName, eventHandler);
  console.log(`Listening ${eventName} events on ${poolAddress}...`);
};
// Recursive function to check the last event time and possibly reinitialize the watcher
const keepAliveCheck = async () => {
  if (Date.now() - lastEventTime > keepAliveCheckTimeSeconds * 1000) {
    console.error(
      `No events received for ${keepAliveCheckTimeSeconds} seconds, reinitializing the event watcher...`,
    );
    await contract.off(eventName);
    await initializeEventWatcher();
  }
  setTimeout(keepAliveCheck, keepAliveCheckTimeSeconds * 1000);
};
await initializeEventWatcher();
await keepAliveCheck();

AndreMiras avatar Mar 23 '24 16:03 AndreMiras

OK I found a better workaround using the debug listener.

const checkFilterNotFoundPattern = (obj: any) =>
  obj.action === "receiveRpcResult" &&
  obj.result.some(({ error }: { error: any }) =>
    /filter .* not found/.test(error?.message),
  );

const initializeEventWatcher = async () => {
  await contract.on(eventName, eventHandler);
  console.log(`Listening ${eventName} events on ${poolAddress}...`);
};

const resetEventWatcherOnError = async (obj: any) => {
  if (checkFilterNotFoundPattern(obj)) {
    await contract.off(eventName);
    await initializeEventWatcher();
  }
};

// resets the filter if it expires, refs: https://github.com/ethers-io/ethers.js/issues/4104
provider.on("debug", resetEventWatcherOnError);

I've also made a pull request so the provider.on("error") would work, see https://github.com/ethers-io/ethers.js/pull/4668

AndreMiras avatar Mar 28 '24 16:03 AndreMiras