firefly-ethconnect
firefly-ethconnect copied to clipboard
Missing events when performing catchup with `catchupModePageSize` > `catchupModeBlockGap`
This is problematic for FireFly, which subscribes from Block 0, and uses a PATCH
on startup to ensure the configuration of the event stream is always up to date.
[2022-02-04T02:57:47.779Z] INFO --> PATCH /eventstreams/es-2259ebbd-0f87-419d-4463-29876cde5be6
[2022-02-04T02:57:47.780Z] INFO es-2259ebbd-0f87-419d-4463-29876cde5be6: Update event stream
[2022-02-04T02:57:47.780Z] INFO es-2259ebbd-0f87-419d-4463-29876cde5be6: Notified of an ongoing stream update, existing batch processor
[2022-02-04T02:57:47.780Z] INFO es-2259ebbd-0f87-419d-4463-29876cde5be6: Notified of an ongoing stream update, existing event poller
[2022-02-04T02:57:47.780Z] INFO es-2259ebbd-0f87-419d-4463-29876cde5be6: Notified of an ongoing stream update, not waiting for new events
[2022-02-04T02:57:47.800Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): Uninstalled filter. ok=true (%!s(<nil>))
[2022-02-04T02:57:47.800Z] INFO <-- PATCH /eventstreams/es-2259ebbd-0f87-419d-4463-29876cde5be6 [200]
[2022-02-04T02:57:47.801Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): initial block height for event stream (from block): 0
[2022-02-04T02:57:47.807Z] INFO --> GET /subscriptions
[2022-02-04T02:57:47.807Z] INFO <-- GET /subscriptions [200]
[2022-02-04T02:57:47.819Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): catchup mode. Blocks 0 -> 4999
[2022-02-04T02:57:47.822Z] INFO WS/8d27544e-a7ce-460d-6442-0621118b27dc: Connected
[2022-02-04T02:57:47.891Z] INFO WS/16721966-36b1-45bc-48ff-4816badd5556: Connected
[2022-02-04T02:57:48.850Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): initial block height for event stream (from block): 0
[2022-02-04T02:57:48.872Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): catchup mode. Blocks 5000 -> 9999
[2022-02-04T02:57:49.911Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): initial block height for event stream (from block): 0
Then once it catches up, something weird happens after the first batch:
[2022-02-04T03:05:50.176Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): catchup mode. Blocks 435000 -> 439999
[2022-02-04T03:05:50.226Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): Dispatching event. Address=0x4D2752586bff01E1F598B71FA56e9AC8A11582D8 BlockNumber=436683 TxIndex=0x0
[2022-02-04T03:05:50.226Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): Dispatching event. Address=0x4D2752586bff01E1F598B71FA56e9AC8A11582D8 BlockNumber=436721 TxIndex=0x0
[2022-02-04T03:05:50.226Z] INFO es-2259ebbd-0f87-419d-4463-29876cde5be6: New batch length 1
[2022-02-04T03:05:50.226Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): Dispatching event. Address=0x4D2752586bff01E1F598B71FA56e9AC8A11582D8 BlockNumber=436764 TxIndex=0x0
[2022-02-04T03:05:50.226Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): Dispatching event. Address=0x4D2752586bff01E1F598B71FA56e9AC8A11582D8 BlockNumber=436835 TxIndex=0x0
[2022-02-04T03:05:50.226Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): Dispatching event. Address=0x4D2752586bff01E1F598B71FA56e9AC8A11582D8 BlockNumber=436916 TxIndex=0x0
[2022-02-04T03:05:50.226Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): Dispatching event. Address=0x4D2752586bff01E1F598B71FA56e9AC8A11582D8 BlockNumber=437041 TxIndex=0x0
[2022-02-04T03:05:50.727Z] INFO es-2259ebbd-0f87-419d-4463-29876cde5be6: Batch 4 initiated with 6 events. FirstBlock=436683 LastBlock=437041
[2022-02-04T03:05:50.727Z] WARN Cleared out suprious ack (could be from previous disonnect). err=FFEC100205: WebSocket '8d27544e-a7ce-460d-6442-0621118b27dc' closed
[2022-02-04T03:05:51.154Z] INFO Attempt batch 4 complete. ok=true
[2022-02-04T03:05:51.227Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): initial block height for event stream (from block): 0
[2022-02-04T03:05:51.263Z] INFO sb-e1f61740-2dc7-4e1f-68ad-f0ac55e4e663:BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[]): created filter from block 440000: 0xdde5b88258480118d701b810b2534400 - {Addresses:[0x4D2752586bff01E1F598B71FA56e9AC8A11582D8] Topics:[[0x805721bc246bccc732581be0c0aa2dd8f7ec93e97ba4b307be84428c98b0a12f]]}
See the second " initial block height for event stream (from block): 0" and then no further events.
There was an event it should have hit at block 437521
- which is after the 437041
event it dispatched as part of the gap fill, but before the 440000
where it set the filter.
This is affect v0.12.0 of FireFly, which shipped this change: https://github.com/hyperledger/firefly/commit/c07090310dffe024aa5ca193e51533df0988e027
It seems to be the case that as well as resetting, the event stream does not receive events after the initial batch that are found during the rewind.
Ok - found the root cause here.
It's subtle:
- The
catchupModePageSize
was set to be GREATER THAN thecatchupModeBlockGap
- This meant our logic at the point we caught up missed some events
- ~It also seems to have meant we didn't write a checkpoint, which is why we are resetting to
0
- so the original subject of this issue is probably erroneous~
Note I found we did still have a problem with writing incremental HWM checkpoint while doing gap based recovery: https://github.com/hyperledger/firefly-ethconnect/pull/199#issuecomment-1029662339