
ethclient: SubscribeFilterLogs() not respecting FilterQuery.FromBlock

Open mcdee opened this issue 7 years ago • 20 comments

System information

Geth Version: 1.6.7-stable Git Commit: ab5646c532292b51e319f290afccf6a44f874372 Architecture: amd64 Protocol Versions: [63 62] Network Id: 1 Go Version: go1.8.1 Operating System: linux

Expected behaviour

Setting up a simple log filter for a given contract address starting at block 1000000:

client, _ := ethclient.Dial("/home/foo/.ethereum/geth.ipc")

filter := ethereum.FilterQuery{}
filter.Addresses = make([]common.Address, 0)
filter.Addresses = append(filter.Addresses, common.HexToAddress("0x..."))
filter.FromBlock = big.NewInt(1000000)

Using this with filterLogs() provides a set of results:

logs, _ := client.FilterLogs(ctx, filter)
fmt.Println(len(logs)) // Outputs '143'

However using SubscribeFilterLogs() does not provide any immediate results:

ctx := context.Background()
ch := make(chan types.Log)
client.SubscribeFilterLogs(ctx, filter, ch)

for {
    log := <-ch
    fmt.Println("Matching log encountered:", log)
}

It is expected that SubscribeFilterLogs() start from the block as specified in the FilterQuery's FromBlock.

Actual behaviour

SubscribeFilterLogs() appears to start from the first new block received by the client regardless of the block as specified in the FilterQuery's FromBlock.

Steps to reproduce the behaviour

The above code provides the scenario to test (pick your favourite contract address and start block for testing purposes).

mcdee avatar Aug 31 '17 13:08 mcdee

The behaviour is correct.

Reason:

The client.FilterLogs function ultimately invokes eth_getLogs, which traverses the whole database to retrieve all matching logs.

The client.SubscribeFilterLogs function creates a subscription with the given arguments. When a new log arrives, it is returned via the subscription if it matches the filter criteria; otherwise it is discarded. The subscription does not traverse past logs.

rjl493456442 avatar Sep 04 '17 08:09 rjl493456442

This is not at all obvious from the documentation.

It also raises the question: how would one obtain all log entries in the correct order? FilterLogs() followed by SubscribeFilterLogs() could miss log entries in new blocks between the calls.

The only thing that I can think of doing would be to SubscribeFilterLogs(), then FilterLogs(), work through the log entries returned by FilterLogs(), then start working through SubscribeFilterLogs() but that's hardly intuitive and requires handling duplicates. Is there a better way?

mcdee avatar Sep 04 '17 08:09 mcdee

First call FilterLogs, then write the results to the channel used by SubscribeFilterLogs. For example:

	logs, err := c.caller.FilterLogs(ctx, q)
	if err != nil {
		return nil, nil, err
	}
	ch := make(chan types.Log, len(logs))
	// subscribe to logs that will arrive later
	sub, err := c.caller.SubscribeFilterLogs(ctx, q, ch)
	if err != nil {
		return nil, nil, err
	}
	for _, log := range logs {
		ch <- log
	}
	return ch, sub, nil

nkbai avatar Sep 20 '17 07:09 nkbai

@nkbai what happens if a block comes in between the calls to FilterLogs and SubscribeFilterLogs?

mcdee avatar Sep 20 '17 07:09 mcdee

I don't know; you may lose that block. You could listen for pending events with SubscribePendingTransactions.

nkbai avatar Sep 20 '17 08:09 nkbai

You can set the query's ToBlock argument to rpc.PendingBlockNumber. If FilterLogs doesn't take too much time, it will cover all the relevant blocks.

nkbai avatar Sep 20 '17 08:09 nkbai

It doesn't sound like there is a working solution to this at current, which is a shame given that reading logs is a very common requirement.

mcdee avatar Sep 20 '17 08:09 mcdee

This is a valid feature request, and we're aware of the limitation. It would also be nice to support ToBlock in SubscribeFilterLogs for streaming queries.

fjl avatar Sep 21 '17 19:09 fjl

Hi has there been any movement regarding implementing this feature? Support for fromBlock in SubscribeFilterLogs would be especially helpful for clients that rely on logs to update a local DB based on state changes in a contract such that if the client stops and restarts at a later point in time, the client can start receiving logs starting from the last block that it was online.

yondonfu avatar Jul 26 '18 21:07 yondonfu

Hello, any updates on this feature request?

kminevskiy avatar Feb 01 '19 21:02 kminevskiy

I struggled with the same issue; perhaps the documentation could provide a good example of how to handle this.

qustavo avatar Apr 17 '19 07:04 qustavo

Waiting for #20012

adamschmideg avatar Aug 27 '19 08:08 adamschmideg

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Aug 29 '20 08:08 stale[bot]

Here's a library for fetching a continuous view (to work around this issue): https://github.com/pascaldekloe/etherstream

pascaldekloe avatar Jul 18 '22 16:07 pascaldekloe

Did this ever get solved? The above workaround would not solve the issue if you're doing a large query.

JackG-eth avatar Jan 09 '23 16:01 JackG-eth

@s1na and I have re-evaluated this last week to see how much it would take to implement.

fjl avatar Jan 09 '23 17:01 fjl

@s1na Do you have any idea on when this might get implemented? I'm super super keen to use this functionality.

JackG-eth avatar Jan 12 '23 15:01 JackG-eth

@JackG-eth Unfortunately I can't make an estimate. I'll update here when I get started on it.

s1na avatar Jan 12 '23 17:01 s1na

Thank you :)

JackG-eth avatar Jan 16 '23 10:01 JackG-eth

@s1na I could really use this functionality. Could you point me in the direction of where I need to make the changes? I'd be happy to implement it myself (re your evaluation).

  • From my understanding the workflow for FilterLogs is:
  1. Collect the arguments
  2. CallContext on the client, passing eth_getLogs as the method
  3. Create an RPC call with these arguments
  4. Unmarshal the results into a struct and return
  • From my understanding the workflow for SubscribeFilterLogs is:
  1. Collect the arguments
  2. EthSubscribe on the client, passing logs as the method
  3. Create an RPC call with these arguments
  4. Return new events to a subscription
  5. Notifications are sent for current events and not for past events. For use cases that cannot afford to miss any notifications, subscriptions are probably not the best option.
  6. I'm assuming the change is in subscription.go somewhere
  7. I think I've narrowed it down to the forward() function, which receives RPC notifications, but I'm not sure

Any help/advice would be appreciated!

JackG-eth avatar Feb 02 '23 17:02 JackG-eth

For anyone interested, this is how web3.js does the trick: call eth_getLogs and then eth_subscribe.

But they also don't handle the edge case mentioned by @mcdee

chochihim avatar Mar 02 '23 03:03 chochihim

I hit this issue recently as well. It's not at all intuitive that SubscribeFilterLogs takes the same ethereum.FilterQuery type as the FilterLogs call even though FromBlock is ignored. I assume that the BlockHash and ToBlock parameters are ignored as well, though it seems that the Addresses and Topics filter parameters are respected.

I think at minimum the SubscribeFilterLogs method should take a different input type (named e.g. ethereum.FilterSubscription or ethereum.SubscribeFilterQuery) which does not include the ignored fields. If that is not possible for compatibility reasons, then the docs should make this quirk ABUNDANTLY CLEAR!

For now my solution is to have two concurrent processes, one which calls subscribe and the other which slowly polls FilterLogs. Both processes write their results to a queue which is then deduplicated downstream. I think this should avoid dropping logs while providing a quick detection speed for the majority of the logs.

ryanc414 avatar Mar 29 '23 15:03 ryanc414

Another thing I would be curious to understand is what is the behaviour of the SubscribeFilterLogs method during chain reorgs? Will we get the logs from the new canonical chain all at once, or could some logs be missed?

ryanc414 avatar Mar 29 '23 15:03 ryanc414

See the Removed field from https://pkg.go.dev/github.com/ethereum/go-ethereum/core/types#Log @ryanc414.

pascaldekloe avatar Mar 30 '23 13:03 pascaldekloe

See the Removed field from https://pkg.go.dev/github.com/ethereum/go-ethereum/core/types#Log @ryanc414.

OK, great, thanks. So if I only want events that are in the main canonical chain, I should drop events where Removed is true?

ryanc414 avatar Mar 30 '23 16:03 ryanc414

Not really! Depending on your application, you may need to apply some logic to process removed logs.

For example, if you were tracking balances of a token by summing up transfer events, you'd need to subtract the amounts contained in removed logs.

fjl avatar Mar 30 '23 20:03 fjl

Not really! Depending on your application, you may need to apply some logic to process removed logs.

For example, if you were tracking balances of a token by summing up transfer events, you'd need to subtract the amounts contained in removed logs.

Got it thanks. In my case I am just using certain events as a trigger to invalidate cached values. I'm also assuming that we cannot rely on logs always being received strictly in order.

ryanc414 avatar Apr 03 '23 10:04 ryanc414

Posting some thoughts about this. We want to maintain order in the logs, so we should push all historical logs first and then go into live mode. To do this we should "inch closer" to the head:

  1. First do a historical pass up to the current head block number n.
  2. Check if n has been reorged: issue rmLogs and new logs for the reorged section.
  3. By the time that's finished, the head has moved to n2.
  4. Do another historical pass from n to n2.
  5. Check if n2 has been reorged: issue rmLogs and new logs for the reorged section.
  6. Repeat until nN and the current head are the same.
  7. Then toggle live mode.

These first 4 steps need to run in a separate goroutine from EventSystem.eventLoop; otherwise other events would be blocked.

s1na avatar Apr 18 '23 15:04 s1na

Hi, has work started on this feature?

I started running into this problem, but my use case relies that the events are ordered and deduplicated, which can make a workaround quite complex for a standard recovery scenario. I fear handling this via a workaround may cause unforeseen bugs given the problems I mentioned.

Abso1ut3Zer0 avatar Jul 14 '23 19:07 Abso1ut3Zer0

Hi, has work started on this feature?

I started running into this problem, but my use case relies that the events are ordered and deduplicated, which can make a workaround quite complex for a standard recovery scenario. I fear handling this via a workaround may cause unforeseen bugs given the problems I mentioned.

You can follow the progress at https://github.com/ethereum/go-ethereum/pull/27439

s1na avatar Jul 14 '23 19:07 s1na