Enable contract listeners with multiple filters
This PR adds the ability to listen for multiple types of events on the same contract listener, by adding an array of filters, rather than a single event signature/location per listener. The old way of creating a listener is still accepted by the API, but it is now always returned in the filters array. This is a migration concern that needs to be documented.
Open questions
One thing I'm not sure about here is that this PR as-is removes the uniqueness constraint on listeners by topic/location/signature. It now allows duplicates. I'm not sure if this is a problem or not. I can add that constraint back, but it would likely require some more sophisticated DB changes. Which brings me to the next point...
Right now all the filters for a contract listener get serialized to JSON and stored in a single column. I later realized this means we lose the ability to query/filter (no pun intended) by signature, location, etc., which we used to do in order to check for duplicates. I'm not sure if this is required or not, but wanted to call it out.
Example
Create contract listener request
{
"filters": [
{
"interface": {
"id": "aaa0e410-2b5b-4815-a80a-a18f2ae59f7d"
},
"eventPath": "BatchPin",
"location": {
"address": "0xb0cd60ade460e797e0c9d206290ac4ed45672c60"
}
}
],
"name": "CustomBatchPin",
"options": {
"firstEvent": "oldest"
},
"topic": "batch-pin"
}
Create contract listener response
{
"id": "acc0d227-1da4-4d0d-bbe0-0c60f754158f",
"namespace": "default",
"name": "CustomBatchPin",
"backendId": "018b258a-0c2c-07c0-5d59-50583ae91f1e",
"created": "2023-10-12T20:18:06.012167Z",
"filters": [
{
"event": {
"name": "BatchPin",
"description": "",
"params": [
{
"name": "author",
"schema": {
"type": "string",
"details": {
"type": "address",
"internalType": "address"
},
"description": "A hex encoded set of bytes, with an optional '0x' prefix"
}
},
{
"name": "timestamp",
"schema": {
"oneOf": [
{
"type": "string"
},
{
"type": "integer"
}
],
"details": {
"type": "uint256",
"internalType": "uint256"
},
"description": "An integer. You are recommended to use a JSON string. A JSON number can be used for values up to the safe maximum."
}
},
{
"name": "action",
"schema": {
"type": "string",
"details": {
"type": "string",
"internalType": "string"
}
}
},
{
"name": "uuids",
"schema": {
"type": "string",
"details": {
"type": "bytes32",
"internalType": "bytes32"
},
"description": "A hex encoded set of bytes, with an optional '0x' prefix"
}
},
{
"name": "batchHash",
"schema": {
"type": "string",
"details": {
"type": "bytes32",
"internalType": "bytes32"
},
"description": "A hex encoded set of bytes, with an optional '0x' prefix"
}
},
{
"name": "payloadRef",
"schema": {
"type": "string",
"details": {
"type": "string",
"internalType": "string"
}
}
},
{
"name": "contexts",
"schema": {
"type": "array",
"details": {
"type": "bytes32[]",
"internalType": "bytes32[]"
},
"items": {
"type": "string",
"description": "A hex encoded set of bytes, with an optional '0x' prefix"
}
}
}
]
},
"location": {
"address": "0xb0cd60ade460e797e0c9d206290ac4ed45672c60"
},
"interface": {
"id": "aaa0e410-2b5b-4815-a80a-a18f2ae59f7d"
},
"signature": "BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[])"
}
],
"topic": "batch-pin",
"options": {
"firstEvent": "oldest"
}
}
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 100.00%. Comparing base (c00ecf8) to head (fc18b63).
@@            Coverage Diff            @@
##              main     #1418     +/-  ##
==========================================
  Coverage   100.00%   100.00%
==========================================
  Files          324       325      +1
  Lines        23706     24019    +313
==========================================
+ Hits         23706     24019    +313
> which we used to do, in order to check for duplicates. I'm not sure if this is required or not, but wanted to call it out.
@nguyer (fyi @EnriqueL8) - there is a really good reason why we need this (at least prior to this PR). That's because calculating the signature of a listener is not an easy thing to do - and FireFly doesn't expose any way for code outside of FireFly to do it.
So if you think about the common startup reconcile in most applications, where the app does a query-then-create on the listeners it needs, you need to be able to match the things you get back to the things you need.
That means you need to be able to create the signature for the things you need (not just the things you create).
Currently, we avoid this by rejecting any attempt to recreate a listener with the same signature with a 409. In fact that's important enough that we found and fixed a bug in it recently in #1433.
Now I can imagine this PR might provide a proposal for a fundamentally new way to do this, by pushing the complex reconcile logic inside the listener. But we do need a clear migration path, and to know what the impact would be on apps relying on the old behavior - because without careful handling, they would create a duplicate listener on every startup.
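To make the reconcile pattern concrete, here is a minimal Go sketch of the create-and-rely-on-409 approach against FireFly's contract listeners API (payload construction and error handling are illustrative):

package example

import (
	"bytes"
	"fmt"
	"net/http"
)

// ensureListener creates a contract listener if it doesn't already exist,
// relying on FireFly returning a 409 when a listener with the same
// topic/location/signature is already present.
func ensureListener(baseURL string, payload []byte) error {
	resp, err := http.Post(
		baseURL+"/api/v1/namespaces/default/contracts/listeners",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK, http.StatusCreated:
		return nil // newly created
	case http.StatusConflict:
		return nil // already exists - the 409 is what makes this pattern safe
	default:
		return fmt.Errorf("unexpected status %d creating listener", resp.StatusCode)
	}
}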
@peterbroadhurst I'm trying to pick this back up but really not sure what direction to take it. I think part of my struggle is that I myself haven't run into the use case that is driving the need for this functionality, so I'm sort of designing a solution to an unknown problem.
I could add an updated version of the constraint we used to have (maybe with some heavy DB schema changes to support it), but what would the constraint be? Topic/Location/[SOMETHING]. Is that last element of the constraint the combined signature of every filter in the listener? Do we need to prevent someone from creating the exact same Topic/Location/[combination of signatures], or do we want to prevent the same Topic/Location/[single event signature] from being used in multiple listeners, even if the set of filters in each of those listeners is different? Does that question make sense?
I feel like this PR probably needs another round of design discussion before it's ready for more coding.
The new approach is that we have introduced a new API at /contract/listeners/hash that accepts the same payload as a POST. I did think we could accept a subset of the payload (just the filters and event parts) - could clean that up, but kept the full payload for consistency. This API will return either the filterHash for the filters, or (for a single event) the deprecated signature along with the filterHash. The filterHash is also returned from the POST when creating a new listener, and is queryable on the GET of listeners.
The PR needs some conflict cleaning and test coverage, but I've tested it in a FF CLI setup and it's looking good!
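A hedged sketch of how a client might call that proposed hash API - the exact path and the response field names (filterHash, signature) are assumptions based on the description above, not the final implementation:

package example

import (
	"bytes"
	"encoding/json"
	"net/http"
)

// computeListenerHash POSTs the same payload used to create a listener to the
// proposed hash endpoint, and decodes the computed hash from the response.
func computeListenerHash(baseURL string, payload []byte) (string, error) {
	resp, err := http.Post(
		baseURL+"/api/v1/namespaces/default/contracts/listeners/hash",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	var out struct {
		FilterHash string `json:"filterHash"`
		Signature  string `json:"signature"` // deprecated single-event signature, when applicable
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.FilterHash, nil
}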
For the second case, we need to include the location; if not, it will not be unique and will cause problems.
@peterbroadhurst Thanks for the comments - will amend the PR with changes.
I think the challenge here is the location:
- The current uniqueness constraint is location + topic + signature, which makes sense - I think we need to keep the same thing at the top level, so what I propose is:
  - For len(filters) == 1, we stay with location + topic + signature.
  - For len(filters) > 1, we embed all the possible different locations into the hash that becomes the signature, and still keep the signature for each event so it's queryable in the filters[0] JSON. The signature then becomes hash(filters[*].signature + filters[*].location) - a sketch follows below.
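To illustrate, a minimal Go sketch of that proposed hash(filters[*].signature + filters[*].location) - assuming SHA-256 and a sorted canonical ordering, neither of which is confirmed as the actual implementation:

package example

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
)

// Filter captures the two values the proposal feeds into the hash.
type Filter struct {
	Signature string // e.g. "BatchPin(address,uint256,string,bytes32,bytes32,string,bytes32[])"
	Location  string // e.g. the contract address
}

// filterHash hashes every filter's signature + location, sorted into a
// canonical order so equivalent filter sets always produce the same hash.
func filterHash(filters []Filter) string {
	parts := make([]string, 0, len(filters))
	for _, f := range filters {
		parts = append(parts, f.Signature+f.Location)
	}
	sort.Strings(parts)
	h := sha256.New()
	for _, p := range parts {
		h.Write([]byte(p))
	}
	return hex.EncodeToString(h.Sum(nil))
}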
@peterbroadhurst From the review comments and the changes I made, I came up with three other concerns:
- Filtering over certain fields at the API level will be worse, as we are nesting a lot of the fields (location, eventPath, etc.) under a JSONAny that is stored as TEXT in one column and doesn't have native filtering. Wondering how bad that would be?
- I've made an opinionated assumption that both Tezos and Fabric only support a single filter for their event streams. I'm pretty certain that Fabric does restrict you to one; @denisandreenko I wonder if you might know if tezosconnect supports multiple filters for subscriptions?
- Multiple filters that clash with each other... so we probably need to check for that
@EnriqueL8 tezosconnect doesn't support event streaming yet, so don't worry about compatibility with this connector
@denisandreenko thanks for the input - so a follow-up PR might be needed to give the user a better error on event streaming, because it seems the code does go ahead and try to create a subscription in the Tezos connector.
Looks like Docker has been updated in the GitHub Action, and the packaged docker-compose version no longer supports ContainerConfig:
https://github.com/docker/compose/issues/11742#issuecomment-2137583574
The GH action is using V1 for some reason (/usr/local/bin/docker-compose up -d) when it should be using V2. The FireFly CLI checks if V1 is available and uses it, otherwise falls back to V2.
This should do the trick https://github.com/hyperledger/firefly-cli/pull/307 - I'll have to update the action to pull in a new version
> Multiple filters that clash with each other... so probably need to check that
This comment you made is highlighted, but there's no commentary on where you got to with it, @EnriqueL8. What was your analysis/proposal?
Before I re-review here @EnriqueL8 - can you provide a new summary (either editing the header, or providing it below) of the external changes.
I referenced one detail point here: https://github.com/hyperledger/firefly/pull/1418/files#r1643404103
... but then I stopped, as I wasn't sure what to base my code review against as a statement of intent.
There seem to be at least two behaviors we could pick between, and I'm not sure which is planned.
Common requirements to option 1/2
I think these are locked in as requirements - but tell me if you disagree:
- Exactly the same POST inputs for listeners that were accepted previously (single event/location) are accepted after
- Optionally can specify filters as an array
- Cannot specify both filters and event/location - an error is thrown
Option 1 Migration impact, but efficient on storage
- All DB storage is now done with a filters JSON entry containing an array
- DB migrations move all event/location entries over to filters (but I didn't find this in the code, so got confused)
- Migration impact: Query APIs that used to specify ?event= / ?location= fields no longer work (both were JSON)
- Query APIs must now work against ?filters= (still a JSON body, but now containing an array)
Option 2 No migration impact, less efficient in storage
- All three of filters, event and location are always stored in the DB
- The event and location fields contain a copy of filters[0]
- Query APIs still work against ?event= or ?location=, but with the caveat that only the first event filter is matched
- Query APIs also work against ?filters=
Thanks for the steer @peterbroadhurst!
I had Option 2 in mind, as the migration impact of the filtering no longer working doesn't fit in my mind as a 1.3.1 change - but without the event and location being filters[0]. Long term I prefer Option 1, as it makes it clear that we can deprecate event + location and just use filters.
I need to wrap my head around "Query APIs also work against ?filters=" and what that API would look like before proposing how it works. Is it that you provide a JSON of what you want to see, with some pattern matching?
?filters=[{"location": {
"address": "0xb0cd60ade460e797e0c9d206290ac4ed45672c60"
}}]
What does this give you? All the contract listeners that have a filter at that location?
Summary of external changes
- Exactly the same POST inputs for listeners that were accepted previously (single event/location) are accepted after
- Optionally can specify filters as an array
- Cannot specify both filters and event/location - error is thrown
The API would look like this:
{
// Old version
"event": {}.
"location": {
"address": "0xb0cd60ade460e797e0c9d206290ac4ed45672c60"
}
// or New version
"filters": [
{
"interface": {
"id": "aaa0e410-2b5b-4815-a80a-a18f2ae59f7d"
},
"eventPath": "BatchPin",
"location": {
"address": "0xb0cd60ade460e797e0c9d206290ac4ed45672c60"
}
}
],
"name": "CustomBatchPin",
"options": {
"firstEvent": "oldest"
},
"topic": "batch-pin"
}
Also I have some proposed code changes to check for duplicate filters in this API, so we restrict that. Easily done by checking the combination of signature + location (see sketch below)! Will commit those up.
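A minimal sketch of that duplicate check, as a single pass over the parsed filters (the type and error text are illustrative, not the actual FireFly code):

package example

import "fmt"

// Filter holds the resolved signature and location for one listener filter.
type Filter struct {
	Signature string
	Location  string
}

// checkDuplicateFilters rejects a listener whose filters repeat the same
// signature + location combination.
func checkDuplicateFilters(filters []Filter) error {
	seen := make(map[string]bool, len(filters))
	for _, f := range filters {
		key := f.Signature + "|" + f.Location
		if seen[key] {
			return fmt.Errorf("duplicate filter: %s at %s", f.Signature, f.Location)
		}
		seen[key] = true
	}
	return nil
}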
For ?filters= - I think you're just missing an entry here: https://github.com/hyperledger/firefly/blob/c2c5a9551c2538cfde3d2ecc59b657cf09d3293f/pkg/database/plugin.go#L1051-L1063
Review with @peterbroadhurst
This is the final proposal for the changes:
Roughly Option 2 + other changes!
- Exactly the same POST inputs for listeners that were accepted previously (single event/location) are accepted after
- Optionally can specify filters as an array
- Cannot specify both filters and event/location - an error is thrown
- All three of filters, event and location are always stored in the DB
- The event and location fields contain a copy of filters[0]
- We write a database migration in golang code, at query time, to copy the event and location into filters[0] for existing contract listeners - but the signature will not change to the new format!
- New signature for each filter includes the indexed fields, to have uniqueness when an ABI, for example, has the same event signature and location in front of it!
  - This new signature allows us to check for uniqueness of filters in the array
  - The format is i=0,1,2... where 0,1,2 are the positions in the event ABI of the fields that are indexed! Empty when no indexed fields are present
- We construct a listener signature from the concatenated signatures of each filter, separated by a semicolon (see the sketch after this list)
- Query APIs still work against ?location=, but with the caveat that only the first event filter is matched
- Query APIs also work against ?filters=
- Query APIs are enhanced for the signature, as it will contain the location and can be a regex pattern
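To make that signature scheme concrete, a minimal Go sketch of the format described above (the helper names are hypothetical; the "*:" location prefix and "[i=...]" marker mirror the example response below):

package example

import (
	"fmt"
	"strconv"
	"strings"
)

// filterSignature builds a per-filter signature like
// "*:Changed(address,uint256) [i=0]", where location is "*" for a filter
// with no location, and indexed holds the positions of the indexed params
// in the event ABI (an empty slice yields "[i=]").
func filterSignature(location, eventSig string, indexed []int) string {
	positions := make([]string, len(indexed))
	for n, i := range indexed {
		positions[n] = strconv.Itoa(i)
	}
	return fmt.Sprintf("%s:%s [i=%s]", location, eventSig, strings.Join(positions, ","))
}

// listenerSignature joins the per-filter signatures with semicolons, as in
// "*:AnotherEvent(address,uint256) [i=0];*:Changed(address,uint256) [i=0]".
func listenerSignature(filterSigs []string) string {
	return strings.Join(filterSigs, ";")
}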
How the new API Looks
Request
{
"filters": [
{
"eventPath": "AnotherEvent",
"interface": {
"id": "943bdd06-1233-451c-83a3-d4f5b898db3f"
}
},
{
"eventPath": "Changed",
"interface": {
"id": "943bdd06-1233-451c-83a3-d4f5b898db3f"
}
}
],
"name": "both",
"options": {
"firstEvent": "0"
},
"topic": "both"
}
Response
{
"id": "34d5c0d4-7b8a-49b4-814a-8ae223084bb8",
"interface": {
"id": "943bdd06-1233-451c-83a3-d4f5b898db3f"
},
"namespace": "default",
"name": "both",
"backendId": "01907403-2148-5d62-4bac-301b827f3591",
"created": "2024-07-02T15:14:40.599194Z",
"event": {
"name": "AnotherEvent",
"description": "",
"params": [
{
"name": "from",
"schema": {
"type": "string",
"details": {
"type": "address",
"internalType": "address",
"indexed": true
},
"description": "A hex encoded set of bytes, with an optional '0x' prefix"
}
},
{
"name": "value",
"schema": {
"oneOf": [
{
"type": "string"
},
{
"type": "integer"
}
],
"details": {
"type": "uint256",
"internalType": "uint256"
},
"description": "An integer. You are recommended to use a JSON string. A JSON number can be used for values up to the safe maximum."
}
}
]
},
"signature": "*:AnotherEvent(address,uint256) [i=0];*:Changed(address,uint256) [i=0]",
"topic": "both",
"options": {
"firstEvent": "0"
},
"filters": [
{
"event": {
"name": "AnotherEvent",
"description": "",
"params": [
{
"name": "from",
"schema": {
"type": "string",
"details": {
"type": "address",
"internalType": "address",
"indexed": true
},
"description": "A hex encoded set of bytes, with an optional '0x' prefix"
}
},
{
"name": "value",
"schema": {
"oneOf": [
{
"type": "string"
},
{
"type": "integer"
}
],
"details": {
"type": "uint256",
"internalType": "uint256"
},
"description": "An integer. You are recommended to use a JSON string. A JSON number can be used for values up to the safe maximum."
}
}
]
},
"interface": {
"id": "943bdd06-1233-451c-83a3-d4f5b898db3f"
},
"signature": "*:AnotherEvent(address,uint256) [i=0]"
},
{
"event": {
"name": "Changed",
"description": "",
"params": [
{
"name": "from",
"schema": {
"type": "string",
"details": {
"type": "address",
"internalType": "address",
"indexed": true
},
"description": "A hex encoded set of bytes, with an optional '0x' prefix"
}
},
{
"name": "value",
"schema": {
"oneOf": [
{
"type": "string"
},
{
"type": "integer"
}
],
"details": {
"type": "uint256",
"internalType": "uint256"
},
"description": "An integer. You are recommended to use a JSON string. A JSON number can be used for values up to the safe maximum."
}
}
]
},
"interface": {
"id": "943bdd06-1233-451c-83a3-d4f5b898db3f"
},
"signature": "*:Changed(address,uint256) [i=0]"
}
]
}
Tested this PR E2E with an EVM chain locally, feeling quite good, but I'm seeing problems in the E2E tests when creating a listener. Digging into that...
Found a problem with the way I wrote the just-in-time migration... fixing that
@peterbroadhurst Just working through the changes, I came across apis/{apiName}/listeners/{eventPath}, which uses the signature to query the DB for the listener... We need to decide whether this API still makes sense with multiple filters in one listener - the hierarchy only makes sense for one event - or whether we implement logic to look through the filters and find out if the event exists.
I think it's quite easy to sort it by checking if the whole signature contains the current event signature and location!
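A minimal sketch of that check, assuming the semicolon-separated listener signature format proposed above (the helper name is hypothetical):

package example

import "strings"

// listenerMatchesEvent checks whether a multi-filter listener signature
// (semicolon-separated per-filter signatures) contains the signature of a
// single event - one way to keep apis/{apiName}/listeners/{eventPath} working.
func listenerMatchesEvent(listenerSig, eventSig string) bool {
	for _, part := range strings.Split(listenerSig, ";") {
		if part == eventSig {
			return true
		}
	}
	return false
}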
Great, code coverage is working - fixing.
Weird, I'm getting 100% coverage locally.
The Postgres migration seems to be failing, looking into it... It's a shame the GitHub Actions don't give you the logs of the Docker containers - something to add to my list.
Replicated locally: FF00184: Database migration failed: FF00184: Database migration failed: migration failed: syntax error at or near "MAX" (column 67) in line 4: BEGIN;
Changing it to just VARCHAR, which according to the Postgres documentation is unbounded: https://www.postgresql.org/docs/current/datatype-character.html
> If character varying (or varchar) is used without length specifier, the type accepts strings of any length