kinto
kinto copied to clipboard
Event payload attributes on batch requests
This is a follow-up bug of #942
On batch requests, an event is sent for each (parent id / resource / action), and the list of objects is provided in event.impacted_records
. But the content of event.payload
makes very little sense in this context, and can lead to serious confusions:
>>> event.payload
{ 'action': 'update',
'bucket_id': u'staging',
'collection_id': u'gfx',
'record_id': u'fd6a9531-af10-79a0-e645-4249218dbf4f',
'resource_name': 'record',
'timestamp': 1479999631700L,
'uri': u'/buckets/staging/collections/gfx/records/fd6a9531-af10-79a0-e645-4249218dbf4f',
'user_id': 'basicauth:cbd3731f18c97ebe1d31d9846b5f1b95cf8eeeae586e201277263434041e99d1'}
I propose that when there are more than one impacted object then we drop the ${resource_name}_id
and the uri
keys.
Thoughts?
Would this mean that some events will have ${resource_name}_id
and uri
, and others won't, depending on what request triggered them? That sounds bad to me -- I would rather drop them from all events, to have consistent events.
We could remove ${resource_name}_id
and leave uri
with the value of the plural endpoint for example.
From what I've worked with this plugin, I couldn't understand why we need the ${resource_name}_id
fields. Couldn't we just use only uri
all the times?
For instance for attachment you need it to know on which record you want to assign the attachment to.
We can probably have helper that let us get this information from the URI but I like the idea of having them already there in the payload.
If event's are different ones I don't mind having a different payload because a plugin most of the time subscribe to one event at the time, so we won't need different event handling. Also if record_id
is not present it probably means that you don't need it, right?
Well, but are the events different ones? It seems like this proposal is to change update
events to have certain fields if it was a batch request, and to lack those fields if it wasn't a batch request. Wouldn't a plugin want to subscribe to both of those cases, and handle them similarly?
I think there is some confusion here. The problem concerns the events that have several impacted records.
For example, a batch request that «puts» 3 records on the same collection. You'll see that record_id
and uri
in the event payload don't make sense. I suggest that we remove ${resource_name}_id
and leave uri
with the value of the plural endpoint when there is more than one impacted record.
Well, I'm still confused. Let's consider the following cases:
http -j --auth 'user:pass2' PUT localhost:8888/v1/buckets/default/collections/some-collection/records/some-record data:='{"my-field": "my-value"}'
Let's say this generates an event like:
{ 'action': 'update',
'bucket_id': u'default',
'collection_id': u'some-collection',
'record_id': u'some-record',
'resource_name': 'record',
'timestamp': 1479999631700L,
'uri': u'/buckets/default/collections/some-collection/records/some-record,
'user_id': 'basicauth:cbd3731f18c97ebe1d31d9846b5f1b95cf8eeeae586e201277263434041e99d1'}
Now consider:
http -j --auth 'user:pass2' POST localhost:8888/v1/batch requests:='[{"method":"PUT","path":"/buckets/default/collections/my-collection/records/my-record","body":{"data":{"my-field": "my-value2"}}}]'
This request is identical to the first one, so it should generate the same events. Agreed?
You're saying that the second one, if and only if it contains two PUT requests, should be missing record_id
and should have different semantics for uri
. Did I understand that correctly?
My argument is that whoever authored the plugin doesn't care whether the events came from someone making a batch request with two PUTs, or someone making two separate PUT requests. The events should be the same, and have the same semantics, in both cases. So if you want to remove the record_id
from the second case, you should remove it from the first case too. But maybe I'm missing a point somewhere.
In investigating this, I noticed that the timestamp
argument to notify_resource_event
is only present in the payload
, but that doesn't really make any sense (for all the same reasons as the above). Do we need to keep the timestamp
? What do we use it for?
OK, never mind that last comment -- I think I confused myself. For a given (resource_name, parent_id, action)
, the timestamp
should correspond to the parent_id
's collection_timestamp
, so "must" be the same for all impacted_records
.