ActivityStream is swamped by tiles during resource creation

Open azaroth42 opened this issue 1 year ago • 8 comments

When creating a resource, every tile that gets created during the process spawns an entry in the activity stream at /history. With only 19 instances, my stream is already at 1200 entries, making it very expensive to consume.

Instead, there should be a Create entry that subsumes all of the subsequent tile edit Update entries.

azaroth42 • Jul 12 '23 17:07

I propose that resource.save() pass a flag to the tile.save() function to prevent the creation of the edit_log entry for each tile. Tile edits don't go through resource.save(), so later updates would still end up in the history.
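A minimal sketch of that idea, using stand-in classes rather than the real Arches models. The `log` parameter, the `EDIT_LOG` list, and the `"create"` edit type are assumptions made for illustration; only the shape of the call chain is taken from the proposal.

```python
from dataclasses import dataclass, field

# Stand-in for the edit log table (hypothetical, not the Arches model).
EDIT_LOG = []


@dataclass
class Tile:
    tileid: str
    data: dict = field(default_factory=dict)

    def save(self, log=True):
        # `log` is the hypothetical flag. Direct tile edits keep the default,
        # so they are still recorded.
        if log:
            EDIT_LOG.append({"edittype": "tile create", "tileid": self.tileid})


@dataclass
class Resource:
    resourceid: str
    tiles: list = field(default_factory=list)

    def save(self):
        # One entry for the resource itself ...
        EDIT_LOG.append({"edittype": "create", "resourceid": self.resourceid})
        # ... and no per-tile entries for tiles saved as part of it.
        for tile in self.tiles:
            tile.save(log=False)
```

Saving a `Resource` with two tiles leaves a single entry in `EDIT_LOG`, while calling `Tile.save()` directly afterwards still appends its own entry.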

Any downsides to this?

azaroth42 • Jul 18 '23 15:07

Maybe the activity stream could filter out the tile create entries?

apeters • Jul 19 '23 00:07

Hi @apeters!

I tried that initially, but it's hard to tell which tile edits are part of the creation and which are legitimate changes. Even after fixing #9769, you can't just remove all tile changes that immediately follow the resource's Create event, because there could be real edits to that resource with no intervening changes to other resources. And if the post-create change creates a new tile (rather than updating the data in an existing one), the filter can't rely on the distinction between create and update. Perhaps it could key off transactionid instead, with only one entry in the stream per transaction? Calculating the pages in the stream becomes a challenge, though, unless the filtering happens at the Django / Postgres level rather than in the AS code.
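To make the transactionid idea and the paging concern concrete, here is a rough in-memory sketch. It assumes each edit log entry is a dict with a `transactionid` key and that entries arrive in chronological order; the function names are hypothetical, and a real version would more likely do this in the database query.

```python
def collapse_by_transaction(entries):
    """Keep the first entry of each transaction; pass through entries without one."""
    seen = set()
    collapsed = []
    for entry in entries:
        tid = entry.get("transactionid")
        if tid is not None and tid in seen:
            continue  # a later entry from a transaction already represented
        if tid is not None:
            seen.add(tid)
        collapsed.append(entry)
    return collapsed


def page(entries, number, size):
    """Return (items on page `number`, total page count), both computed after collapsing."""
    collapsed = collapse_by_transaction(entries)
    total_pages = -(-len(collapsed) // size)  # ceiling division
    start = (number - 1) * size
    return collapsed[start:start + size], total_pages
```

The page count has to be derived from the collapsed list, not from the raw row count, which is why doing the filtering after fetching a page of rows gives the wrong totals.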

Is there a reason for the tile creates in the log though? Is there any harm in filtering them out? Perhaps with a flag in settings to enable it?

azaroth42 • Jul 19 '23 12:07

Those entries are used for transaction reversals in workflows and bulk loader modules: https://github.com/archesproject/arches/blob/03b0b1d6211004fd353952a07473ed52652f00ad/arches/app/utils/transaction.py#L21

Currently the bulk loader only creates tiles following the creation of a resource, but we are working on allowing tiles to be appended to existing resources. We could add a setting to skip the creation of some edit records, but it would be important to hide any features that use transaction reversal whenever that flag is set to 'true'.

chiatt • Jul 19 '23 18:07

Got it, thanks Cyrus! For our current usage, I don't need bulk loading or backing out transactions, so I don't think we're blocked with our current hacky workaround, but it would be good to discuss the right solution that allows both to co-exist. Per Alexei's response, having a filter that could distinguish internal tile creation within the resource creation would be great...

https://github.com/archesproject/arches/blob/03b0b1d6211004fd353952a07473ed52652f00ad/arches/app/views/resource.py#L669

But I don't know how to do that :)

azaroth42 • Jul 19 '23 19:07

@azaroth42 Just to be clear, it sounds like it's not simply tile create events that are the issue (because those could easily be filtered with edittype != "tile create"). What you want to exclude is tiles created as part of resource creation, correct? Subsequent tile creation is essentially treated as an update to the resource and is a desired entry in the ActivityStream. Is that right?
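For illustration, a small sketch of why the plain edittype filter is too blunt. The entries are made-up dicts, and the `"create"` and `"tile edit"` values are assumed here; only `"tile create"` comes from the discussion above.

```python
def stream_entries(edit_log):
    """Drop every entry whose edittype is "tile create"."""
    return [e for e in edit_log if e["edittype"] != "tile create"]


edit_log = [
    {"id": 1, "edittype": "create"},
    {"id": 2, "edittype": "tile create"},  # written during resource creation
    {"id": 3, "edittype": "tile create"},  # tile added later: a real update
    {"id": 4, "edittype": "tile edit"},
]

visible_ids = [e["id"] for e in stream_entries(edit_log)]
```

Entry 3 disappears along with entry 2, even though it is the kind of change the stream should report.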

apeters • Jul 20 '23 17:07

Right!

What if this line passed /something/ that could then be filtered on when constructing the list:

https://github.com/archesproject/arches/blob/master/arches/app/models/resource.py#L228

but the entries were still rolled together for the transaction / bulk edit usage?

"import tile create" or somesuch?

azaroth42 • Jul 20 '23 17:07

The 'note' column is unused. We could document the context of the tile creation there.
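A rough sketch of how that could work, with plain dicts standing in for edit log rows. The marker value `"resource create"` and both function names are hypothetical; the point is only that the stream can filter on the note while transaction reversal still sees every row.

```python
# Hypothetical marker written to the note column when a tile is saved
# as part of creating its resource.
CREATION_NOTE = "resource create"


def activity_stream_entries(edit_log):
    """Hide tile creates that were part of resource creation."""
    return [
        e for e in edit_log
        if not (e["edittype"] == "tile create" and e.get("note") == CREATION_NOTE)
    ]


def transaction_entries(edit_log, transactionid):
    """Transaction reversal still needs every row, whatever the note says."""
    return [e for e in edit_log if e["transactionid"] == transactionid]


edit_log = [
    {"id": 1, "edittype": "create", "transactionid": "a", "note": None},
    {"id": 2, "edittype": "tile create", "transactionid": "a", "note": CREATION_NOTE},
    {"id": 3, "edittype": "tile create", "transactionid": "b", "note": None},
]
```

With this data the stream would show entries 1 and 3, while a reversal of transaction "a" would still find entries 1 and 2.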

chiatt • Jul 20 '23 18:07