directus
directus copied to clipboard
Performance issues with revisions and activity
Checklist
- [X] I'm using the latest version of Directus
- [X] There's no other issue that already describes the problem.
- [X] I've completed all Troubleshooting Steps.
Describe the Bug
There is a problem with the reveisions and activity queries.
When I navigate in my directus instance, I regularly get a popup that says that a request failed with status 504. When I look in my devtools, it's the requests fetching the revisions for a particular item.
I have to purge the activity and revisions often to avoid that, even though I don't have a particularly big database (there are ~ 1 million in activity and 500 000 in revisions).
My guess is that there's a missing index in the revisions table, because the query is almost always fetching revisions for a particular collection and item : there should be an index on the item column.
To Reproduce
There's not a reliable way I think beause it depends on the volume of revisions and the performance of the DB. However it should be possible to measure a performance boost with an added index.
Hosting Strategy
Self-Hosted (Docker Image)
what database are you using ?
what database are you using ?
I'm not OP but I am having the same issue with Postgres 14 running on AWS RDS.
Same issue here. Revisions take a really long time to load. Have about 700,000 records in the directus_revisions table.
I use two fields for caching stock values from our slow ERP, which are updated every 6 hours. Would be nice to be able to disable revision logging for certain fields.
I added an index on the item column of the directus_revisions table and it greatly improved perfomance. It would be great if it was added by default or possible to enable as a config flag.
On top of the index suggestion we should also have a configuration to limit the amount of revisions per item, so they are rotative. For example, only save last 0, 5, 10, 20, 50 revisions per item. Probably can be configured per collection.
This is because Directus can crash because it does not have sufficient memory (OOMKilled - Out of Memory). It happens
when the machine has limited memory and there's some WYSIWYG, JSON or any other column that can have a great amount of data. Because directus_revisions will store every column of the record in data and will store the huge data column in delta, it can crash sooner than later.
Another thing we can do is truncate data and delta and only retrieve those columns in full when we want to restore a specific version or when we retrieve that single revision 🤔
Partial fix numero uno here is to ensure there's some configuration to control the max retention on those tables. Part of the problem is that it effectively grows infinitely big without limit. #18105 implements two new env vars to control the max retention in time for activity and revisions separately.
Same issue here, ~1,1M records.
For those struggling with issues similar to this (but not quite exactly the one OP opened)
We wanted to clean up the revision/activity table and the estimated time was ~3 hours, ran immediately after the change.
Add an index to column directus_revisions.parent and if you want to clean the table up, also set the ON DELETE action to SET NULL in the same directus_revisions.parent column. To speed up deletions on directus_activity, add an index to directus_revisions.activity too.
These indexes might help with selections too, but I haven't had any issues in this particular case.
For reference (use at your own risk):
CREATE INDEX custom_directus_revisions_parent_idx ON public.directus_revisions USING btree (parent);
ALTER TABLE public.directus_revisions DROP CONSTRAINT directus_revisions_parent_foreign;
ALTER TABLE public.directus_revisions ADD CONSTRAINT directus_revisions_parent_foreign FOREIGN KEY (parent) REFERENCES public.directus_revisions(id) ON DELETE SET NULL;
CREATE INDEX custom_directus_revisions_activity_idx ON public.directus_revisions (activity);
Just to confirm, we had the same issue: (i) random 504s (which I could see in the request monitor were very slow or timed-out requests for revisions) and (ii) the list of revisions in the single item sidebar taking ages to load. I can confirm @BenoitAverty and @bevanmw's suggestion to add an index to the item column of the directus_revisions table fixed this for us. Revisions requests are resolved instantly now.
Also want to mention this became more trickier with the addition of Content Versioning.
Now, we cannot simply just remove the records from Activity and Revisions.
We must check if field version is populated in directus_revisions and not remove it if that's the case 😕
When we try to delete a collection, it's freezing the whole project. That's because all the tables under the transaction (see next preview) are locked (due to the transaction) and transaction need to wait until all the records are removed from those (and the other) tables. https://github.com/directus/directus/blob/19598ebb28fb3e9ba0bfd62f8d5002a1e7161e13/api/src/services/collections.ts#L587-L680
Wanted to open a new issue, when I saw this one is in progress. So just adding that any use of LOWER such as in this query will not be able to use indexes. And so I would advise dropping the use of LOWER on possibly all the fields except the user_agent and comment field.
If possible, standardize the case of the values in these columns when data is inserted or updated. This way, you can avoid using LOWER() during querying.
SELECT
`directus_activity` . `action`,
`directus_activity` . `collection`,
`directus_activity` . `timestamp`,
`directus_activity` . `id`,
`directus_activity` . `user`
FROM
`directus_activity`
WHERE
( `LOWER` ( `directus_activity` . `action` ) LIKE ?
OR `LOWER` ( `directus_activity` . `user` ) LIKE ?
OR `LOWER` ( `directus_activity` . `ip` ) LIKE ?
OR `LOWER` ( `directus_activity` . `user_agent` ) LIKE ?
OR `LOWER` ( `directus_activity` . `collection` ) LIKE ?
OR `LOWER` ( `directus_activity` . `item` ) LIKE ?
OR `LOWER` ( `directus_activity` . `comment` ) LIKE ?
OR `LOWER` ( `directus_activity` . `origin` ) LIKE ? )
AND ( ( ( `directus_activity` . `user` IS NOT NULL )
OR `directus_activity` . `user` = ? ) )
ORDER BY
`directus_activity` . `timestamp` DESC
LIMIT
?```
Just a small addition, I think the correct index here would be something like
CREATE INDEX directus_revisions_collection_item ON directus_revisions (collection, item);
because revisions are almost always fetched with collection&item in the WHERE clause,
Closing this in favor of a new issue that outlines the action item to resolve this: https://github.com/directus/directus/issues/23166