pact_broker
pact_broker copied to clipboard
Performance of Show Latest Tags page
When I go to Show Latest Tags broker UI takes a lot of time to load and may crash with out of memory error.
matrix page | hide all tags page | show all tags page |
---|---|---|
500ms | 140ms | 3.4s |
The SQL log shows that it performs
SELECT * FROM "latest_verifications_for_consumer_version_tags"
and this table has 18000 rows on my machine at the moment.
If I get it right it comes from this part
https://github.com/pact-foundation/pact_broker/blob/93a199826e6251a7f2ca77b4702f69c6f8d8ae06/lib/pact_broker/index/service.rb#L172-L187
when we have query param tags=true
.
Pre issue-raising checklist
- [x] Upgraded to the latest Pact Broker OR
- [ ] Checked the CHANGELOG to see if the issue I am about to raise has been fixed
- [ ] Created an executable example that demonstrates the issue using either a:
- Dockerfile
- Git repository with a Travis or Appveyor (or similar) build
Software versions
- pact-broker gem version: n/a
- pact-broker docker version: pactfoundation/pact-broker:2.69.0.0
- OS: e.g. Mac OSX 10.15.4
- pact broker client details: n/a
Expected behaviour
Performance of the tags page http://localhost:9292/?tags=true does not depend on number of pacts in DB.
Actual behaviour
If we have a lot of contracts in the database loading http://localhost:9292/?tags=true takes a lot of time or crashes server with out of memory error.
Steps to reproduce
Fill broker with a lot of version and tags and visit http://localhost:9292/?tags=true.
Relevant log files
pact-broker_1 | 2020-11-27 15:54:05.285692 D [8:puma 002] pact-broker -- (0.000447s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:05.636016 D [8:puma 002] pact-broker -- (0.340052s) SELECT * FROM "latest_verifications_for_consumer_version_tags"
pact-broker_1 | 2020-11-27 15:54:07.176131 D [8:puma 002] pact-broker -- (0.000513s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.176691 D [8:puma 002] pact-broker -- (0.000415s) SELECT * FROM "versions" WHERE ("versions"."id" IN (62))
pact-broker_1 | 2020-11-27 15:54:07.182256 D [8:puma 002] pact-broker -- (0.000374s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.195041 D [8:puma 002] pact-broker -- (0.012675s) SELECT "id" FROM "latest_pact_publications"
pact-broker_1 | 2020-11-27 15:54:07.195538 D [8:puma 002] pact-broker -- (0.000246s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.196160 D [8:puma 002] pact-broker -- (0.000460s) SELECT DISTINCT "consumer_id", "provider_id" FROM "webhooks"
pact-broker_1 | 2020-11-27 15:54:07.197281 D [8:puma 002] pact-broker -- (0.000355s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.347075 D [8:puma 002] pact-broker -- (0.149616s) SELECT count(*) AS "count" FROM (SELECT "id", "consumer_name", "provider_name", "consumer_version_order" FROM "latest_pact_publications" UNION (SELECT "id", "consumer_name", "provider_name", "consumer_version_order" FROM "latest_tagged_pact_publications")) AS "t1" LIMIT 1
pact-broker_1 | 2020-11-27 15:54:07.347969 D [8:puma 002] pact-broker -- (0.000295s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.526938 D [8:puma 002] pact-broker -- (0.178866s) SELECT "pact_publications".* FROM "pact_publications" WHERE ("id" IN (SELECT "id" FROM (SELECT "id", "consumer_name", "provider_name", "consumer_version_order" FROM "latest_pact_publications" UNION (SELECT "id", "consumer_name", "provider_name", "consumer_version_order" FROM "latest_tagged_pact_publications")) AS "t1" ORDER BY lower("consumer_name") ASC, "consumer_version_order" DESC, lower("provider_name") ASC LIMIT 30 OFFSET 18240))
pact-broker_1 | 2020-11-27 15:54:07.529363 D [8:puma 002] pact-broker -- (0.000234s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.530428 D [8:puma 002] pact-broker -- (0.000877s) SELECT * FROM "pacticipants" WHERE ("pacticipants"."id" IN (18))
pact-broker_1 | 2020-11-27 15:54:07.531182 D [8:puma 002] pact-broker -- (0.000227s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.531842 D [8:puma 002] pact-broker -- (0.000507s) SELECT * FROM "pacticipants" WHERE ("pacticipants"."id" IN (19))
pact-broker_1 | 2020-11-27 15:54:07.532327 D [8:puma 002] pact-broker -- (0.000148s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.533193 D [8:puma 002] pact-broker -- (0.000757s) SELECT * FROM "pact_versions" WHERE ("pact_versions"."id" IN (41))
pact-broker_1 | 2020-11-27 15:54:07.533907 D [8:puma 002] pact-broker -- (0.000214s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.540767 D [8:puma 002] pact-broker -- (0.006799s) SELECT * FROM "integrations" WHERE (("integrations"."consumer_id", "integrations"."provider_id") IN ((18, 19)))
pact-broker_1 | 2020-11-27 15:54:07.541277 D [8:puma 002] pact-broker -- (0.000155s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.542126 D [8:puma 002] pact-broker -- (0.000773s) SELECT * FROM "latest_verifications_for_consumer_and_provider" WHERE (("latest_verifications_for_consumer_and_provider"."consumer_id", "latest_verifications_for_consumer_and_provider"."provider_id") IN ((18, 19)))
pact-broker_1 | 2020-11-27 15:54:07.542720 D [8:puma 002] pact-broker -- (0.000150s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.543157 D [8:puma 002] pact-broker -- (0.000296s) SELECT * FROM "versions" WHERE ("versions"."id" IN (62))
pact-broker_1 | 2020-11-27 15:54:07.544312 D [8:puma 002] pact-broker -- (0.000150s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.551152 D [8:puma 002] pact-broker -- (0.006763s) SELECT * FROM "latest_triggered_webhooks" WHERE (("latest_triggered_webhooks"."consumer_id", "latest_triggered_webhooks"."provider_id") IN ((18, 19)))
pact-broker_1 | 2020-11-27 15:54:07.551736 D [8:puma 002] pact-broker -- (0.000167s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.552296 D [8:puma 002] pact-broker -- (0.000366s) SELECT * FROM "versions" WHERE ("versions"."id" IN (524, 531, 540, 514, 511, 536, 516, 534, 520, 525, 523, 512, 519, 522, 535, 528, 533, 518, 513, 530, 515, 517, 521, 532, 527, 539, 526, 529, 537, 538))
pact-broker_1 | 2020-11-27 15:54:07.558233 D [8:puma 002] pact-broker -- (0.000343s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.559132 D [8:puma 002] pact-broker -- (0.000785s) SELECT * FROM "latest_verifications_for_pact_versions" WHERE ("latest_verifications_for_pact_versions"."pact_version_id" IN (41))
pact-broker_1 | 2020-11-27 15:54:07.559713 D [8:puma 002] pact-broker -- (0.000232s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.560273 D [8:puma 002] pact-broker -- (0.000433s) SELECT * FROM "versions" WHERE ("versions"."id" IN (62))
pact-broker_1 | 2020-11-27 15:54:07.561081 D [8:puma 002] pact-broker -- (0.000288s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.576066 D [8:puma 002] pact-broker -- (0.014893s) SELECT * FROM "tags_with_latest_flag" WHERE ("tags_with_latest_flag"."version_id" IN (62))
pact-broker_1 | 2020-11-27 15:54:07.576886 D [8:puma 002] pact-broker -- (0.000347s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.596826 D [8:puma 002] pact-broker -- (0.019782s) SELECT * FROM "head_pact_tags" WHERE ("head_pact_tags"."pact_publication_id" IN (515, 522, 531, 505, 502, 527, 507, 525, 511, 516, 514, 503, 510, 513, 526, 519, 524, 509, 504, 521, 506, 508, 512, 523, 518, 530, 517, 520, 528, 529))
pact-broker_1 | 2020-11-27 15:54:07.644012 D [8:puma 002] pact-broker -- (0.000454s) SELECT NULL
pact-broker_1 | 2020-11-27 15:54:07.658064 D [8:puma 002] pact-broker -- (0.013951s) SELECT * FROM "tags_with_latest_flag" WHERE ("tags_with_latest_flag"."version_id" = 62)
Hi @ertrzyiks On what hardware are you running the broker and the database?
Hi @anto-ac , we run broker on machine with 2gb of ram and the container is killed with OOMKilled reason when we visit ?tags=true page. Database is postgres.
@ertrzyiks Does the database run as a container on the same machine as the broker or is it a separate instance?
Ok, I've been working on a clean up feature that will run at regular intervals to remove unnecessary data. I was hoping to get it out this weekend, but I've been testing it on Anto's data and it can't cope! I think I'm going to change my approach to something more incremental. There is a "blunt force" manual clean up script here that will help you in the mean time.
When I do clean-up using the blunt force script I have to do the last delete (delete versions) in batches, otherwise the query basically never ends. 😅
@anto-ac database is a separate instance from cloudsql.
@bethesque I was curious if we ever need old data, like tags related to branches that were already merged. We never remove such. As you mention a clean up script you also think the broker should only store the most recent data, right?
You can always get broker data back by re-running builds. I think this solution will work for you though, because it lets you keep the "head" versions (the latest version for each pacticipant/tag) regardless of age, and it's rare than you ever need a version from a feature branch that is not the latest one for its tag.
I'd love you guys to beta test this out for me with your own data set in the postgres database: https://github.com/pact-foundation/pact-broker-docker/blob/master/docker-compose-dev.yml
These are the relevant vars. You can play around with them to see what works for you.
PACT_BROKER_DATABASE_BETA_CLEAN_ENABLED: "true"
PACT_BROKER_DATABASE_CLEAN_CRON_SCHEDULE: "* * * * *"
PACT_BROKER_DATABASE_CLEAN_DRY_RUN: "false"
PACT_BROKER_DATABASE_CLEAN_VERSION_DELETION_LIMIT: "500"
# Keep all prod versions, AND the latest version for every pacticipant/tag, and all versions less than 30 days old
PACT_BROKER_DATABASE_CLEAN_KEEP_VERSION_SELECTORS: '[{"tag":"production"}, {"latest": true, "tag": true}, {"max_age": 30}]'
When ${PACT_BROKER_DATABASE_BETA_CLEAN_ENABLED}
is true
, a cron job will run on the schedule that you configure, that takes the oldest $PACT_BROKER_DATABASE_CLEAN_VERSION_DELETION_LIMIT
versions (excluding the ones that you want to keep, configured in $PACT_BROKER_DATABASE_CLEAN_KEEP_VERSION_SELECTORS
), and deletes all data associated with them. I've found that using Anto's redacted dataset on my Macbook pro, deleting 500 pacticipant versions worth of data took about 35 seconds. The recommendation for really big datasets would be to set a schedule like "delete 500 versions every 5 minutes" overnight, and by morning the database should only contain the stuff you want. Then you could dial it back to running once a night.
The dry run feature allows you to see what would be deleted if you had it enabled. Obviously, it'll print the same thing over and over again until you actually enable it, but I hope it helps give you confidence to turn the feature on for reals. At the moment it just lists what is being deleted, but I want to add a summary of what is being kept as well, so you can be doubly confident that, for example, all your prod versions aren't going to disappear.
Let me know how you go, and whether or not you think this will solve the data problem for you both, and if you have a wish list for any additions. On the todo list is to also be able to get rid of a bunch of the triggered webhook data, as you generally only need the last few executions.
Oh, the things you can put into a selector are:
-
pacticipant
: "name string" -
latest
: true -
tag
: "name string" or true -
max_age
: number of days
They can be used in any combination, except latest
and max_age
are mutually exclusive.
Examples:
- "all production versions"
{ "tag" : " production" }
- "latest version for each pacticipant"
{ "latest": true }
- "latest version for each pacticipant/tag"
{ "latest": true, "tag": true }
- "all versions less than 30 days old"
{"max_age": 30}
- "all versions for Foo app"
{"pacticipant": "Foo"}
- "all main versions for Foo app"
{"pacticipant": "Foo", "tag": "main"}
- "all main versions less than 30 days old"
{"max_age": 30, "tag": "main"}
The selectors combine by AND
, so [{"max_age": 30}, { "latest": true, tag: true }, { "tag": "production" }]
would be "keep everything that's less than 30 days old AND keep the latest version for every pacticipant/tag AND keep all the production versions".
Thanks a lot @bethesque , that's very helpful 👍
First of all, @bethesque, thank you for this clean up procedure. It was something very much needed 🙇.
The selectors combine by AND, so [{"max_age": 30}, { "latest": true, tag: true }, { "tag": "production" }] would be "keep everything that's less than 30 days old AND keep the latest version for every pacticipant/tag AND keep all the production versions".
I'm a bit hesitant of the actual meaning of "AND" in that statement 😅. Is it a logical AND or is it a logical OR instead? If it was a logical AND, it would mean that the only things kept would be "the latest production versions less than 30 days old" (all the selectors should be true, logical and behaviour). A logical OR, on the other hand, would mean that a tag is kept if it's 30 days old or the latest tag or a production tag (any of those selectors is true, logical or behaviour). What is the truth?
You make an excellent point! It's a logical OR.
Double check the env vars in the source if you're trying it out, as I think one of them changed since I gave the initial example.
I was in the middle of writing the docs during my last OSS day, but didn't get them done in time to finish. The whole thing gets complicated if there is more than one broker application instance running at a time. I'm going to have to make an endpoint that can be hit on a regular basis to trigger the clean job, otherwise, all the instances will start cleaning at once.
Thanks for pointing that out @bethesque! It seems the docs references to the wrong variable:PACT_BROKER_DATABASE_CLEAN_ENABLED
, whilst in the Dockerfile still looks for PACT_BROKER_DATABASE_BETA_ CLEAN_ENABLED.
I'll get those docs updated next OSS day @victorlopezsalazar. It got complicated trying to work out what to do in an environment where there are multiple nodes.
Leaving this here for my future reference https://www.endpoint.com/blog/2020/06/30/postgresql-improve-group-by-max-performance