ci-tools
`pod-scaler`: Begin to store data in the v2 format, including timestamps
For DPTP-4069, this PR represents step 1 of the plan below:
Upon further research, it has been discovered that memory usage in the consumers scales fairly linearly with the overall amount of data stored in the GCS buckets. Despite each individual datum being pruned to 25 entries, we have still seen a significant increase in the amount of data stored; as of today it totals well over 3GB. This is because we load and store usage data for potentially stale identifiers dating back to the inception of this tool (2021). The approach to fixing this problem involves pruning the stale data, which should result in the consumers using significantly less memory. Unfortunately, there is no way to tell which data is stale and which is newly generated, so we will have to begin storing new data with timestamps included. Eventually, we will prune data whose timestamps are older than a configurable age (beginning at 180 days). The following phases will be taken to migrate from the existing data format to the new format with pruning:
- Begin to store data in the new format in a new bucket, "origin-ci-resource-usage-data-v2", while continuing to store data in the existing format in the existing bucket, "origin-ci-resource-usage-data"
- After some time (~30 days), the consumers will be migrated to use the v2 format
- Upon verifying that the consumers function as designed using the v2 format, the v1 logic and data will be deleted and the code will be simplified to use only the v2 logic
- Add pruning logic to prune data older than the configured age (a rough sketch follows this list)
- If necessary, plans for using a persistent datastore can be made and executed, but I believe that the prior steps will make this unnecessary
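As a rough illustration of the pruning step (in Go, since that is what ci-tools is written in), the check might look something like the sketch below. The `timestampedEntry` type and `pruneOlderThan` helper are hypothetical stand-ins, not the actual pod-scaler implementation:

```go
package podscaler

import "time"

// timestampedEntry is a hypothetical stand-in for a single v2 usage datum;
// the real pod-scaler types are more involved than this.
type timestampedEntry struct {
	Value   float64
	AddedAt time.Time
}

// pruneOlderThan drops entries whose AddedAt is older than maxAge relative to
// now, e.g. maxAge = 180 * 24 * time.Hour for the initial 180-day window.
func pruneOlderThan(entries []timestampedEntry, maxAge time.Duration, now time.Time) []timestampedEntry {
	cutoff := now.Add(-maxAge)
	kept := entries[:0] // reuse the backing array; entries is consumed
	for _, e := range entries {
		if e.AddedAt.After(cutoff) {
			kept = append(kept, e)
		}
	}
	return kept
}
```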
In order to achieve this functionality, I have created v2 producer logic and data types based on v1. This was largely a copy/paste job followed by minor changes to the types and logic. During step 3, the v1 logic and types will be removed and the prior module structure will be restored.
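To make the v2 shape concrete (hypothetical names, not the actual types in this PR), v2 is essentially v1 plus a timestamp on every write, with the producer emitting both forms during the dual-write phase:

```go
package podscaler

import "time"

// v1Record and v2Record are illustrative only; the real producer types carry
// identifiers, metric histograms, and more.
type v1Record struct {
	Identifier string
	Samples    []float64
}

// v2Record mirrors v1 but stamps each write so stale data can later be
// pruned by age.
type v2Record struct {
	Identifier string
	Samples    []float64
	AddedAt    time.Time
}

// toV2 converts a v1 record into its timestamped v2 equivalent. During the
// dual-write phase the producer would keep uploading the v1 form to
// "origin-ci-resource-usage-data" while also uploading the v2 form to
// "origin-ci-resource-usage-data-v2".
func toV2(r v1Record, now time.Time) v2Record {
	return v2Record{
		Identifier: r.Identifier,
		Samples:    r.Samples,
		AddedAt:    now,
	}
}
```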
Note to reviewer(s): the 2nd and 3rd commits will be the most useful to review here.
/hold as I will have to coordinate a small update to the producer deployment with this
/test e2e
@smg247: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/security | 3c150d7ce1931257cb8558a7448cd69aaa13b711 | link | false | /test security |
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bear-redhat, smg247
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [bear-redhat,smg247]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/hold cancel