cloudstack
cloudstack copied to clipboard
Display dates in Quota and Usage messages according to the timezone configurations
Description
When using the Usage and Quota plugins, operators must deal with 4 different timezone configurations:
- A: the global setting
usage.execution.timezone
; - B: the global setting
usage.aggregation.timezone
; - C: the timezone of the operating system;
- D: the timezone of the database (GMT).
These configurations are used by the Usage and Quota plugins in the following way:
-
A
defines the timezone of the job's execution; -
B
defines the timezone of validations performed during the execution of the job; - Dates are written in logs and messages according to
C
; - Data is stored in the database according to
D
.
Handling these multiple timezones can lead to confusion when examining logs and data. To simplify this complexity, this PR adjusts the logs and exception messages of the Usage and Quota plugins. Now, dates are displayed according to the appropriate configuration for each situation: usage.aggregation.timezone
for messages related to data processing and APIs, and usage.execution.timezone
for logs related to job scheduling. This modification reduces the number of timezone configurations operators need to manage to 3.
Types of changes
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
- [ ] Major
- [X] Minor
Screenshots (if appropriate):
How Has This Been Tested?
In a server in GMT+8, I set usage.execution.timezone
to GMT+8 and usage.aggregation.timezone
to GMT+11. Then, I verified that the logs and the APIs displayed dates according to the timezone configurations:
024-01-24 02:45:49,079 DEBUG [o.a.c.q.QuotaManagerImpl] (qtp391630194-19:ctx-f8be50a5 ctx-65a023c1) (logid:4347e9c1) Validating usage record [{"id":30567,"usageId":38,"usageType":2,"startDate":"2024-01-24T04:00:00+1100","endDate":"2024-01-24T04:59:59+1100"}] for {"accountName":"admin","domainId":1,"id":2,"uuid":"b88e1a97-771e-11ee-8e59-5254003754dc"} against [1] quota tariffs.
2024-01-24 02:45:49,081 DEBUG [o.a.c.q.QuotaManagerImpl] (qtp391630194-19:ctx-f8be50a5 ctx-65a023c1) (logid:4347e9c1) Quota tariff [{"name":"ALLOCATED_VM","usageName":"ALLOCATED_VM","uuid":"9f3f1434-771e-11ee-8e59-5254003754dc","startDate":"2010-05-04T11:00:00+1100","endDate":"null"}] does not have an activation rule, therefore we will use the quota tariff value [0.00000] in the calculation.
2024-01-24 02:45:49,081 DEBUG [o.a.c.q.QuotaManagerImpl] (qtp391630194-19:ctx-f8be50a5 ctx-65a023c1) (logid:4347e9c1) The aggregation of the quota tariffs resulted in the value [0.00000] for the usage record [{"id":30567,"usageId":38,"usageType":2,"startDate":"2024-01-24T04:00:00+1100","endDate":"2024-01-24T04:59:59+1100"}]. We will use this value to calculate the final usage value.
2024-01-24 02:45:49,082 DEBUG [o.a.c.q.QuotaManagerImpl] (qtp391630194-19:ctx-f8be50a5 ctx-65a023c1) (logid:4347e9c1) Calculating value of usage record [{"id":30567,"usageId":38,"usageType":2,"startDate":"2024-01-24T04:00:00+1100","endDate":"2024-01-24T04:59:59+1100"}] for account [{"accountName":"admin","domainId":1,"id":2,"uuid":"b88e1a97-771e-11ee-8e59-5254003754dc"}] according to the aggregated quota tariffs value [0.00000] and its usage unit [Compute*Month].
2024-01-24 02:45:49,083 DEBUG [o.a.c.q.QuotaManagerImpl] (qtp391630194-19:ctx-f8be50a5 ctx-65a023c1) (logid:4347e9c1) The calculation of the usage record [{"id":30567,"usageId":38,"usageType":2,"startDate":"2024-01-24T04:00:00+1100","endDate":"2024-01-24T04:59:59+1100"}] for account [{"accountName":"admin","domainId":1,"id":2,"uuid":"b88e1a97-771e-11ee-8e59-5254003754dc"}] according to the aggregated quota tariffs value [0.00000] and its usage unit [Compute*Month] resulted in the value [0E-9].
...
2024-01-24 02:46:07,032 DEBUG [o.a.c.q.QuotaManagerImpl] (qtp391630194-19:ctx-f8be50a5 ctx-65a023c1) (logid:4347e9c1) Aggregating the account [{"accountName":"admin","domainId":1,"id":2,"uuid":"b88e1a97-771e-11ee-8e59-5254003754dc"}] credit entries before [2024-01-24T04:00:00+1100].
2024-01-24 02:46:07,032 DEBUG [o.a.c.q.QuotaManagerImpl] (qtp391630194-19:ctx-f8be50a5 ctx-65a023c1) (logid:4347e9c1) The aggregation of the account [{"accountName":"admin","domainId":1,"id":2,"uuid":"b88e1a97-771e-11ee-8e59-5254003754dc"}] credit entries before [2024-01-24T04:00:00+1100] resulted in the value [0].
...
2024-01-24 02:46:11,105 INFO [o.a.c.q.QuotaManagerImpl] (qtp391630194-19:ctx-f8be50a5 ctx-65a023c1) (logid:4347e9c1) Processing quota balance for account [{"accountName":"fabricio","domainId":1,"id":4,"uuid":"03e923c1-7d2a-4f31-8cb5-8e2f3a8c3121"}] between [2024-01-24T04:00:00+1100] and [2024-01-24T04:59:59+1100].
2024-01-24 02:46:11,150 DEBUG [o.a.c.q.QuotaManagerImpl] (qtp391630194-19:ctx-f8be50a5 ctx-65a023c1) (logid:4347e9c1) Account [{"accountName":"fabricio","domainId":1,"id":4,"uuid":"03e923c1-7d2a-4f31-8cb5-8e2f3a8c3121"}] has [0] credit entries before [2024-01-24T04:00:00+1100].
...
2024-01-24 03:13:55,334 DEBUG [c.c.u.UsageServiceImpl] (qtp391630194-18:ctx-fc75d7c5 ctx-8a335807) (logid:ba5815a0) Getting usage records for account ID [2], domain ID [null] between [2024-01-22T19:00:00+1100] and [2024-01-23T07:59:00+1100] using page size [500] and start index [0].
...
2024-01-24 09:01:06,700 INFO [cloud.usage.UsageManagerImpl] (Usage-Job-1:null) (logid:) Parsing usage records between [2024-01-24T11:00:00+1100] and [2024-01-24T11:59:59+1100].
...
2024-01-24 10:00:42,822 DEBUG [usage.parser.VMInstanceUsageParser] (Usage-Job-1:null) (logid:) Creating VM usage record for vm [vm-a], type [2], usage [1], startDate [2024-01-24T12:00:00+1100], and endDate [2024-01-24T12:59:59+1100], for account [2].
...
2024-01-24 10:00:43,516 DEBUG [usage.parser.VolumeUsageParser] (Usage-Job-1:null) (logid:) Creating Volume usage record for vol [14], usage [1], startDate [2024-01-24T12:00:00+1100], and endDate [2024-01-24T12:59:59+1100], for account [2].
...
2024-01-25 00:27:10,788 INFO [cloud.usage.UsageManagerImpl] (main:null) (logid:) Usage is configured to execute in time zone [GMT+08:00], at [01:32], each [60] minutes; the current time in that timezone is [2024-01-25T00:27:10+0800] and the next job is scheduled to execute at [2024-01-25T01:32:00+0800]. During its execution, Usage will aggregate stats according to the boundaries of a day in time zone [GMT+11:00].
...
2024-02-01 02:14:32,361 DEBUG [o.a.c.a.r.QuotaResponseBuilderImpl] (qtp391630194-21:ctx-48906f8f ctx-2b050ee1) (logid:5595bd3e) Listing quota tariffs for parameters [{"listAll":false,"name":null,"usageType":null,"page":null,"pageSize":null}] between [2020-01-01T00:00:00+1100] and [2028-01-01T23:59:59+1100].
(admin) π± > quota tariffcreate enddate='2024-01-20T02:00:00+1100' startdate='2024-01-21T03:00:00+1100' name="testdates" usagetype=1 value=1
π Error: (HTTP 431, error code 4350) The quota tariff's start date [2024-01-21T00:00:00+1100] cannot be less than now [2024-01-25T04:39:48+1100].
(admin) π± > quota tariffcreate enddate='2024-01-25T02:00:00+1100' startdate='2024-01-26T03:00:00+1100' name="testdates" usagetype=1 value=1
π Error: (HTTP 431, error code 4350) The quota tariff's end date [2024-01-24T23:00:00+1100] cannot be less than the start date [2024-01-26T00:00:00+1100].
(admin) π± > quota credits value=10 account=admin domainid=72b54c4a-771d-11ee-8e59-5254003754dc
π Error: (HTTP 431, error code 4350) Incorrect deposit date [2024-01-25T01:34:00+1100], as there are balance entries after this date.
Codecov Report
Attention: Patch coverage is 31.49606%
with 87 lines
in your changes are missing coverage. Please review.
Project coverage is 30.96%. Comparing base (
49cecae
) to head (77cde46
). Report is 59 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #8230 +/- ##
============================================
+ Coverage 30.37% 30.96% +0.58%
- Complexity 32633 33474 +841
============================================
Files 5352 5355 +3
Lines 374419 375779 +1360
Branches 54609 54889 +280
============================================
+ Hits 113719 116349 +2630
+ Misses 245523 244012 -1511
- Partials 15177 15418 +241
Flag | Coverage Ξ | |
---|---|---|
simulator-marvin-tests | 24.80% <7.08%> (+0.66%) |
:arrow_up: |
uitests | 4.36% <ΓΈ> (-0.02%) |
:arrow_down: |
unit-tests | 16.58% <28.34%> (+0.16%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Hey @winterhazel , thx for the PR.
My view is the following:
- some operators will want to be able to have different time zones, due to whatever reasons (consider global cloud operator, having a cloud in various time zones) - there is probably a reason someone implemented 2 different time zones settings for the Usage originally.
- you are mentioning also GMT times in logs and different DB time, but I don't see this is touched (I'm not saying it should be touched, but I see you mentioned this)
I don't think we can/want to accept changes in time zones which are not configurable. At the minimum, we want to keep the current behaviour and make all those time zones changes optional/configurable - if this is something you can do, that would be a nice change, but hard-changing how things operate currently - is something I don't think cloudstack users will want.
In case I have missed/misunderstood something, please let me know.
- some operators will want to be able to have different time zones, due to whatever reasons (consider global cloud operator, having a cloud in various time zones) - there is probably a reason someone implemented 2 different time zone settings for the Usage originally.
@andrijapanicsb, the existing configurations (usage.execution.timezone
and usage.aggregation.timezone
) are global; therefore operators are not enabled to define it by zone. The use case use you mentioned would already face problems with the current configurations. Besides that, even if you have the Usage service distributed along several zones, due to how Usage was designed, only one instance will process all the pending data; therefore, operators do not have the flexibility to process in different time zones with the current workflow. Perhaps we could work on an extension to make it configurable by zone; however, it is out of the scope of this PR.
I did not find any documentation on why there are 2 time zone configurations for the same workflow: usage.execution.timezone
to execute the job in a time zone and usage.aggregation.timezone
to perform the validations in another. However, for all the environments I have seen, configuration usage.execution.timezone
is either null, which defaults to GMT (in this case usage.aggregation.timezone
is also configured with GMT, which is its default value), or have the same value of usage.aggregation.timezone
. In both cases, to avoid misunderstandings, operators have the same value for both configurations.
you are mentioning also GMT times in logs and different DB time, but I don't see this is touched (I'm not saying it should be touched, but I see you mentioned this)
In Java, a Date
object does not store a time zone; when printing the object, it will be printed according to the time zone of the machine that is running the code. Take for instance the date in the logs printed without the changes of this PR, in a server that is configured with GMT
the configurations have the value GMT-3
:
2023-09-27 20:50:28,710 INFO [cloud.usage.UsageManagerImpl] (Usage-Job-1:null) (logid:) Parsing usage records between Tue Aug 01 20:27:40 GMT 2023 and Wed Sep 27 20:50:28 GMT 2023
This confuses operators while they are examining the logs since the dates are being printed based on the machine time zone, not based on the ACS configurations.
With the changes of this PR (for instance, https://github.com/apache/cloudstack/pull/8230/files?diff=unified&w=0#diff-fe9d8f8d7c3d8b29cb0195ad950990d706e1fad0397deaf383ae1c3ae9c1d250L490-R495), dates on the logs of the Usage and Quota plugins will be printed based on the configuration time zone. The same log would look like this:
2023-09-27 20:50:28,710 INFO [cloud.usage.UsageManagerImpl] (Usage-Job-1:null) (logid:) Parsing usage records between [2023-08-01T17:27:40-0300] and [2023-09-27T17:50:28-0300].
This PR does not touch any database configuration. When a Date
object is used in a query, CloudStack converts the time zone of the date to the hard-coded value:
https://github.com/apache/cloudstack/blob/c7100c3d751923547d16e04949c186ef8e46f792/framework/db/src/main/java/com/cloud/utils/db/GenericDaoBase.java#L123
https://github.com/apache/cloudstack/blob/c7100c3d751923547d16e04949c186ef8e46f792/framework/db/src/main/java/com/cloud/utils/db/GenericDaoBase.java#L1624-L1636
This way, along with the two current configurations (usage.execution.timezone
and usage.aggregation.timezone
) and the machine configuration, operators also have to bear in mind that ACS stores data always in GMT, totalizing 4 different time zones to handle. What I meant with "the number of time zone configurations the operator has to handle is reduced to 2" is that, by unifying the configurations and printing the date in the logs based on the unified configuration, the operators will concern only the database hard-coded timezone and what they have configured in the ACS configuration, totalizing only 2 different time zones to handle.
I think what @andrijapanicsb means, @winterhazel , Is that you might have a MS in one timezone and the DC it manages in another (NOTE: this is a real world scanario, crazy as it seems) In this case you want the execution timezon to be the one and the aggregation timezone to be the other.
I think what @andrijapanicsb means, @winterhazel , Is that you might have a MS in one timezone and the DC it manages in another (NOTE: this is a real world scanario, crazy as it seems) In this case you want the execution timezon to be the one and the aggregation timezone to be the other.
@DaanHoogland I see, this does seem to be a valid situation in which having these two separate configurations could prove useful, thanks for pointing it out. In this case, I will keep the changes regarding the logs; however, considering the proper configuration for each situation: usage.aggregation.timezone
for the logs printed during the [Usage and Quota] data processing, and usage.execution.timezone
for the logs printed during the job scheduling.
This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.
@DaanHoogland I have modified this PR, could you take another look at it? Thanks.
minor but this means that in a search in the log the '[]' can no longer be included and false positive might pop up; think account
dahn
and accountdahn-admin
ordahn-south
apart from this one, the messages itself seem ok, @winterhazel . of course i am bound to have missed some. can you have a look @andrijapanicsb ?
This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.
@blueorangutan package
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: βοΈ el7 βοΈ el8 βοΈ el9 βοΈ debian βοΈ suse15. SL-JID 8638
@blueorangutan test
@DaanHoogland a [SL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
[SF] Trillian test result (tid-9191) Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7 Total time taken: 46279 seconds Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8230-t9191-kvm-centos7.zip Smoke tests completed. 129 look OK, 0 have errors, 0 did not run Only failed and skipped tests results shown below:
Test | Result | Time (s) | Test File |
---|
@blueorangutan package
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: βοΈ el7 βοΈ el8 βοΈ el9 βοΈ debian βοΈ suse15. SL-JID 8805
@blueorangutan test alma9 kvm-alma9
@DaanHoogland a [SL] Trillian-Jenkins test job (alma9 mgmt + kvm-alma9) has been kicked to run smoke tests
[SF] Trillian test result (tid-9337) Environment: kvm-alma9 (x2), Advanced Networking with Mgmt server a9 Total time taken: 53557 seconds Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8230-t9337-kvm-alma9.zip Smoke tests completed. 129 look OK, 0 have errors, 0 did not run Only failed and skipped tests results shown below:
Test | Result | Time (s) | Test File |
---|
@GutoVeronezi is this ready as is according to you. cc @winterhazel
@GutoVeronezi is this ready as is according to you. cc @winterhazel
@DaanHoogland, there is nothing else from my side. For me it is good to go.