Don't Depend on GNU Date
https://github.com/canonical/cloud-init/pull/4347 added support for cloud-init analyze dump on FreeBSD, which might not have GNU Date installed. This raises the point that shelling out once per line isn't efficient, and this is something that we should be able to do in Python code. This will improve portability as well.
The codepath that uses date isn't actually taken on Ubuntu. If possible it might be best to transitionally log a warning anytime parse_timestamp_from_date() is called to gather the date format so that we can implement this in Python.
For this issue, I agree logging a warning to the console on encountering an unsupported timestamp asking for a bug to be filed would be helpful so we can add the python logic to cope with other formats on different distributions better.
I agree it's not shell out and invoke date command on systems but I also do not think we should look to prioritize exploring more efficiency improvements beyond suggesting that the user file an issue.
I think investing more time in generic efficiency improvements with the analyze tool show have low-priority for the following reasons:
- it's a tool that is not part critical boot chain for a system: It is primarily used for either human-driven triage or capturing logs in integration tests so it may not be worth prioritization of this efficiency for a debugging/triage tool
- the size of the time cost is not "large" in human terms: cloud-init analyze total time on a system with 4 reboots is around 0.3 seconds when shelling out to
dateand 0.16 seconds when parsing the date in python. 0.3s feels like a reasonable amount of time for a person to wait when collecting logs for triage or CI runs. - the parse_timestamp function being called is not called on every line of a log, but on every timed event which is represented by the start/finish bookend logs. So in a cloud-init.log with 4 boots, the log file is 1996 lines yet parse_timestamp is called 256 times.
I also do not think we should look to prioritize exploring more efficiency improvements beyond suggesting that the user file an issue.
Agreed, compatibility and eliminating dependencies are the primary reasons here, which I did not express well in the initial report.