Revert "darwin: process' running time should not depend on mach ticks"
This reverts commit 802494f65f321ba78e49685760850da41701b89a, which caused the issue #1638.
Sorry for the ping here but I think it would be really good to get this merged and released asap, possibly as a hotfix sidestepping normal release schedule: The latest htop is very and subtly broken (CPU % too small by factor 40) for a large proportion of users (all Apple silicon users who've installed/updated to the latest htop released 4 weeks ago).
@aestriplex, @corneliusroemer Can we have a better fix than reverting everything to the previous state of brokenness?
@fasterit I quote here my last comment to his #1638
The
mach_timebase_infoon Apple Silicon sets the structure correctly (125/3). Everything seems ok, both the cpu percentage and the running time (compared to top). Even ifisRunningUnderRosetta2should always be false, because the program is compiled for the right architecture each time, probably we can leave those controls there, just in case. I'll open a pr with the commit reverted
As I said there, in order to improve this bugfix a bit, the code that was removed without the rosetta compatibility check could be reintroduced (see my last commit in this PR). This would simplify the code and allow us to remove a function while preserving the old behavior
What was broken before? It seems the current state is worse than before and unintended. So it should be reverted no matter what?
How bad was issue https://github.com/htop-dev/htop/issues/1619?
CPU % being ~40x too small is definitely a bug issue. I can't quite tell how much off the process runtime was in the issue whose fix introduced the CPU usage bug.
You shouldn't block a revert by saying the fix brings back the original problem if the new problem is worse.
There are several comments from my side that point out issues that the revert would introduce. Like some division by zero issue and a request for clarification for some comment. AFAICS these have not been addressed yet.
The reason we would like to avoid plainly reverting the fix is that both behaviours are broken and instead of switching back to the old brokenness it would be better to look at the root cause and fix that instead. If reverting part or all of the previous fix is part of that solution there's nothing stopping that revert from being accepted. But as it stands ATM we are far from even knowing the root cause that needs fixing.
Development should consist of steps forward, instead of going back and forth …
@BenBE Sorry, I was afk for the easter holidays. I totally agree on that. The fact is that for certain architectures (e.g. Apple Silicon) the old implementation was closer to top command. I am not sure why, since the mach apis should return the time in nanoseconds, not in cpu cycles, so no conversion should be needed. Anyway, it may took time to figure out exactly where the issue is (i.e. why this workaround - that was made for rosetta compatibility - is needed)
@aestriplex No problem. I was mostly explaining to @corneliusroemer the reasoning behind the current procedure and why the revert is unlikely to be merged as-is. If figuring out the root cause takes time, so be it.
While I understand that this revert re-introduces a bug that was fixed by it, I wonder how many are affected by it vs the issue it's causing?
RIght now, the current htop release could be considered broken for apple silicon, so I guess fixing this makes sense, then re-tackling the original issue that may not be as high-impact as the current one.
The objection to the revert is not about re-introducing the bug it had previously, but rather that we'd like to avoid a back-and-forth and instead focus on actually resolving the issue.
Thus while reverting the specific commit may alleviate the pain short-term, it provides no way forward; which is what we'd like to see. You're still open to revert that specific commit in your local builds though.
proc_taskinfo is not in nano-seconds. See this:
https://github.com/apple-oss-distributions/shell_cmds/blob/14a2692995de90c2fde0c3c1e639a8c7d9e2312d/systime/systime.c#L313
On Intel based macOS:
info.numer: 1 info.denom 1
Platform_nanosecondsPerMachTick: 1.000000
On arm64 based macOS:
info.numer: 125 info.denom 3
Platform_nanosecondsPerMachTick: 41.666667
The reason the rosetta fix is needed is because rosetta translates mach_timebase_info to match that of Intel based macs, and at the same time the proc_taskinfo is not translated.
I have submitted a new pull request that should get rid of any precision issues that would show up because of the the 52 bit mantissa in the double conversions (should be at about 52 days of CPU time on Intel based macs, 6 years on arm64 based macs): https://github.com/htop-dev/htop/pull/1691
Closing this as superseeded by #1691 …