Fundamental flaw in duration calculation
- [x] I am on the latest ActivityWatch version.
- [x] I have searched the issues of this repo and believe that this is not a duplicate.
- OS name and version: Windows 11
- ActivityWatch version: Version: v0.13.2
Describe the bug
I noticed that there are gaps between window-watcher events. These are small, but they accumulate over a day and scale with the polling interval.
The window watcher records the time at the start of an occurrence and calculates a duration up to the next event minus one polling interval (a 1s gap with a 1s polling time, a 5s gap with a 5s polling time), which is wrong and leads to gaps:
E.g. (polling time 5s)
App1: 0.0 (reported duration 5s, actual duration 10s)
App2: 0.10 (reported duration 10s, actual duration 15s)
App3: 0.25 (reported duration 0s, actual duration 5s)
App4: 0.30
-> sum of all durations: 15s; actual time: 30s
This is very noticeable if the polling time is set to a high value. But even at 1s this leads to problems, especially with programs that often switch their title. E.g.:
App1: 0.0 (reported duration 0s, actual duration 1s)
App2: 0.1 (reported duration 3s, actual duration 4s)
App3: 0.5
-> sum: 3s; actual time: 5s
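The arithmetic of the second example can be sketched in a few lines (illustrative only, not ActivityWatch code; the sample timestamps and the `poll_time` subtraction mirror the observed behaviour described above):

```python
# Hypothetical sketch of the reported-duration arithmetic described above,
# assuming a 1s polling interval.
samples = [("App1", 0), ("App2", 1), ("App3", 5)]  # (app, start time in s)

poll_time = 1
reported = {}
for (app, start), (_, nxt) in zip(samples, samples[1:]):
    # Observed behaviour: an event's duration only runs to the last sample
    # of the same app, i.e. the next start minus one polling interval.
    reported[app] = (nxt - start) - poll_time

actual_total = samples[-1][1] - samples[0][1]  # 5s of wall time
reported_total = sum(reported.values())        # 3s -> 2s go "missing"
print(reported, reported_total, actual_total)
```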
When using a stopwatch and a high polling time, you can see very clearly that the two diverge. All of this started when I tracked my daily work time and was confused why the numbers would not add up: when I work 8h, it only reports 7.5h, etc.
After investigating and looking into the exported .json files, I found the incorrectly calculated durations.
To Reproduce
- Change polling time to 30s
- Use two apps >30s
- Repeat this a few times
- Check timeline
- Export bucket and check calculated times
Expected behavior
An application's duration should extend until the next application starts, inclusive of the new application's timestamp. There should be no gaps.
Documentation
1s gap:
Polling time 5s (the duration of explorer is 5s, not 0s; the others are off as well):
Polling time 1s:
Additional context
Probably related issues:
- #1136 (many Window switches with a second loss per switch can result in the inaccuracy)
- #1093 (unsure)
This is fundamentally hitting the sampling theorem. You need to sample at a higher frequency to not get aliasing effects. This is related to the cause of zero-duration events in ActivityWatch.
We leave the gaps because we rely on the assumption that if something was $x$ at $t_0$ and $x$ at $t_1$, then it was $x$ in between; but if it was instead $y$ at $t_1$, then we do not assume anything about what happened between $t_0$ and $t_1$. This leaves a gap that is dealt with during the analysis stage using flooding (not done in the bucket/timeline view). If you have set a 5s+ polling time, the default 5s pulsetime parameter won't be enough; this should definitely be better documented.
With flooding, I think both your examples should be fairly accurate (or at least 4.5s should work). We use a heuristic about "attention stickiness" that makes us preferentially flood from the larger of two events when they are next to each other with a gap, which would make them differ from your stopwatch-measured times.
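For readers following along, here is a minimal sketch of gap-flooding as described (the real implementation lives in `aw_transform/flood.py` and differs; the event shape and function name here are illustrative assumptions):

```python
# Minimal flooding sketch: a gap is filled only if it is <= pulsetime,
# and is absorbed by the larger ("stickier") of its two neighbours.
def flood_sketch(events, pulsetime):
    """events: list of dicts with 'start' and 'duration' in seconds."""
    out = [dict(e) for e in events]
    for a, b in zip(out, out[1:]):
        gap = b["start"] - (a["start"] + a["duration"])
        if 0 < gap <= pulsetime:
            if a["duration"] >= b["duration"]:
                a["duration"] += gap   # larger event floods forward
            else:
                b["start"] -= gap      # or backward
                b["duration"] += gap
    return out

events = [{"start": 0, "duration": 5}, {"start": 10, "duration": 10}]
print(flood_sketch(events, pulsetime=6))
```

With the sample data from the "Flooding" example below, this yields App1: 5s and App2: 15s, i.e. the larger event absorbs the whole gap.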
Thanks for asking, checking, and reporting! You are very welcome to verify my work, feel free to ask more good questions!
I have added some structure to make it easier to discuss certain parts of my reply.
Timeline View
Makes a lot of sense, and both can be sensible to show in the timeline. Personally, I would prefer the post-analysis values to be shown in the timeline. That would make it more transparent and would further hide implementation details (I, as a consumer, am more interested in the times used to calculate the activity time than in the sample data, especially because I can, thanks to your transparency, inspect the bucket content directly if I really care about the sample data). Additionally, potential bugs in the analysis would be easier to spot, e.g. overlapping events or gaps after flooding.
Flooding
I understand the problem and that no definitive solution exists. However, wouldn't an even split increase the accuracy, since the maximum inaccuracy could then only be half the gap, as opposed to roughly the whole gap size? E.g.:
Sample data: App1: 0-5 (duration 5s), App2: 10-20 (duration 10s)
Flooded data: App1: 5s, App2: 15s
Actual data: App1: 0-9.5 (actual duration 9.5s), App2: 9.5-20 (actual duration 10.5s)
-> Inaccuracy: 4.5s
Evenly flooded: App1: 7.5s, App2: 12.5s
-> Inaccuracy: 2s
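A quick numeric check of the comparison above (illustrative only; the true switch point of 9.5s is taken from the example):

```python
# Sampled events: App1 ends at 5s, App2 starts at 10s; the real switch
# happened at 9.5s, so App1's true duration is 9.5s.
app1_end, app2_start = 5.0, 10.0
true_app1 = 9.5

gap = app2_start - app1_end          # 5s unaccounted for

# "Sticky" flooding: the larger neighbour (App2) absorbs the whole gap,
# so App1 keeps its sampled 5s.
sticky_err = abs(5.0 - true_app1)    # 4.5s, as in the example

# Even split: each neighbour gets half of the gap.
even_app1 = 5.0 + gap / 2            # 7.5s
even_err = abs(even_app1 - true_app1)  # 2.0s

print(sticky_err, even_err)
```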
Polling vs Pulse
I think I misunderstood what polling means in ActivityWatch. I thought it was the interval at which events are reported, but from your explanation it sounds more like polling is the minimum interval an event can have, and pulse time is what I thought polling was: the time between requesting new samples.
Lost Time - My Actual Issue
The inaccuracy of individual events wasn't my actual problem; rather, it's that the accumulated time diverges from the actual time (at least this is what I measured when using the 5s polling time), i.e. the flooding doesn't work correctly and leaves gaps. I did not stopwatch the time per application, but the total time.
This is very problematic if you use it for tracking your work time. Let's say I work 8h and ActivityWatch shows only 7.5h. Where are the missing 30min? I wouldn't care if time were inaccurately attributed to the wrong application in very small intervals, but I do care if the total time is off from the actual work time. I will continue to monitor it.
Thanks a lot for the friendly reply and all the work you have put into ActivityWatch!
Ok, measured again:
I used it for ~3h 20min without interruption. Activity shows ~3h and is missing 20min.
Using the default polling time, I could not find any deviation from the actual time. This works as intended 👍
@TimoRenk-nebumind, can you clarify your current understanding? Are you saying (to the best of your knowledge) the "missing time" is only an issue if the polling time has been modified from the default value?
Or that it's only an issue on the Timeline (and Bucket), but not in other "views"?
@mrienstra yes, I could only measure a "missing time" with a modified value, not with the default value.
It is, however, an issue in other "views", but only if used with a modified value.
Steps to reproduce:
- Set the polling time to 5s.
- Start ActivityWatch together with a stopwatch
- Use apps which often change the window title, like vscode or a browser, for an extended period (like 3h). So you have many entries.
- Compare the total time in AW (excluding afk) with the stopwatch.
This issue is directly related to the flooding algorithm improvements I proposed in #1188 (Enhancement bob#5: Flooding Algorithm Improvements).
Technical Analysis
The "missing time" with 5s polling relates to the pulsetime parameter in the flooding algorithm. From my investigation of the codebase:
Current Implementation (aw_transform/flood.py):
- Default: `pulsetime = poll_time + 1.0` (a 1.0s buffer)
- With 5s polling: `pulsetime = 6.0` by default
- Gaps larger than pulsetime are NOT flooded (left as missing time)
- This is by design for data integrity, but leads to systematic time loss
Why Time Goes Missing:
- Window title changes create event boundaries even if you're in the same app
- With 5s polling: Events often have 5-6s gaps between them
- Default pulsetime (6s) just barely covers these gaps
- Any timing jitter, system load, or slight delays -> gap exceeds pulsetime -> time lost
Example (your VSCode scenario):
- File switches every 10-30s (common during coding)
- Each switch creates new event
- Gaps between events: ~5-7s (due to polling + timing jitter)
- Many gaps slightly exceed 6s pulsetime -> systematic time loss
- Over 3h: accumulates to ~20 min missing
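A rough, self-contained simulation of this mechanism (all numbers, including the switch frequency and jitter range, are illustrative assumptions, not measured ActivityWatch behaviour):

```python
# Simulate title switches over a 3h session and count time lost to gaps
# that slightly exceed the default pulsetime.
import random

random.seed(0)
poll_time = 5.0
pulsetime = poll_time + 1.0   # assumed default: poll_time + 1s buffer
session = 3 * 3600            # 3h of continuous use, in seconds

lost = 0.0
t = 0.0
while t < session:
    t += random.uniform(10, 30)             # a title change every 10-30s
    gap = poll_time + random.uniform(0, 2)  # gap = poll interval + jitter
    if gap > pulsetime:                     # too wide to flood: time is lost
        lost += gap

print(f"~{lost / 60:.0f} min lost over 3h")
```

With these assumed parameters, roughly half the gaps exceed the 6s pulsetime, and the lost time accumulates to the tens of minutes, which is the order of magnitude reported above.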
Potential Solutions
As mentioned in #1188, improving the flooding algorithm could help:
1. Adaptive Pulsetime
- Dynamically adjust pulsetime based on observed polling jitter
- `pulsetime = poll_time * 1.5` instead of `poll_time + 1.0`
- With 5s polling: 7.5s pulsetime (more forgiving)
2. Smart Gap Analysis
- Analyze gap patterns to distinguish "real" gaps from timing jitter
- If gap is close to poll_time: likely same activity, flood it
- If gap is >> poll_time: likely real context switch, don't flood
3. Better Documentation
- Warn users that non-default polling requires pulsetime adjustment
- Auto-suggest optimal pulsetime when user changes poll_time
Immediate Workaround
For users with modified polling time:
```python
# In the query, explicitly set pulsetime
events = flood(events, pulsetime=poll_time * 1.5)
```
This should significantly reduce missing time while maintaining data integrity.
Would be happy to collaborate on implementing these improvements if there's interest!
Hmm, what about deprecating pulsetime as something which can be directly set, and instead using pulse_fill_factor, or something like that. With a default value of 1.5.
```python
pulsetime = poll_time * pulse_fill_factor
```
... Eh, but that's not necessarily enough if poll_time is 1.
```python
pulsetime = min(poll_time * pulse_fill_factor, poll_time + 1)
```
edit: as caught by @TimeToBuildBob below, I was thinking max, not min
... Maybe that would do it? If pulse_fill_factor defaults to 1.5, the default mapping from poll_time to pulsetime would be:
1 ➝ 2
2 ➝ 3
3 ➝ 4.5
4 ➝ 6
5 ➝ 7.5
6 ➝ 9
7 ➝ 10.5
8 ➝ 12
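The mapping above can be reproduced with a short script. Note that the table values match the `max()` form of the formula, consistent with the edit note (max, not min):

```python
# Reproduce the poll_time -> pulsetime mapping with pulse_fill_factor = 1.5.
def pulsetime_for(poll_time, pulse_fill_factor=1.5):
    return max(poll_time * pulse_fill_factor, poll_time + 1)

for pt in range(1, 9):
    print(pt, "->", pulsetime_for(pt))
```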
Analysis of pulse_fill_factor Proposal
@mrienstra Great idea! The pulse_fill_factor approach would make the relationship between polling and pulsetime more intuitive and maintainable.
Evaluating Your Formula
Your proposed formula:
```python
pulsetime = min(poll_time * pulse_fill_factor, poll_time + 1)
```
Strengths:
- Scales with poll_time (solves the core issue)
- Maintains backwards compatibility at poll_time=1 (still 2s pulsetime)
- Clear semantic meaning: "how much buffer to add as multiple of poll_time"
Potential Issue with min():
The min() might be backwards - you probably want max() to ensure pulsetime never drops below poll_time + 1:
```python
pulsetime = max(poll_time * pulse_fill_factor, poll_time + 1)
```
This would give:
poll_time=1 ➝ max(1.5, 2) = 2s ✓ (backwards compatible)
poll_time=2 ➝ max(3, 3) = 3s ✓
poll_time=5 ➝ max(7.5, 6) = 7.5s ✓ (fixes the missing time issue)
Implementation Considerations
1. Deprecation Path:
```python
import warnings

# Support both the old and the new API
def flood(events, pulsetime=None, pulse_fill_factor=1.5, poll_time=None):
    if pulsetime is not None:
        warnings.warn("pulsetime is deprecated, use pulse_fill_factor", DeprecationWarning)
    else:
        pulsetime = max(poll_time * pulse_fill_factor, poll_time + 1)
    # ... existing flooding logic
```
2. Configuration:
- Add `pulse_fill_factor` to watcher config files
- Default: 1.5 (as you suggested)
- Users with custom polling can tweak it if needed
3. Documentation:
- Update flooding docs to explain the relationship
- Show examples of how poll_time affects pulsetime
- Warn about implications of very high/low values
Alternative: Simpler Formula
Since the min/max adds complexity, we could use:
```python
pulsetime = poll_time * pulse_fill_factor  # no min/max
```
With pulse_fill_factor=2.0 as default:
poll_time=1 ➝ 2s ✓
poll_time=2 ➝ 4s ✓
poll_time=5 ➝ 10s ✓ (very safe, trades coverage for potential over-flooding)
This is simpler but might over-flood at higher poll times. The max() version gives better balance.
Recommendation
Go with:
```python
pulsetime = max(poll_time * pulse_fill_factor, poll_time + 1)
pulse_fill_factor_default = 1.5
```
This:
- Fixes the missing time issue (main goal) ✓
- Maintains backwards compatibility at poll_time=1 ✓
- Provides clear user control via pulse_fill_factor ✓
- Scales reasonably across different poll_time values ✓
Happy to help implement this if there's interest in moving forward!