urlwatch
Cache database doesn't always load the latest job state
Currently, the cache database finds the latest state of a job by `timestamp: int` and `tries: int`. This can be unreliable. If a job experienced errors and was then quickly retried with success, the cache could record something like:
id | guid | timestamp | data | tries |
---|---|---|---|---|
1 | ... | 1544755012 | ... | 1 |
2 | ... | 1544755012 | ... | 2 |
3 | ... | 1544755012 | ... | 0 |
Ordering by timestamp then by tries would give the second row as the latest state, whereas in reality the third row is the latest.
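To make the problem concrete, here is a minimal sketch that reproduces the three rows above in an in-memory SQLite database. The table name `cache_entry`, the `data` values, and the exact query are assumptions for illustration only; this is not the query that urlwatch or minidb actually issues.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the columns shown in the issue.
conn.execute(
    "CREATE TABLE cache_entry ("
    "id INTEGER PRIMARY KEY, guid TEXT, timestamp INTEGER, data TEXT, tries INTEGER)"
)

# Two failed attempts followed by a success, all within the same second.
rows = [
    ("job", 1544755012, "error 1", 1),
    ("job", 1544755012, "error 2", 2),
    ("job", 1544755012, "success", 0),
]
conn.executemany(
    "INSERT INTO cache_entry (guid, timestamp, data, tries) VALUES (?, ?, ?, ?)",
    rows,
)

# Ordering by timestamp, then by tries, picks the second row, not the third.
latest = conn.execute(
    "SELECT id, data, tries FROM cache_entry "
    "WHERE guid = ? ORDER BY timestamp DESC, tries DESC LIMIT 1",
    ("job",),
).fetchone()
print(latest)  # (2, 'error 2', 2)
```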
Two possible solutions come to mind:
- Save `timestamp` as `float` rather than `int`. This should give enough precision in most cases, but I'm not sure if it's enough on all systems.
- Use `id` to order records. `id` seems to be strictly increasing, but this behavior is not documented in `minidb`, so I'm also a bit hesitant to use it.
In any case, `tries` should not be used for ordering, since it can reset to zero at any point.
This issue is usually not a problem in daily use of urlwatch, but it is unavoidable in tests and must be resolved before the tests can be improved.
It seems that the behavior of `id` is not defined in `minidb`, but left to SQLite. Because the `AUTOINCREMENT` keyword is not used, there's no guarantee that `id` is always increasing, so option 2 isn't a viable choice.
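As a rough illustration of why `id` can't be relied on, the sketch below shows SQLite handing out a previously used id when the column is declared `INTEGER PRIMARY KEY` without `AUTOINCREMENT`. The `entry` table is a hypothetical stand-in, not the schema minidb generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; note the absence of AUTOINCREMENT.
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, data TEXT)")

for data in ("a", "b", "c"):
    conn.execute("INSERT INTO entry (data) VALUES (?)", (data,))

# Remove the row with the highest id, then insert a new row.
conn.execute("DELETE FROM entry WHERE id = 3")
reused_id = conn.execute("INSERT INTO entry (data) VALUES (?)", ("d",)).lastrowid

# The new row gets id 3 again, so ids are not unique over the table's lifetime.
print(reused_id)  # 3
```

So once rows are deleted, a newer record can carry an id that an older record used before.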
It also turns out that using a `float` timestamp doesn't give enough precision in testing, even on my own computer. I guess I'll have to manually add some wait time in tests to guarantee success.
I don't expect it to be a real issue in normal usage though.
Edit: a better solution probably involves a redesign of the cache database. Timestamp can still be int, but another table would store the id of the latest snapshot of a particular job.
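One way the redesign could look, as a rough sketch using plain `sqlite3` rather than minidb: a `snapshot` table keeps every recorded state, and a separate `latest` table maps each job's guid to the id of its most recent snapshot. All table, column, and function names here are hypothetical.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one table of snapshots, one pointer table per job.
    conn.executescript(
        """
        CREATE TABLE snapshot (
            id INTEGER PRIMARY KEY,
            guid TEXT NOT NULL,
            timestamp INTEGER NOT NULL,
            data TEXT,
            tries INTEGER NOT NULL
        );
        CREATE TABLE latest (
            guid TEXT PRIMARY KEY,
            snapshot_id INTEGER NOT NULL REFERENCES snapshot(id)
        );
        """
    )


def save(conn, guid, timestamp, data, tries):
    # Insert the snapshot and move the pointer in a single transaction.
    with conn:
        cur = conn.execute(
            "INSERT INTO snapshot (guid, timestamp, data, tries) VALUES (?, ?, ?, ?)",
            (guid, timestamp, data, tries),
        )
        conn.execute(
            "INSERT OR REPLACE INTO latest (guid, snapshot_id) VALUES (?, ?)",
            (guid, cur.lastrowid),
        )


def load_latest(conn, guid):
    # Follow the pointer instead of sorting by timestamp or tries.
    return conn.execute(
        "SELECT s.data, s.timestamp, s.tries FROM latest AS l "
        "JOIN snapshot AS s ON s.id = l.snapshot_id WHERE l.guid = ?",
        (guid,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
create_schema(conn)
save(conn, "job", 1544755012, "error 1", 1)
save(conn, "job", 1544755012, "error 2", 2)
save(conn, "job", 1544755012, "success", 0)
print(load_latest(conn, "job"))  # ('success', 1544755012, 0)
```

With this layout the latest state depends only on write order, so identical integer timestamps and a reset `tries` counter no longer matter.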