Cache database doesn't always load the latest job state
Currently, the cache database finds the latest state of a job by ordering on `timestamp: int` and `tries: int`. This can be unreliable. If a job hits errors and is then quickly retried successfully, the cache could record something like:
| id | guid | timestamp | data | tries |
|---|---|---|---|---|
| 1 | ... | 1544755012 | ... | 1 |
| 2 | ... | 1544755012 | ... | 2 |
| 3 | ... | 1544755012 | ... | 0 |
Ordering by timestamp then by tries would give the second row as the latest state, whereas in reality the third row is the latest.
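To make the failure concrete, here is a minimal reproduction sketch. It uses plain `sqlite3` rather than minidb, and the table name `cache_entry`, the guid `job-1`, and the `data` strings are made up for illustration.

```python
import sqlite3

# Hypothetical stand-in for the cache table; not the real minidb schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cache_entry ("
    "id INTEGER PRIMARY KEY, guid TEXT, timestamp INTEGER, data TEXT, tries INTEGER)"
)

# Two failed attempts and a successful retry, all within the same second.
rows = [
    ("job-1", 1544755012, "first error", 1),
    ("job-1", 1544755012, "second error", 2),
    ("job-1", 1544755012, "success", 0),  # tries resets to 0 on success
]
conn.executemany(
    "INSERT INTO cache_entry (guid, timestamp, data, tries) VALUES (?, ?, ?, ?)",
    rows,
)

# "Latest" as determined by timestamp, then tries.
latest = conn.execute(
    "SELECT id, data, tries FROM cache_entry WHERE guid = ? "
    "ORDER BY timestamp DESC, tries DESC LIMIT 1",
    ("job-1",),
).fetchone()

print(latest)  # (2, 'second error', 2), not the successful third row
```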
Two possible solutions come to mind:
- Save `timestamp` as `float` rather than `int`. This should give enough precision in most cases, but I'm not sure if it's enough on all systems.
- Use `id` to order records. `id` seems to be strictly increasing, but this behavior is not documented in `minidb`, so I'm also a bit hesitant to use it.
In any case, tries should not be used for ordering, since it can reset to zero at any point.
This issue is usually not a problem in daily use of urlwatch, but it is unavoidable in tests and must be resolved before tests can be improved.
It seems that the behavior of `id` is not defined in minidb, but left to SQLite. Because the `AUTOINCREMENT` keyword is not used, there's no guarantee that `id` is always increasing, so option 2 isn't a viable choice.
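A small sketch of the SQLite behavior in question, again with plain `sqlite3`. The `snapshot` table and the helper `next_id_after_delete` are hypothetical, and I'm assuming the id column is an `INTEGER PRIMARY KEY`. Without `AUTOINCREMENT`, SQLite can hand out an id that was previously used by a deleted row:

```python
import sqlite3


def next_id_after_delete(id_column_def):
    """Insert three rows, delete the newest, insert again; return the new id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE snapshot (id {id_column_def}, data TEXT)")
    for data in ("a", "b", "c"):
        conn.execute("INSERT INTO snapshot (data) VALUES (?)", (data,))
    conn.execute("DELETE FROM snapshot WHERE id = 3")
    cur = conn.execute("INSERT INTO snapshot (data) VALUES (?)", ("d",))
    return cur.lastrowid


plain_id = next_id_after_delete("INTEGER PRIMARY KEY")
autoinc_id = next_id_after_delete("INTEGER PRIMARY KEY AUTOINCREMENT")

print(plain_id)    # 3: the id of the deleted row is reused
print(autoinc_id)  # 4: AUTOINCREMENT never reuses an id
```

So as long as rows are only ever appended, `id` does increase in practice, but once old rows get deleted the ordering is no longer something to rely on.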
It also turns out that using float type timestamp doesn't give enough precision in testing even on my own computer. I guess I'll have to manually put in some wait time in tests to guarantee success.
I don't expect it to be a real issue in normal usage though.
Edit: a better solution probably involves a redesign of the cache database. Timestamp can still be int, but another table would store the id of the latest snapshot of a particular job.
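A rough sketch of what that redesign might look like, written against plain `sqlite3` instead of minidb. All names here (`snapshot`, `latest_snapshot`, `save`, `load_latest`) are hypothetical, and this is only one possible shape for the idea:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE snapshot (
        id INTEGER PRIMARY KEY,
        guid TEXT NOT NULL,
        timestamp INTEGER NOT NULL,
        data TEXT,
        tries INTEGER NOT NULL
    );
    -- One row per job, pointing at its most recently saved snapshot.
    CREATE TABLE latest_snapshot (
        guid TEXT PRIMARY KEY,
        snapshot_id INTEGER NOT NULL REFERENCES snapshot(id)
    );
    """
)


def save(guid, timestamp, data, tries):
    """Store a snapshot and repoint the job's 'latest' entry in one transaction."""
    with conn:
        cur = conn.execute(
            "INSERT INTO snapshot (guid, timestamp, data, tries) VALUES (?, ?, ?, ?)",
            (guid, timestamp, data, tries),
        )
        conn.execute(
            "INSERT OR REPLACE INTO latest_snapshot (guid, snapshot_id) VALUES (?, ?)",
            (guid, cur.lastrowid),
        )


def load_latest(guid):
    """Return (data, tries) of the latest snapshot, or None if the job is unknown."""
    return conn.execute(
        "SELECT s.data, s.tries FROM latest_snapshot AS l "
        "JOIN snapshot AS s ON s.id = l.snapshot_id WHERE l.guid = ?",
        (guid,),
    ).fetchone()


# Same scenario as above: three writes within the same second.
save("job-1", 1544755012, "first error", 1)
save("job-1", 1544755012, "second error", 2)
save("job-1", 1544755012, "success", 0)

print(load_latest("job-1"))  # ('success', 0)
```

The latest state is then determined by write order rather than by comparing `timestamp`, `tries`, or `id` values. Any code that deletes old snapshots would need to keep `latest_snapshot` consistent.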