travis-logs
Automatically cut off database-readable logs by id & job_id age
The idea here is to introduce an automatic window of time for reading logs from the logs database, so that older records can be dropped. With this change, records with an `id` or `job_id` lower than the cutoff will automatically be assumed to be "archived", meaning they will be read from S3 by travis-api. In practice, archiving tends to happen within ~3h of job completion, so this change is mostly about defining a window of time within which we allow logs to be mutated, as is done via job restart.
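To make the rule concrete, here is a minimal sketch of the cutoff check, written in Python purely for illustration. The names `Cutoff`, `is_assumed_archived`, and `read_source` are hypothetical and are not taken from travis-logs or travis-api; the only thing drawn from the description above is the rule itself: an `id` or `job_id` strictly lower than the cutoff is treated as archived and read from S3.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cutoff:
    """Hypothetical container for the two cutoff values."""
    min_id: int      # log ids below this are assumed archived
    min_job_id: int  # job ids below this are assumed archived


def is_assumed_archived(cutoff: Cutoff,
                        log_id: Optional[int] = None,
                        job_id: Optional[int] = None) -> bool:
    """Return True if either identifier falls below its cutoff."""
    if log_id is not None and log_id < cutoff.min_id:
        return True
    if job_id is not None and job_id < cutoff.min_job_id:
        return True
    return False


def read_source(cutoff: Cutoff,
                log_id: Optional[int] = None,
                job_id: Optional[int] = None) -> str:
    """Pick where a log would be read from under this rule."""
    if is_assumed_archived(cutoff, log_id=log_id, job_id=job_id):
        return "s3"
    return "database"


# Example values are made up for illustration only.
cutoff = Cutoff(min_id=1_000, min_job_id=5_000)
print(read_source(cutoff, log_id=999))     # below the id cutoff
print(read_source(cutoff, job_id=5_000))   # exactly at the cutoff, not below it
```

Under this sketch, a record sitting exactly at the cutoff is still served from the database and remains mutable; only strictly lower values are assumed archived.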
- [ ] other humans are OK with this idea
- [ ] we have a plan for if/how to message this via web and cli
@meatballhat can you give a brief description of:
- the existing behaviour
- the proposed behaviour
- the reason for the change
@igorwwwwwwwwwwwwwwwwwwww I was backfilling while you commented. Sorry about the delay.
@renee-travisci I'm sorry for not being more explicit about this, but this change is not intended to alter reading logs data via web/cli, but rather only to change how long we'll allow it to be mutated.
@meatballhat @renee-travisci Ah.... this is interesting, and it makes more sense (thanks for clarifying @meatballhat!). From my experience looking at users' job stats, people generally don't mutate/restart a job more than a few days to a week after it ran. Generally, old jobs are hard to find because they get buried in the web UI.
However, conceivably, someone who hasn't been using Travis much will want to restart a very old job when they resume working on a project. I'm not sure how frequently that happens, if at all, but I assume that if we documented the behavior clearly in the docs, people would understand.
@acnagy I have a lot of sympathy for folks who need/want to restart a job that ran more than a few months in the past.
We could decide to start with a cutoff of something like 2 years, then maybe tighten it up over time? I suspect much of the potential pain could be avoided if we were to change our default mode of mutating build, job, and log records to instead create new records, but I think that's a more involved change.
Personally I only ever restart jobs when they error, and I gotta get CI green.
@acnagy I think in the example of picking up a project after a number of months, it's pretty unlikely I'd be interested in restarting the old stuff. I can't think of a single case like that. Instead I'd move forward and create new commits/builds?
@meatballhat @svenfuchs I think it's a matter of workflow... some people create new commits, and I feel like I've emailed with someone who just jumped in and restarted... Can't remember exactly though...
That said, I'm not sure it's worth supporting the restarting-very-old-builds workflow very much. The problem is, if they restart after a long time, the image/dependencies could have changed, and then they could get new errors... which means they'll end up needing to do more commits anyway. I know the #reproducibility-study people run into these issues... So, I think 6 months is probably a fairly smart cut-off; we just need to document it.
Bumping this in the interest of getting documentation sorted out and getting this PR merged.
Where in the docs do you think this belongs?