Bug: Jobs do not mount the data dir, causing logs to be lost
Bug description
In (at least) Tutor Local and Dev, the Docker Compose file for jobs does not mount the data dir as it does for the services. This means that when do commands are run, their tracking logs don't persist, and when commands like Aspects' backfill are run, they cannot find the existing tracking logs.
Additionally, regular logging will not be persisted, potentially losing important error data for jobs run on a schedule.
I don't know the reasoning behind the decision, but if it's not something we can change that would be good to know as well so we can start trying to work around it.
See: https://github.com/overhangio/tutor/blob/release/tutor/templates/local/docker-compose.jobs.yml#L28 https://github.com/overhangio/tutor/blob/release/tutor/templates/local/docker-compose.yml#L15
FYI, this issue was originally reported in https://github.com/openedx/tutor-contrib-aspects/issues/1027.
Is there a reason to do this in the main docker-compose file? There are ways to do it manually or through a Tutor plugin. We will close this as not planned unless there is a good reason to do this here.
You can use the command tutor mounts add "lms-job:../../data/lms:/openedx/data" to bind mount the data dir to your lms job container.
Here is the relevant code that would handle the rest: https://github.com/overhangio/tutor/blob/release/tutor/templates/local/docker-compose.jobs.yml#L29-L31 Basically, it iterates over all the mounts for this particular service and appends them to the list of volumes.
If any plugin relies on this, it can create a docker-compose.jobs.override.yml file so this does not have to be done manually.
https://github.com/overhangio/tutor/blob/release/docs/local.rst?plain=1#L214-L225
It seems odd to throw away logs needed for analytics by default. Why should people need to install a plugin to get basic platform data?
I can't recall why we made the decision not to persist job logs, or whether we ever made that decision at all; it may just have been an oversight. I don't see any obvious reason that would prevent us from persisting logs. A word of caution, though: it's important that we do not bind-mount the same host directories in the service and job containers, for the following two reasons:
- Multiple containers writing to the same destination is going to cause issues, so we try to avoid that as much as we can.
- In Kubernetes, many providers do not support ReadWriteMany volumes, and we want to preserve some sort of consistency between Compose and Kubernetes.
I.e., I think it shouldn't be an issue to persist job logs, but then we need to make sure that those logs are stored in different files and directories.
@bmtcril Does storing logs in a different directory fix your issue? I am not sure whether Aspects' backfill command can use that. We could use this for job runners:
lms-job:
  volumes:
    - ../../data/lms-job:/openedx/data
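Until something like this lands in the template, the same mount could probably be applied locally with the docker-compose.jobs.override.yml file mentioned earlier. This is only a rough sketch: it assumes the override file lives next to the rendered docker-compose.jobs.yml (under env/local in the Tutor project root) and that the relative path resolves the same way as in the main file. I have not tested it.

```yaml
# Assumed location: $(tutor config printroot)/env/local/docker-compose.jobs.override.yml
# Sketch only: mounts a job-specific host directory so that job logs persist
# without sharing the directory used by the lms service container.
services:
  lms-job:
    volumes:
      # Separate host dir (data/lms-job, not data/lms), per the concern above
      - ../../data/lms-job:/openedx/data
```

Note that a command reading existing tracking logs would then look in data/lms-job rather than data/lms, so it would only see logs written by jobs.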
@mlabeeb03 that would be enough to make things work for us, thank you!
Oops. That was supposed to be a comment, not close the issue. >_>