[WIP] Keeping Git branches in sync with job-server workspaces
In the Bristol-Cambridge-Oxford meeting on May 2, 2024, @venexia asked a question about keeping Git branches in sync with job-server workspaces. The question was prompted by an exchange with tech support.^1
We should recognize that it may not be desirable to keep Git branches in sync with job-server workspaces. Nevertheless, this issue captures the question and the exchange with tech support. Ultimately, the intention is to improve the documentation by making recommendations to researchers.
In the following workflow, the primary branch is GitHub's default branch. It is often called `main`. The terms primary branch, primary workspace, secondary branch, and secondary workspace have no meaning beyond this issue. `HEAD` is Git-speak for "the current branch's latest commit".
- Researcher creates repo from opensafely/research-template
- Researcher commits to primary branch
- Researcher creates primary workspace associated with primary branch
- Researcher runs jobs in primary workspace
- Primary workspace directories are created on L3 and L4 filesystems
- Files are written to primary workspace directories
- Researcher writes paper based on primary branch `HEAD` and files in primary workspace directories
- Researcher submits paper 🎉
At this point, the paper is based on primary branch `HEAD` and files in primary workspace directories.
The paper is reviewed; further analysis is requested, which necessitates modifications to the dataset definition. The researcher doesn't want to overwrite files in primary workspace directories, because modifications to the dataset definition could result in a different dataset. 🙁
- Researcher branches from primary branch, giving secondary branch
- Researcher commits to secondary branch
- Researcher creates secondary workspace associated with secondary branch
- Researcher runs jobs in secondary workspace
- Secondary workspace directories are created on L3 and L4 filesystems
- Files are written to secondary workspace directories
- Researcher updates paper based on secondary branch `HEAD` and files in secondary workspace directories
- Researcher merges secondary branch into primary branch
- Secondary branch is deleted (by researcher, by GitHub, etc.)
- Researcher submits paper 🎉
At this point, the paper is based on primary branch `HEAD` and files in secondary workspace directories.
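To make the mismatch concrete, here is a minimal Python sketch of the state reached above. It is a toy model only: the `Repo` and `Workspace` classes, the branch names `main` and `revision`, and the commit IDs `c1` and `c2` are all hypothetical, and nothing here reflects how job-server actually stores workspaces. It also assumes the merge is a fast-forward; with a merge commit, primary branch `HEAD` would be a new commit rather than `c2`, but the primary workspace files would still come from `c1`.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Repo:
    """Toy repository: maps each branch name to the ID of its HEAD commit."""
    heads: Dict[str, str] = field(default_factory=dict)

    def commit(self, branch: str, commit_id: str) -> None:
        self.heads[branch] = commit_id

    def create_branch(self, source: str, new: str) -> None:
        self.heads[new] = self.heads[source]

    def merge_fast_forward(self, source: str, target: str) -> None:
        # Simplifying assumption: the target branch moves to the source's HEAD.
        self.heads[target] = self.heads[source]

    def delete_branch(self, branch: str) -> None:
        del self.heads[branch]


@dataclass
class Workspace:
    """Toy workspace: remembers which commit its files were produced from."""
    branch: str
    files_from: Optional[str] = None

    def run_jobs(self, repo: Repo) -> None:
        self.files_from = repo.heads[self.branch]


# First submission: primary branch and primary workspace.
repo = Repo()
repo.commit("main", "c1")
primary = Workspace(branch="main")
primary.run_jobs(repo)

# Revision: secondary branch and secondary workspace, then merge and delete.
repo.create_branch("main", "revision")
repo.commit("revision", "c2")
secondary = Workspace(branch="revision")
secondary.run_jobs(repo)
repo.merge_fast_forward("revision", "main")
repo.delete_branch("revision")

print(repo.heads["main"])              # c2
print(primary.files_from)              # c1
print(secondary.files_from)            # c2
print(secondary.branch in repo.heads)  # False
```

In this model the primary workspace files come from a commit that is older than primary branch `HEAD`, while the only workspace whose files match `HEAD` is associated with a branch that no longer exists.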
The paper is reviewed; further analysis is requested 🙁
- Should the researcher commit to primary branch? Files in primary workspace directories are behind files in secondary workspace directories. The researcher would need to run jobs in primary workspace.
- Should the researcher branch from primary branch, giving new secondary branch with same name as old secondary branch, and commit to new secondary branch? The researcher would not need to run jobs in secondary workspace.