feat: Cache revision list for repository trees #WIP
We need to change this to handle out-of-order tree insertion. That means both append and prepend need to work, and we're going to use a signed order to manage the state. You WILL always need an existing hash in the RepositoryTree (or an empty tree) to insert, but order will no longer matter. This means that import_repo doesn't need to reverse references, but sync_repo will likely need tweaking.
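A rough sketch of the signed-order idea, to make the append/prepend semantics concrete. This is an in-memory stand-in, not the real model: TreeSketch, the anchor argument (playing the role of the parent arg), and the choice that order grows toward newer commits are all assumptions. It also assumes append drops rows newer than the anchor and prepend drops rows older than it, and that an empty tree accepts any insert.

```python
class TreeSketch:
    """Hypothetical in-memory stand-in for one branch of RepositoryTree."""

    def __init__(self):
        self.order = {}  # sha -> signed order (assumed: larger means newer)

    def _base(self, anchor, pick):
        """Return the anchor's order, or None when the tree is empty."""
        if not self.order:
            return None  # assumption: an empty tree accepts any insert
        if anchor is None:
            # No anchor given: use the newest (max) or oldest (min) row.
            anchor = pick(self.order, key=self.order.get)
        if anchor not in self.order:
            raise KeyError(f"{anchor} is not in the tree")
        return self.order[anchor]

    def append(self, shas, anchor=None):
        """Insert shas (oldest first) on the newer side of anchor."""
        base = self._base(anchor, max)
        if base is None:
            start = 0
        else:
            # Delete only rows newer than the anchor; older rows survive.
            self.order = {s: o for s, o in self.order.items() if o <= base}
            start = base + 1
        for i, sha in enumerate(shas):
            self.order[sha] = start + i

    def prepend(self, shas, anchor=None):
        """Insert shas (oldest first) on the older side of anchor."""
        base = self._base(anchor, min)
        if base is None:
            start = 1 - len(shas)
        else:
            # Delete only rows older than the anchor; newer rows survive.
            self.order = {s: o for s, o in self.order.items() if o >= base}
            start = base - len(shas)
        for i, sha in enumerate(shas):
            self.order[sha] = start + i

    def log(self):
        """Return shas newest first."""
        return sorted(self.order, key=self.order.get, reverse=True)


tree = TreeSketch()
tree.append(["c", "d"])   # empty tree: orders 0, 1
tree.prepend(["a", "b"])  # lands at -2, -1 without touching c or d
tree.append(["e"])        # lands at 2 without touching a or b
```

Because the order is signed, prepending never has to renumber existing rows; it just counts down below the oldest one.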
Still need tests for various edge cases:
- Ensure prepend/append don't wipe the other side of the tree (i.e. the order/delete query is correct)
- Both need full coverage for parent arg (existing), parent arg (not existing), no parent arg
Feel confident with the code (needs more tests yet), so going to move on to import/sync.
One challenge I have right now is importing all trees. I might just ignore any that are not default_branch. Then the question is: how do we import a tree? I don't think there's a good way with the current import mechanism, as it tries to import all commits. We likely need separate import_tree/sync_tree tasks which operate similarly to the repository tasks. The caveat here is that we have foreign keys to revisions, so we can't actually update the tree until the revisions exist.
One option is hijacking RevisionResult.save() and using branches as a mechanism to track the trees. That's not going to be overly efficient, unfortunately. We might still be able to leverage that data and just batch trees in import_repo. This would likely collect every tree that we've annotated from one call to vcs.log and call RepositoryTree.prepend_tree for each reference. I think order is guaranteed here, so there's no concern about chunking revisions incorrectly.
Lastly, we will likely want to implement RepositoryTree.log(branch) with the expectation that it outputs identical results to vcs.log(branch).
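A minimal sketch of what log(branch) could look like on top of the signed order. The row shape (branch, order, sha) and the tree_log name are hypothetical, and it assumes vcs.log yields newest first, so the tree is read in descending order.

```python
def tree_log(rows, branch):
    """Return the shas for one branch, newest first.

    rows is an iterable of (branch, order, sha) tuples standing in for
    RepositoryTree rows; order is assumed unique within a branch.
    """
    matching = [(order, sha) for b, order, sha in rows if b == branch]
    return [sha for order, sha in sorted(matching, reverse=True)]


rows = [
    ("main", 0, "c"),
    ("main", -1, "b"),
    ("main", 1, "d"),
    ("dev", 0, "z"),
]
main_log = tree_log(rows, "main")
```

A parity test would then compare this list against the shas returned by vcs.log for the same branch.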
Can't do what I was hoping with import_repo as we need to have the parent_sha to confirm the tree is correct.
I think instead we'll import all of the revisions, and then import default_branch (using separate tasks).
Also need to sort out copying trees as otherwise this will get super expensive whenever a new branch is pushed up.
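One possible shape for the copy, as a sketch only: when a new branch forks from a sha that an existing tree already holds, reuse that tree's rows up to the fork point instead of walking history again. copy_tree, fork_sha, and the dict-of-orders representation are all hypothetical, and this assumes larger order means newer.

```python
def copy_tree(source_order, fork_sha):
    """Copy rows up to and including fork_sha from an existing tree.

    source_order maps sha -> signed order for the source branch. The
    result can seed the new branch, leaving only commits newer than
    fork_sha to be appended.
    """
    if fork_sha not in source_order:
        raise KeyError(f"{fork_sha} is not in the source tree")
    cutoff = source_order[fork_sha]
    return {sha: o for sha, o in source_order.items() if o <= cutoff}


default_branch = {"a": -1, "b": 0, "c": 1}
new_branch = copy_tree(default_branch, "b")
```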