centraldogma
centraldogma copied to clipboard
Provide a way to remove old commits
Motivation: Central Dogma uses jGit to store data. Due to the nature of Git that stores unlimited history, Central Dogma will eventually get in trouble managing disk usage. We can handle this by maintaining the primary and secondary Git repositories internally. This works in this way:
- Commits are pushed to the primary Git repository.
- If the number of commits exceeds the threshold (
minRetentionCommits
), then the secondary Git repository is created. - Commits are pushed to both primary and secondary Git repositories.
- If the secondary Git repository has the number of commits more than the threshold;
- The secondary Git repository is promoted to the primary Git repository.
- The primary Git repository is removed completely.
- Another secondary Git repository is created.
- Back to the third.
Modifications:
- Add
CreateRollingRepositoryCommand
that creates the rolling repository by theCommitRetentionManagementPlugin
. - Add
GitRepositoryV2
that manages the rolling jGit repositories.- The name of internal repositories will be:
foo_0000000000
,foo_0000000001
,foo_0000000002
and so on- So that we can only store the suffix of the primary repo to maintain the repository. (i.e. We don't have to maintain the directory information of the secondary repo.)
-
RepositoryMetadataDatabase
has the suffix in its file database.
-
GitRepository
is not removed for the migration test.
- The name of internal repositories will be:
- Add
InternalRepository
that has jGit repository andCommitIdDatabase
. - What happens if the revision of an operation(such as
diff
,watch
,history
, etc.) is lower than thefirstRevision
of the current primary repo?- If
Revision.INIT(1)
is used, thefirstRevision
is used instead.- e.g.
diff(INIT, new Revision(100) ...)
is equals todiff(new Revision(firstRevisionNumber), new Revision(100) ...)
- e.g.
- If the
Revision
betweenRevision.INIT(1)
and thefirstRevision
is used, aRevisionNotFoundException
is raised.- except
watch
andfindLatestRevision
.
- except
- If
Result:
- Close #575
- A repository now removes the commits that are older than
minRetentionCommits
while keeping the commits made in the recentminRetentionDay
.
Todo:
- Provide a way to set
minRetentionCommits
andminRetentionDay
for each repository. - Unused internal repositories are removed by purging service.
- Support mirroring from CD to external Git.
Codecov Report
Patch coverage: 70.19%
and project coverage change: -2.17%
:warning:
Comparison is base (
079daeb
) 65.69% compared to head (84cb6cd
) 63.53%.
Additional details and impacted files
@@ Coverage Diff @@
## main #681 +/- ##
============================================
- Coverage 65.69% 63.53% -2.17%
- Complexity 3351 3473 +122
============================================
Files 358 366 +8
Lines 13893 15093 +1200
Branches 1496 1687 +191
============================================
+ Hits 9127 9589 +462
- Misses 3916 4591 +675
- Partials 850 913 +63
Files Changed | Coverage Δ | |
---|---|---|
...ecorp/centraldogma/server/CentralDogmaBuilder.java | 53.38% <0.00%> (-0.82%) |
:arrow_down: |
...orp/centraldogma/server/CommitRetentionConfig.java | 0.00% <0.00%> (ø) |
|
...p/centraldogma/server/command/AbstractCommand.java | 47.61% <ø> (ø) |
|
.../linecorp/centraldogma/server/command/Command.java | 74.35% <0.00%> (-13.52%) |
:arrow_down: |
...server/command/CreateRollingRepositoryCommand.java | 0.00% <0.00%> (ø) |
|
...ogma/server/command/StandaloneCommandExecutor.java | 82.25% <0.00%> (-3.46%) |
:arrow_down: |
...internal/storage/DirectoryBasedStorageManager.java | 63.31% <ø> (ø) |
|
...ver/internal/storage/repository/CacheableCall.java | 65.21% <ø> (ø) |
|
...internal/storage/repository/RepositoryWrapper.java | 22.85% <0.00%> (-2.15%) |
:arrow_down: |
...al/storage/repository/cache/CachingRepository.java | 92.36% <0.00%> (-2.95%) |
:arrow_down: |
... and 16 more |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
If
Revision.INIT(1)
is used, thefirstRevision
is used instead.
This somewhat sounds like a surprising behavior to me. This means a diff command that once worked as expected starts to return completely different result once rolling occurs.
Can we instead treat any revision between INIT
(inclusive) and firstRevision
(exclusive) as an 'empty' state where the repository contains no files at all?
This issue seems to be a necessary feature to keep the cluster running reliably.
@aidanbae Sorry for making you wait for this. I'm working on it now. Please stay tuned.
Can we instead treat any revision between INIT (inclusive) and firstRevision (exclusive) as an 'empty' state where the repository contains no files at all?
That makes sense. Let me apply that approach.
Can we instead treat any revision between INIT (inclusive) and firstRevision (exclusive) as an 'empty' state where the repository contains no files at all?
Had a chat with the team, and we decided to raise an exception instead of treating it as an empty state.
This PR is ready, so PTAL. 😉