content: require backing store for checkpoint
Problem: A backing store is required for content.flush but it is not required for content.checkpoint-put. This is inconsistent and can lead to checkpointing problems done the line.
Require content.checkpoint-put to only work if there is a backing store available. As a consequence, remove code that handled "cached" checkpoints when a backing store is not available.
rebased and re-pushed, just didn't want this series of fixes to be forgotten :-) (along w/ #6240 and #6260)
I guess one caveat to this change is that, if running without a backing store, you could no longer reload the KVS module without losing all the data. Do we care about that?
I guess one caveat to this change is that, if running without a backing store, you could no longer reload the KVS module without losing all the data. Do we care about that?
I think it's ok?
The reason for this fix was simply the inconsistency. a content-flush + checkpoint-put is (will always be?) a combination to be done. One of those can work without a backing store right now, but the other one can't. So it sort of doesn't make sense.
So presumably the alternative would be to try and support content-flush and checkpoint-put if a backing store isn't loaded?
So what happens if you call flux_kvs_commit(FLUX_KVS_SYNC) or kvs_checkpoint_commit() without a backing store?
I'm open to requiring a backing store if that cleans things up a bit, e.g. makes those functions work as they should for all configurations. I'm not sure we found there was much benefit to running without one and if I recall we have the ability to tell content-sqlite to run from memory only.
So what happens if you call flux_kvs_commit(FLUX_KVS_SYNC) or kvs_checkpoint_commit() without a backing store?
FLUX_KVS_SYNC will fail, b/c content.flush fails when there is no backing store. A FLUX_KVS_SYNC is a content.flush followed by a checkpoint_commit().
kvs_checkpoint_commit() right now can work w/o a backing store, basically caching the checkpoint in memory.
The inconsistency between content.flush and checkpoint-put is what started this and the follow up PRs. Basically we want to always content.flush + checkpoint-put, yet one will fail without a backing module and the other won't. In addition, I think that checkpoint-put not requiring a backing store is sort of the opposite of what you would think checkpoint-put does.
Let's get this in and we can decide where we want to go from here.
Sounds good to me.
For those following along, we've already had some thoughts / discussion on this #6630
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 83.83%. Comparing base (
b9ea17a) to head (17ff530). Report is 238 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #6255 +/- ##
=======================================
Coverage 83.83% 83.83%
=======================================
Files 535 535
Lines 89397 89313 -84
=======================================
- Hits 74942 74873 -69
+ Misses 14455 14440 -15
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/modules/content/cache.c | 85.43% <ø> (-0.03%) |
:arrow_down: |
| src/modules/content/checkpoint.c | 77.86% <100.00%> (-1.16%) |
:arrow_down: |
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.