Appending changesets to `Store` after opening causes overwriting
Describe the bug
- Let's say a `Store` instance initially contains `changeset1` stored in its file.
  https://github.com/bitcoindevkit/bdk/blob/d99b3ef4b449a5c7b8facf04db3cb9be829a0ac1/crates/file_store/src/store.rs#L14
- Then, after opening the store instance using `Store::open` and appending a new changeset (`changeset2`) using `append_changeset`, the store will only contain `changeset2`, not `changeset1 + changeset2`.
  https://github.com/bitcoindevkit/bdk/blob/d99b3ef4b449a5c7b8facf04db3cb9be829a0ac1/crates/file_store/src/store.rs#L73
  https://github.com/bitcoindevkit/bdk/blob/d99b3ef4b449a5c7b8facf04db3cb9be829a0ac1/crates/file_store/src/store.rs#L162
To Reproduce
```rust
use std::collections::BTreeSet;
use std::io::{Seek, SeekFrom};
// `Store`, `TestChangeSet` (= BTreeSet<String>) and `TEST_MAGIC_BYTES` are the
// definitions from the crate's test module.

#[test]
fn test() {
    let temp_dir = tempfile::tempdir().unwrap();
    let file_path = temp_dir.path().join("db_file");
    let changeset1 = BTreeSet::from(["hello".to_string(), "world".to_string()]);
    let changeset2 = BTreeSet::from(["got the error".to_string()]);

    {
        // create new db
        let mut db = Store::<TestChangeSet>::create_new(&TEST_MAGIC_BYTES, &file_path)
            .expect("must create");
        // append first changeset to db
        db.append_changeset(&changeset1).expect("must succeed");
    }

    {
        // open db
        let mut db = Store::<TestChangeSet>::open_or_create_new(&TEST_MAGIC_BYTES, &file_path)
            .expect("failed to load db");
        // now append the second changeset
        db.append_changeset(&changeset2).expect("must succeed");
        // retrieve stored changesets from the database
        let stored_changesets = db
            .aggregate_changesets()
            .expect("must succeed")
            .expect("must succeed");
        // the expected changeset is changeset1 + changeset2
        let mut expected_changeset = changeset2.clone();
        expected_changeset.extend(changeset1);
        // but the store only contains changeset2, not the aggregate
        assert_eq!(stored_changesets, changeset2);
        assert_ne!(stored_changesets, expected_changeset);
    }

    // Reason:
    //
    // On loading a pre-existing store, it contains all the previously appended
    // changesets, but the file pointer is positioned back at offset 12, i.e.
    // just past the TEST_MAGIC_BYTES array. Appending any new changeset
    // therefore overwrites the past changesets, so the store ends up
    // containing only the latest changeset.

    // Open the store again to verify the file pointer position after loading.
    let mut db =
        Store::<TestChangeSet>::open(&TEST_MAGIC_BYTES, &file_path).expect("failed to load db");
    // position of the file pointer just after loading the store
    let current_pointer = db.db_file.stream_position().expect("must succeed");
    assert_eq!(current_pointer, 12);
    // position of the end of the loaded db file
    let expected_pointer = db.db_file.seek(SeekFrom::End(0)).expect("must succeed");
    // the two don't match
    assert_ne!(current_pointer, expected_pointer);
}
```
Expected behavior
- On opening the `Store`, its file pointer must be set to the end of the file.
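A minimal sketch of that expectation, reusing the test-module names from the reproduction above (the direct `db_file` access works inside the crate's own test module, as in the test); the explicit `seek` is the step `open` arguably should perform itself:

```rust
use std::io::{Seek, SeekFrom};

// After validating the magic bytes, `open` could move the cursor to EOF so
// that a following `append_changeset` cannot overwrite existing entries.
let mut db =
    Store::<TestChangeSet>::open(&TEST_MAGIC_BYTES, &file_path).expect("failed to load db");
db.db_file.seek(SeekFrom::End(0)).expect("must seek"); // the missing step
db.append_changeset(&changeset2).expect("must succeed"); // now lands at EOF
```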
Build environment
- BDK tag/commit: d99b3ef4b449a5c7b8facf04db3cb9be829a0ac1
- OS+version: Fedora 37
- Rust/Cargo version: rustc 1.81.0-nightly (f21554f7f 2024-06-08)
- Rust/Cargo target: nightly-2024-06-09-x86_64-unknown-linux-gnu
@evanlinjin could this happen with the changes in #1514?
Thanks for writing up this ticket.
The intended use case of `bdk_file_store` is with BDK structures. With BDK, we never read after writing.
Maybe we can structure things to be more fool-proof. Need to think more on this.
Thanks @KnowWhoami. Maybe `Store::open` should always seek the file cursor to EOF? If this is truly a bug then I'm surprised all the tests and examples work how we expect using the file store.
> If this is truly a bug then I'm surprised all the tests and examples work how we expect using the file store
- That's because tests that open a pre-existing `Store` usually run the `aggregate_changesets` API, which internally calls `iter_changesets` to collect all the stored changesets. Iterating over the stored entries also advances the store's file pointer, moving it to the end of the file.
  https://github.com/bitcoindevkit/bdk/blob/d99b3ef4b449a5c7b8facf04db3cb9be829a0ac1/crates/file_store/src/store.rs#L135
  https://github.com/bitcoindevkit/bdk/blob/d99b3ef4b449a5c7b8facf04db3cb9be829a0ac1/crates/file_store/src/store.rs#L238
- Thus, it works fine if we append any new changesets after that (see the sketch below).
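In other words, with the current behavior a caller can sidestep the overwrite by aggregating immediately after opening; a minimal sketch reusing the names from the reproduction above:

```rust
// Aggregating right after `open` iterates all entries, which moves the file
// pointer to EOF as a side effect; the append below is then safe.
let mut db =
    Store::<TestChangeSet>::open(&TEST_MAGIC_BYTES, &file_path).expect("failed to load db");
let _ = db.aggregate_changesets().expect("stored changesets must decode");
db.append_changeset(&changeset2).expect("must succeed"); // appends after changeset1
```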
> The intended use case of `bdk_file_store` is with BDK structures. With BDK, we never read after writing. Maybe we can structure things to be more fool-proof. Need to think more on this.
- Yeah, that's true. BDK wallet APIs don't care much about the internal workings of the database; they just need to read the db to get the required changesets and append new ones if any are found. That's it.
- But `bdk_file_store` is mainly developed for downstream users so that they can use it directly in their BDK-based wallets, so this bug should be taken into consideration.
I think it would help if we combine opening and loading into one `read` method that opens the file and returns the aggregate changeset, thereby ensuring the next write goes at the end of the file.
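A sketch of what such a combined method could look like (the name `read` and the simplified error handling are illustrative, not an agreed design); because aggregation walks the entry iterator to EOF, the returned store is safe to append to:

```rust
// (sketch: inside the existing `impl` block for `Store<C>`)
pub fn read<P>(magic: &[u8], file_path: P) -> Result<(Self, Option<C>), FileError>
where
    P: AsRef<Path>,
{
    let mut store = Self::open(magic, file_path)?;
    // Aggregating consumes the entry iterator, leaving the cursor at EOF, so
    // the next `append_changeset` goes at the end of the file. Error handling
    // is simplified here; a real version would surface the aggregate error.
    let aggregate = store.aggregate_changesets().map_err(|_| {
        FileError::Io(io::Error::new(
            io::ErrorKind::InvalidData,
            "failed to decode a stored changeset",
        ))
    })?;
    Ok((store, aggregate))
}
```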
I would like to work on this, but I've got some questions:
- Can't there be a situation where a user doesn't want to aggregate changesets?
- Is there any reason it's not desirable to store multiple changesets without aggregating them, as in the following succession?
1st changeset: open-append-close
2nd changeset: open-append-close
3rd changeset: open-append-close
I'm thinking of implementing something like the read method @ValuedMammal mentions above, but just using `iter_changesets` to verify the correctness of the stored changesets, without modifying them.
That would place the file pointer at the real end of the file and check the existing changesets for write errors, leaving to the user the decision of aggregating changesets, or keeping on appending them until further decision.
Perhaps an easy fix is to change the file mode to `append` here instead of only `write`:
https://github.com/bitcoindevkit/bdk/blob/2cf46a2a9e513d3f2612b7c363446fb1acf958ae/crates/file_store/src/store.rs#L73-L77
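For reference, the experiment being described amounts to roughly this change to the `OpenOptions` used by `Store::open` (paraphrased, not the exact source):

```rust
use std::fs::OpenOptions;

// In append mode the OS forces every write to the end of the file, so the
// cursor position no longer matters for `append_changeset`.
let mut f = OpenOptions::new()
    .read(true)
    .append(true) // was: .write(true)
    .open(file_path)?;
```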
After quick testing this appears to fix the issue in the original post; however, it caused other test failures, in particular `last_write_is_short` and `write_after_short_read`.
In particular, `write_after_short_read` clearly proves what @evanlinjin stated above: never read after write. It asserts explicitly that the remaining changesets after the "short read" are overwritten.
This is the idea. I have tested it against the current file store tests and a modification of the test above to check the expected behavior, and it works.
Pros:
- Doesn't modify current API
Cons:
- Extra method
- Extra enum variant to wrap `IterError` inside `FileError`
```diff
diff --git a/crates/file_store/src/lib.rs b/crates/file_store/src/lib.rs
index 7c943ca2..236995e9 100644
--- a/crates/file_store/src/lib.rs
+++ b/crates/file_store/src/lib.rs
@@ -18,6 +18,8 @@ pub enum FileError {
     Io(io::Error),
     /// Magic bytes do not match what is expected.
     InvalidMagicBytes { got: Vec<u8>, expected: Vec<u8> },
+    /// Something went wrong while decoding file content
+    WrongEncoding(entry_iter::IterError)
 }
 
 impl core::fmt::Display for FileError {
@@ -29,6 +31,7 @@ impl core::fmt::Display for FileError {
                 "file has invalid magic bytes: expected={:?} got={:?}",
                 expected, got,
             ),
+            Self::WrongEncoding(e) => e.fmt(f)
         }
     }
 }
diff --git a/crates/file_store/src/store.rs b/crates/file_store/src/store.rs
index 62c3d91b..fdd7523c 100644
--- a/crates/file_store/src/store.rs
+++ b/crates/file_store/src/store.rs
@@ -92,6 +92,31 @@ where
         })
     }
 
+    /// Open [`Store`] and recover state after last write.
+    ///
+    /// Use [`open`] to open `Store` and manually handle the previous state if any.
+    /// Use [`create_new`] to create a new `Store`.
+    ///
+    /// # Errors
+    ///
+    /// If the prefixed bytes of the opened file does not match the provided `magic`, the
+    /// [`FileError::InvalidMagicBytes`] error variant will be returned.
+    ///
+    /// [`create_new`]: Store::create_new
+    pub fn rewind<P>(magic: &[u8], file_path: P) -> Result<Self, FileError>
+    where
+        P: AsRef<Path>,
+    {
+        let mut store = Self::open(magic, file_path)?;
+        for next_changeset in store.iter_changesets() {
+            if let Err(iter_error) = next_changeset {
+                return Err(FileError::WrongEncoding(iter_error));
+            }
+        }
+
+        Ok(store)
+    }
+
     /// Attempt to open existing [`Store`] file; create it if the file is non-existent.
     ///
     /// Internally, this calls either [`open`] or [`create_new`].
```
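With this patch, the reproduction from the issue would call the new constructor instead of `open`; a sketch of the call site:

```rust
// `rewind` validates every stored entry and, by iterating them, leaves the
// file cursor at EOF, so this append cannot clobber changeset1.
let mut db =
    Store::<TestChangeSet>::rewind(&TEST_MAGIC_BYTES, &file_path).expect("failed to load db");
db.append_changeset(&changeset2).expect("must succeed");
```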
@nymius I do think it's reasonable to expect that the "open-append-close" flow should work without overwriting as you mentioned in https://github.com/bitcoindevkit/bdk/issues/1517#issuecomment-2366980177.
> With BDK, we never read after writing.
^ To be honest I'm a little confused what is meant by this. Edit: I may have been lacking context in which this is actually true; it makes sense over the lifetime of one `Store` instance.
If we assume the tests are correct, i.e. that `append_changeset` should write directly after the current file position (even if that means overwriting existing data), then I think the read approach makes sense (similar to your `rewind` idea). The issue I have with just adding this method (`rewind`), though, is that it doesn't fix the problem with `Store::open` as pointed out by @KnowWhoami, and in the event the caller does want to load the contents of the database, the result is that we would effectively be doing `iter_changesets` twice?
On the other hand, if we change the file mode to append, it would ensure that `append_changeset` always writes at the EOF regardless of the file position (if I understand correctly), but we'd then have to redefine the test expectations with regard to truncating, etc.
@nymius Revisiting the open-append-close idea:
Isn't it the case that we will always want to load the backend upon opening the file? For example in the context of a bdk wallet we have to load everything into memory in order to initialize the wallet. Without this you would have nothing to "append".
> Isn't it the case that we will always want to load the backend upon opening the file? For example in the context of a bdk wallet we have to load everything into memory in order to initialize the wallet. Without this you would have nothing to "append".
I agree. Unless there was a precondition forcing a rescan on each open (which wasn't documented), I think it's the most logical thing to do.
I did a summary of the possible solutions I thought about:

1. Another method like `rewind` or `read` implementing the safe behavior of reading content before proceeding with any writes. I think it works "on paper" but doesn't solve the underlying problem with `open`. It's just a workaround for those expecting the functionality explained in the issue.
2. Iterating changesets on `Store` opening. This also works partially, as any encoding error causes the failure of the method and doesn't recover the already parsed changesets the way `AggregateChangesetsError` allows.
3. Changing the opening mode of the file to `append`. This causes `store::test::append_changeset_truncates_invalid_bytes` to fail because it won't check the sanity of the previous data before allowing `append_changeset` to write after it, causing any failing file store to keep breaking the store unnoticed.
4. Iterating changesets on the `Store::open` call but silently truncating the file to the last successfully read changeset if any error occurs. Works; the only failing test is `store::test::write_after_short_read`, but the problem is in the test itself, as it expects the file pointer to be reset each time the file is opened. I don't like the UX of this: the user will be losing information without noticing.
5. Iterating changesets on the `Store::open` call, silently truncating the file to the last successfully read changeset if any error occurs, but returning the index of the changeset causing the failure. Same as above, but improving the UX: now the user knows there is a problem with their changeset store and can decide to do a rescan. Also, we could implement a complementary method to merge changesets up to some changeset index, allowing users to work in the meantime with the non-corrupted data. Not sure how useful this is.
6. Aggregating changesets on each `Store` open by default. Should work the same as above but returning a partially parsed changeset. It would also aggregate any changeset in the file, which may not be desired by some users, or at least we should check it's not a current use case for someone. The difficulty with this change is that it requires the addition of a new variant to the `FileError` enum, and as this variant wraps the `AggregateChangesetsError`, it would require a generic for the partially aggregated changeset, creating a domino effect of complexity, making it a difficult change.
7. Aggregating changesets on each `Store` open by default but just returning the error of the `AggregateChangesetsError` and ignoring the partially aggregated changeset. This is an evolution of the solution above, avoiding the complexity of the generic implied by the partial aggregation.
I've tried all of the above. The majority of them make the following tests fail, except for 1 and 4:

- `store::test::append_changeset_truncates_invalid_bytes`: a premature `iter_changesets` in the open method detects that the data is wrongly formatted but doesn't return the partially read changesets, as the test expects.
- `store::test::last_write_is_short`: a premature `iter_changesets` will fail if the file has been truncated in the middle of a changeset. The expected behavior is to open it normally but fail to aggregate and return the partially aggregated changeset inside the error.
- `store::test::write_after_short_read`: fails because the expected behavior is to not update the file pointer on open and just overwrite whatever was there before. Instead, the open reads the stored changesets, updating the file pointer, and then proceeds to append any new changeset, effectively avoiding the loss of the previous changesets.
My preferred solution is 5 (sketched below), and my second choice is 7.
Why 5 over 7? Because 5 gives additional information about the failing changeset, allows a partial recovery, and also reports the error, while 7 just reports the error without offering the previous alternatives.
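For concreteness, a sketch of what option 5 could look like (the name `open_and_truncate` is hypothetical; it assumes the entry iterator rewinds the cursor to the start of the failing entry before yielding the error, as the current `EntryIter` does):

```rust
// (sketch: inside the existing `impl` block for `Store<C>`)
pub fn open_and_truncate<P>(magic: &[u8], file_path: P) -> Result<(Self, Option<usize>), FileError>
where
    P: AsRef<Path>,
{
    let mut store = Self::open(magic, file_path)?;
    let mut failed_index = None;
    for (index, changeset) in store.iter_changesets().enumerate() {
        if changeset.is_err() {
            failed_index = Some(index);
            break;
        }
    }
    if failed_index.is_some() {
        // The iterator rewound to the start of the corrupt entry, so the
        // current position marks the end of the last valid changeset.
        let last_good = store.db_file.stream_position().map_err(FileError::Io)?;
        store.db_file.set_len(last_good).map_err(FileError::Io)?;
    }
    // The cursor now sits at the end of the valid data, so appends are safe.
    Ok((store, failed_index))
}
```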
It should be relatively painless if we keep all the same semantics of `Store` and build a safer API around it, so I want to go back to your first suggestion and add a method (call it `reopen`) that internally does `iter_changesets` to check the integrity of the db before returning a new `Store` instance. We can deprecate the current `open`, and in the future we'll still use it internally; it just won't be marked `pub`. These are suggestions you can feel free to adapt.
```rust
/// Reopen an existing [`Store`]. Errors if...
pub fn reopen<P>(magic: &[u8], file_path: P) -> Result<Self, OpenError>
where
    P: AsRef<Path>,
{
    ...
}
```
We can have a new error `OpenError` that wraps the `IterError`. If we want to add a new and improved `open_or_new` for convenience, then I guess `OpenError` will have to be an enum with a variant for `FileError` as well.
```rust
/// An error while opening or creating the file store
#[derive(Debug)]
pub enum OpenError {
    /// entry iter error
    EntryIter {
        /// index that caused the error
        index: usize,
        /// iter error
        iter: IterError,
    },
    /// file error
    File(FileError),
}
```
We should be able to keep the existing tests and add coverage for the new functionality fixing the issue. If there's no interest in going the `#[deprecated]` route, then the suggestion is to make the current `open` a private method now and replace it with what I'm calling `reopen` above.
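A sketch of how `reopen` could be implemented on top of the now-private `open`, using the `OpenError` enum above (again a suggestion to adapt, not a final design):

```rust
pub fn reopen<P>(magic: &[u8], file_path: P) -> Result<Self, OpenError>
where
    P: AsRef<Path>,
{
    let mut store = Self::open(magic, file_path).map_err(OpenError::File)?;
    // Walk every entry to verify integrity; on success the cursor ends up at
    // EOF, so subsequent appends go at the end of the file.
    for (index, changeset) in store.iter_changesets().enumerate() {
        if let Err(iter) = changeset {
            return Err(OpenError::EntryIter { index, iter });
        }
    }
    Ok(store)
}
```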