bdk icon indicating copy to clipboard operation
bdk copied to clipboard

refactor(chain)! removed IndexedTxGraph 🗑

Open LLFourn opened this issue 1 year ago • 3 comments

Fixes https://github.com/bitcoindevkit/bdk/issues/1147

Removes IndexedTxGraph in favour of adding a type parameter to TxGraph. When the second type parameter X: Indexer is set then TxGraph behaves like IndexedTxGraph used to. This simplifies code and improves ergonomics.

The motivation for doing this now this is because doing this later would be a breaking change for bdk_wallet and I felt it was irritating to work with.

I reworked the internals of TxGraph as I thought things were a bit convoluted.

Notes to the reviewers

Changelog notice

  • Remove IndexedTxGraph in favour of adding a type parameter to TxGraph.

Checklists

All Submissions:

  • [x] I've signed all my commits
  • [x] I followed the contribution guidelines
  • [x] I ran cargo fmt and cargo clippy before committing
  • [x] I'm linking the issue being fixed by this PR

LLFourn avatar Jul 11 '24 06:07 LLFourn

Concept ACK

ValuedMammal avatar Jul 11 '24 14:07 ValuedMammal

I added this to the beta milestone but TBH I'm worried about trying to get another big refactor in on top of the other refactoring going on.

notmandatory avatar Jul 11 '24 15:07 notmandatory

Fine to de-prioritize this one. It's here for when it's useful.

LLFourn avatar Jul 12 '24 01:07 LLFourn

I'm interested in updating this PR if @LLFourn hasn't gotten to it.

ValuedMammal avatar Feb 03 '25 13:02 ValuedMammal

In this PR tx_graph::ChangeSet gains a indexer field and a generic type which defaults to (). Would it violate backwards compatibility for the wallet changeset (below) to simply adopt the new type which is otherwise identical? Or does it necessitate keeping the old tx graph changeset and provide an upgrade path, say from v0 to v1?

https://github.com/bitcoindevkit/bdk/blob/88330f603cb415d01c88f1a579f20a21cb8c1658/crates/wallet/src/wallet/changeset.rs#L11-L24

ValuedMammal avatar Feb 03 '25 14:02 ValuedMammal

In this PR tx_graph::ChangeSet gains a indexer field and a generic type which defaults to (). Would it violate backwards compatibility for the wallet changeset (below) to simply adopt the new type which is otherwise identical? Or does it necessitate keeping the old tx graph changeset and provide an upgrade path, say from v0 to v1?

I think the new tx_graph::ChangeSet can be considered backward compatible because although it gains an extra field, the type itself implements Default.

edit: After talking on dev call we decided that, in the interest of not breaking the data model, it would be best to maintain tx_graph::ChangeSet<A> as it exists today - noting that interoperability between this type and the new changeset should be fairly trivial to achieve.

ValuedMammal avatar Feb 04 '25 14:02 ValuedMammal

@ValuedMammal are you suggesting that the new tx_graph::ChangeSet should have the same signature as the current indexed_tx_graph::ChangeSet?

I'm okay with this, but I still don't understand how this is better than having the indexer changeset as a new field in tx_graph::ChangeSet. Would you be about to let us know what was concluded in the call? Thanks.

evanlinjin avatar Feb 06 '25 08:02 evanlinjin

@ValuedMammal are you suggesting that the new tx_graph::ChangeSet should have the same signature as the current indexed_tx_graph::ChangeSet?

Not exactly. Assuming the new tx graph ChangeSet is defined as in this PR, how will that affect downstream users such as bdk_wallet who still depend the old tx graph changeset? My suggestion was to keep both types to maintain backwards compatibility.

If I understand, @tnull was of the opinion that making the wallet's changeset automatically become tx_graph::ChangeSet<ConfirmationBlockTime, ()> would violate the backward compatibility guarantee

ValuedMammal avatar Feb 06 '25 14:02 ValuedMammal

@ValuedMammal I'm not sure which "data model" we are referring to here. The one we support is sqlite where this is backwards compatible and transparent change. Are we talking about serde encodings? If so I actually like @evanlinjin's interpretation of what he thought you were saying which is to keep the new ChangeSet structurally exactly like the old IndexedTxGraph changeset. The only problem is to name them.

// tx_graph.rs
// the old indexed tx graph changeset
struct ChangeSet<A, I> {
    #[serde(alias="tx_graph")] // or whatever it was called 
    pub txs: TxChangeSet<A>, // the old tx graph changeset
    pub indexer: I
}

But if we are serious about keeping serde encodings compatible we should really think carefully about how to do that (perhaps by using a lot of #[serde(default)].

LLFourn avatar Feb 07 '25 08:02 LLFourn

Ok it makes sense to use the same structure as the original indexed_tx_graph::ChangeSet @LLFourn @evanlinjin

ValuedMammal avatar Feb 08 '25 17:02 ValuedMammal

I'm not sure which "data model" we are referring to here. The one we support is sqlite where this is backwards compatible and transparent change. Are we talking about serde encodings?

No, this is about maintaining at least backwards compatibility of the bdk_wallet::ChangeSet type and all its sub-types (note that we previously also discussed introducing support for forwards compatibility also, which we'd still love to support eventually).

I'm not sure I'm completely following the different changes proposed in this PR, but what was discussed on the recent call mentioned that fields would be moved between sub-types, which would be a breaking change, AFAIU.

tnull avatar Feb 10 '25 09:02 tnull

No, this is about maintaining at least backwards compatibility of the bdk_wallet::ChangeSet type and all its sub-types (note that we previously also discussed introducing support for forwards compatibility also, which we'd still love to support eventually).

We are making breaking changes to bdk_wallet's ChangeSet type in the next major version release. This is going to be part of that release. Even across major versions the sqlite persistence will be backwards compatible and it looks like the serde encoding too (likely codec specific). FWIW, The much more significant breaking change is bitcoindevkit/bdk#1811.

LLFourn avatar Feb 24 '25 00:02 LLFourn

I'm going to rebase this on master and remove the changes to bdk_wallet.

evanlinjin avatar Mar 06 '25 06:03 evanlinjin

We are making breaking changes to bdk_wallet's ChangeSet type in the next major version release. This is going to be part of that release. Even across major versions the sqlite persistence will be backwards compatible and it looks like the serde encoding too (likely codec specific). FWIW, The much more significant breaking change is bitcoindevkit/bdk#1811.

Excuse the delay here, I was off for a bit. FWIW, when we started to implement our serialization, I was ensured over and over again that custom serialization is fully supported by the BDK project and that ChangeSet and subtypes would only ever change in backwards compatible fashion. It seems this policy has been dropped (?) by now already, so tbh. I'm really starting to get worried that an upcoming BDK release will simply break compat for us entirely, which is a no-go.

I therefore think it's crucial to finally clearly document what kinds of changes can be expected, what is and will be guaranteed going forward by the ChangeSet APIs, and what users implementing their own serialization can reliably lean on. Right now I fear that we might otherwise end up with unreconcilable breaking changes.

And, while I generally try to keep an eye on what's happening in BDK and appreciate the pings on/pointers to particular PRs, it can't be expected that I review every PR for changes that might potentially break the previously agreed upon policy. The guarantees really need to be written down, and preferably even asserted via upgrade tests in CI to make they are not violated.

tnull avatar Mar 06 '25 08:03 tnull

@tnull Changeset serialization format is definitely going to be backwards compatible. In terms of Cargo semver, these are breaking changes.

evanlinjin avatar Mar 07 '25 00:03 evanlinjin

@tnull Changeset serialization format is definitely going to be backwards compatible. In terms of Cargo semver, these are breaking changes.

Hmm, okay, still confused what you mean by 'ChangeSet serialization format' here, as BDK doesn't provide (a canonical) one. Would be great to see these guarantees documented (i.e., spelled out in detail what exactly they mean) and tested. I'll shut up regarding this now.

tnull avatar Mar 07 '25 09:03 tnull

cACK 872fa4a3b1ae7ec3c6559b1083ddd35a10d2641c

nymius avatar Mar 10 '25 22:03 nymius

@tnull are you able to be specific with what you want?

I.e. it will be much more productive if you told us how you did serialization in LDK Node if it worries you that serialization would break here.

Serialization format is not changing at all in this PR. We never agreed on having a canonical serialization format for BDK (and I don't understand how it would help). We agreed that we will only add new and default-able fields on changesets (with serde(default)). This policy will work with all serialization formats that have delimiters.

evanlinjin avatar Mar 13 '25 00:03 evanlinjin

We had a discovery which lead to discussion about whether this is the right approach. For silent payments you need to change the Indexer trait to take in a &TxGraph so the indexer can look up its prevouts (which are needed to scan the transaction). Passing in a TxGraph<A, X> into the Indexer::index_tx method would be extremely awkward if Indexer was inside TxGraph<A, X> as the X parameter.

Removing IndexedTxGraph seemed like a no brainer zero cost clean up. But now we've discovered that there is a cost I think we should hold off on this. There are other ways to refactor things but I think the best way forward is to keep things as they are and reduce clutter through a macro that duplicates every not &mut method on TxGraph to IndexedTxGraph.

I'm closing this for now. Thanks for trying to keep this up to date @ValuedMammal. Feel free to reopen this if you disagree or have another idea.

LLFourn avatar Mar 14 '25 00:03 LLFourn

Excuse the delay here, I was off for a bit. FWIW, when we started to implement our serialization, I was ensured over and over again that custom serialization is fully supported by the BDK project and that ChangeSet and subtypes would only ever change in backwards compatible fashion

I'm sorry for the confusion wrt to this. It's possible we did change our guidance on this. We may need to make changes to ChangeSet types over time. Hopefully we get to a point where we can confidently say that these ChangeSets will never ever change in any way but we are not there yet.

LLFourn avatar Mar 14 '25 00:03 LLFourn

@tnull are you able to be specific with what you want?

Well, what I stated multiple times previously: clear, documented guidance on what guarantees implementors can lean on. So far, all the compatibility guarantees were committed to on Discord, or in live meetings, but it turns out there is increasing confusion on what they actually mean and whether I can expect to be able to upgrade to the next BDK version or not. 🤷‍♂️

I.e. it will be much more productive if you told us how you did serialization in LDK Node if it worries you that serialization would break here.

Well, I'm happy to point you to the code (1, 2), but that is also not how this should work: there should be an API contract clearly communicated and committed to by the BDK project. So neither do I expect you guys to read our code, nor do I want to constantly review BDK PRs just to check it doesn't break our stuff (to be clear, I'm happy to contribute some review here and there, but not for this reason).

Serialization format is not changing at all in this PR. We never agreed on having a canonical serialization format for BDK (and I don't understand how it would help). We agreed that we will only add new and default-able fields on changesets (with serde(default)). This policy will work with all serialization formats that have delimiters.

Well IIRC you agreed to a) only make backwards compatible changes (and solved it by implementing Default), b) have the individual sub-ChangeSets being persistable separately in a backwards compatible fashion and c) further look into making this forwards compatible eventually. Most of this can be found somewhere on Github and Discord, but that's really not the point, and not how such a central API should work.

@evanlinjin Note that the guarantees you just reiterated in your own words seem to be in conflict with what @LLFourn just summarized:

I'm sorry for the confusion wrt to this. It's possible we did change our guidance on this. We may need to make changes to ChangeSet types over time. Hopefully we get to a point where we can confidently say that these ChangeSets will never ever change in any way but we are not there yet.

So, to be honest, I basically read this as "all bets are off, we currently just care about whether or not our SQLite persistence breaks or not, nevermind what we said a few months back, please come back once we're done making further breaking changes". Please, if I'm misinterpreting this, let us define a clear, documented API contract that defines compatibility guarantees.

It could be that everything is fine and I'm worrying too much, but as the confusion on all sides seems to be increasing, I fear that one day I won't be able to upgrade BDK without writing tedious migration scripts, or, much worse, have the user discover in production that their serialized data doesn't work with the newest version of BDK.

tnull avatar Mar 14 '25 09:03 tnull