rippled
rippled copied to clipboard
Use strongly-typed AcctRoot class to internally represent account root ledger objects
High Level Overview of Change
Up to this point the server's internal code has presented every ledger object type to programmers as a
std::shared_ptr<STLedgerEntry>or as astd::shared_ptr<STLedgerEntry const>.
The choice of representation was based on whether or not the returned object was mutable.
That's fine, and it has gotten us a long way. But it has some limitations.
The different types of objects in the ledger are in fact, different. We use the same type, a std::shared_ptr<STLedgerEntry>, to represent an account root, a directory node, and a trust line (RippleState). These things are very different, and programmers think about them them differently. But we're treating all three of these types (and many more) as identical with the C++ type system.
It doesn't have to be like that, and this pull request is a step along that journey.
Context of Change
We've been considering ways to have ledger objects take up less space in the ledger. But many the changes that these space reductions would entail are hard to manage given the ad hoc way that we handle ledger entry objects. It would be nice to put a uniform face on, say, a TrustLine. Then any funny stuff we do to save space can be hidden behind that uniform face. We don't have to pollute every individual piece of code that operates on TrustLine with the funny stuff. We can hide any funny stuff behind the uniform face.
So that's the motivation for the change. But it's not the only benefit. If an account root has the type AcctRoot, and we pass a reference to it, then the receiving end knows at compile time that they got an AcctRoot. As it stands today the receiving end gets a std::shared_ptr<STLedgerEntry> which could be an account root or a signer list or a ticket or an escrow or something else altogether.
Techniques
To achieve this end we'll use the following techniques:
- We'll make our
Keylets strongly typed. So each kind ofKeyletis a distinct type that knows exactly which type it is supposed to retrieve from the ledger at compile time. - These strongly typed
Keylets will allow theAppyVieworReadViewto know which ledger object type needs to be returned. - We'll use templates to reduce the boiler plate involved in making strongly typed
Keylets and ledger object types.
Steps
This pull request contains early stages toward making all ledger objects strongly typed. But it does not finish the job. This pull request has three separate commits, each of which takes us one step along the way:
-
The first commit introduces a base class to
Keylet. This gives us a place to put common behavior that is shared between all of the strongly typedKeylets. -
The second commit renames the
ReadView::read()method toReadView::readSLE()andApplyView::peek()method toApplyView::peekSLE(). This rename is propagated to the derived classes as well. The renamed methods will continue to return serialized ledger entries. This is in anticipation of having futureread()andpeek()methods that return strongly typed objects. -
The third commit introduces the first strongly typed
Keyletand its corresponding strongly typed ledger objectAcctRoot. The account root was chosen as the first type because it is used a lot and should provide a good test ground for the methodology.
The first two commits allow the transition to using strongly-typed ledger objects to be incremental. The third commit is just the first of many. Future commits will convert more ledger objects to be strongly typed.
Type of Change
- [x] Refactor (non-breaking change that only restructures code)
Before / After
Before:
std::shared_ptr<SLE const> sleAcct = view.read(keylet::account(acctID));
if (!sleAcct)
return terNO_ACCOUNT;
SeqProxy const acctSeqProx = SeqProxy::sequence((*sleAccount)[sfSequence]);
After:
std::optional<AcctRoot> acctRoot = view.read(keylet::account(acctID));
if (!acctRoot)
return terNO_ACCOUNT;
SeqProxy const acctSeqProx = SeqProxy::sequence(acctRoot->sequence());
Test Plan
I looked at the code coverage of these changes based on existing unit tests. I was surprised and pleased at how thoroughly these refactoring changes were exercised by pre-existing unit tests. As a result I felt no inclination to introduce additional tests. Reviewers may feel differently.
Future Tasks
These are only the first few steps of this project. In order to have a coherent code base it will be important to complete the conversion of all other ledger types to strongly typed. I anticipate that each newly introduced strongly typed ledger object will arrive in a separate pull request.
I had a quick look at the changes, and I fully agree with the direction they are taking us towards.
One thing that I found immediately surprising was view.read() returning std::optional<AcctRoot>, which looks like a value, so I wondered if there was any copying?
There is not, because AcctRoot is a pointer in disguise, being a LedgerEntryWrapper which itself encapsulates a std::shared_ptr.
I think developers will have to be aware of that, so I think it would be more straightfworward to have view.read() return the AcctRoot (instead of std::optional<AcctRoot>), and AcctRoot could have an operator bool() which would just return LedgerEntryWrapper::wrapped_::operator bool().
Rebased to 1.10.0-b1.
@seelabs :
Did a pass. I'll come back and do another pass later, but I wanted to get the current set of comments in now.
Are you still planning to come back and do another pass later?
@intelliot, In addition to code reviews, there's an additional concern regarding this pull request.
I think this pull request starts us down a good road in terms of compile-time coding safety for transactors. It also provides some scaffolding that is likely to be useful if we work on reducing the in-ledger storage size of particular node types. However this pull request is only a start down the road.
In order to be truly effective a similar conversion to strong types needs to happen for most, if not all, in-ledger object types. This includes at least:
- Offers,
- RippleStates (trust lines),
- Escrows,
- Tickets,
- SignerLists,
- PayChannels,
- Checks, and
- NFTokenOffers.
It may also include directory-like things, like
- DirectoryNodes and
- NFTokenPages.
In my opinion if we start down this road, then we need to commit to finishing the work. See https://youtu.be/B2XtqVZcSdM?t=609 (that whole video is worth watching, by the way).
Let's not start down this path unless we're committed to finishing the refactor to include all of the appropriate ledger objects. As David Sankel says in the linked video, "A partial change is destruction of value."
This beginning refactor shows that the approach can be successful. I chose to start with AccountRoots because they are so ubiquitous and used in so many ways. We now know that this approach works. But if we start the refactor and don't finish it, then we are just adding a new mess to the code base.
So accepting this change into the code base means future resources must be committed to completing the refactor. If that commitment can't be made, then this branch should not be merged.
Very important point, @scottschurr ! And I really enjoyed the video too :-).
a similar conversion to strong types needs to happen for most, if not all, in-ledger object types.
Also the ones not merged yet like the sidechain ones.
So accepting this change into the code base means future resources must be committed to completing the refactor. If that commitment can't be made, then this branch should not be merged.
How about merging to a feature branch similar to the one that was just created for nft-fixes? There will be maintenance, and a risk of bit rot in that feature branch, but it'll make it easier to forget the whole thing if we don't get the follow through.
@ximinez, working with a feature branch would be possible. However I have concerns:
- Converting all of the ledger types will touch a lot of stuff. Review of the final branch will be close to impossible because so much will be touched. Managing the bit rot will also be extremely challenging.
- Suppose we started on that branch, got 90% of the way through, and then gave up. That's a huge amount of effort to just throw away. Would we have the will to throw it out? Or would we submit the mostly-completed work anyway?
All I'm saying is that if we don't have the will to complete this work then we should not merge just this first part. I don't think that working on a feature branch avoids that decision. That same decision is needed whether we merge this first part or continue the work using a feature branch.
@ximinez @scottschurr What's your opinion? Perhaps, after taking a step back and considering all the effort required (including reviews), we've learned that this refactor is not worthwhile at this time?
From @intelliot:
What's your opinion? Perhaps, after taking a step back and considering all the effort required (including reviews), we've learned that this refactor is not worthwhile at this time?
Personally, I think the effort is quite worthwhile. If we can't do this kind of code maintenance then the code base will fall into disarray over time. I think this is a worthwhile change.
However I'm not in charge of where people spend development time. If big features and code reviews will soak up all the time then we should not embark on significant maintenance efforts like this one.
@ximinez @scottschurr What's your opinion? Perhaps, after taking a step back and considering all the effort required (including reviews), we've learned that this refactor is not worthwhile at this time?
I concur with @scottschurr. Even though I haven't done a deep dive into this particular PR, the concept is good, and this is the kind of investment that will pay off with higher code quality for years to come.
- need estimate of how much time this would take
- need recommended prioritization vs OKRs and goals
- want the interface across all objects to be uniform. This PR only changes AcctRoot, but there are many other types of ledger objects.
- to have a consistent interface throughout the codebase, every type of ledger object would need to be changed.
- 2 years ago, this was considered viable, but things have changed since then.
- this is purely a refactor / maintenance with no effect on users - and is fairly expensive.
- would be counterproductive to merge this PR without following through to update all other object types.
- potential end-user benefit: enabling future reduction in trust line size
@seelabs @ximinez to consider whether it makes sense to keep this alive - balancing the value of this work with other priorities
As far as I can tell there are no active code reviews running. So I've rebased this pull request in order to allow it to be built by conan.
@ckeshava identified some places that the AcctRoot was not being used but could be. The most recent commit incorporates @ckeshava's suggested changes.
The most recent commit is a change suggested by @ckeshava.
I've got an alternative design in branch acctroot in my fork. Here's what I like about it:
ReadView::read, ApplyView::peek, and their callers are dealing directly in std::shared_ptrs again. No optional<AcctRootRd>. No Writable template parameter to approximate constness. Just familiar const and non-const. AcctRootImpl inherits directly from STLedgerEntry. No LedgerEntryWrapper. Helper methods are moved to STLedgerEntry. I believe that modulo renamed variables, this design has fewer differences with the starting code, meaning fewer changes are needed to adapt it.
I don't like that construction of STLedgerEntrys is handled through a virtual method of KeyletBase. That is not required, though, just the fastest way I could get to this proof-of-concept. I'd like to see the keylet class hierarchy rearranged:
- Rename
KeyletBaseback toKeylet. RenameKeyletto something likeGenericKeylet. - Overload
ReadView::readandApplyView::peekforGenericKeyletvs what will soon be a growing set of strongly-typed keylets, includingAccounttRootKeylet[^1]. - Make
read(GenericKeylet)return astd::shared_ptr<SLE>while all the other overloads can returnstd::shared_ptrs to derived types.
Builds and all tests pass on my machine. CI is still in-progress as I write this comment.
[^1]: Why is it named AccountRootKeylet but AcctRootImpl, AcctRoot, and AcctRootRd? I would like "Account" spelled out in every name.
I looked over @TheJohnFreeman's alternative design. I spotted what I consider to be a big flaw and a smaller issue.
-
Each wrapper is an
STLedgerEntry. So the wrapper no longer makes it difficult to use theSTObjectinterface. Each wrapper still leaves that interface fully available. So a good portion of the error prevention that the wrappers provide evaporates. The safer interface provided by the wrappers is available, but it is not encouraged. This makes the wrappers kind of pointless from my perspective. -
Ledger::readSLE()andLedger::peekSLE()return ashared_ptr<SLE>, andSLEis the base class. This forces ReadView::read() and ApplyView::peek() to do downcasts. Not the end of the world, but not desirable.
So the next question is what can be done about these objections.
-
Objection 1 could be addressed by using private inheritance. But that removes the availability of the public interfaces that were moved from
LedgerEntryWrapperintoSTLedgerEntry. Also (I think) using private inheritance makes it harder to return anSTLedgerEntryfrom one of the wrappers. This is a back door that is necessary some times. -
Objection 1 could also be addressed by going back to using aggregation, rather than inheritance. This would be much like the old wrappers, but rather than wrapping a
std::shared_ptr<SLE>the wrapper would just wrap aSLE. But thenLedger::readSLE()andLedger::peekSLE()would need to return different types depending on theKeylet. That problem might be solvable by adding toLedger...
template <class K, typename T = typename K::sle_type>
auto
typedPeek(K const& keylet) -> std::shared_ptr<T>
and similarly for typedRead(). I'm not sure. I haven't tried it. And you can't call them just peek()/read(), because that would hide names. But providing Ledger::typedPeek and Ledger::typedRead would also address Objection 2 (assuming it works).
Or folks could say that Objection 1 is not worth worrying about. Personally, I think it's a very serious drawback, but mine is not the only voice.
If folks agree that Objection 1 is worth addressing then I could make a run at implementing the second of the two approaches described above. If that doesn't pan out then I'd be leaning toward the original proposal, template<bool> and all. Or if that's too abhorrent we could just drop the whole idea.
Regarding the Keylet renames, I don't really see the point? A Keylet is a GenericKeylet. We would have to go through and rename all of the current uses of Keylet to GenericKeylet. I don't feel like that really clarifies anything. Yes, the number of uses of Keylet will reduce over time. If the number of uses gets small enough in the future then we could consider the rename at that point? But it's not an over-my-dead-body kind of a thing. If that renaming makes more sense to you (and others) then I'm okay with it.
Regarding the AcctRoot name for the wrapper, I have no problems with fully spelling out Account. However something I've learned from watching @ckeshava implement more wrappers is that some of the wrappers are hard to name because the code base has already taken the good name. So I'm thinking that all of the wrappers might want to be named things like AccountLedgerObj, TicketLedgerObj, EscrowLedgerObj and the like. That will give the various typed objects a consistent naming. Or, I suppose, another approach would be to put them all in their own name space. So it would be, for example, ledgerObj::Account.
Thoughts? Opinions?
Objection 1 does not require private inheritance. It can be addressed by making specific methods private in the derived classes with using-declarations. Or you can have an intermediate base class that makes them protected. Or make them protected in STObject. Personally, I don't share this objection. I consider the primary benefit to be the convenience of the added strongly-typed specifically-named methods, an attractive carrot that sells itself, not the hiding of the generic methods.
For objection 2, the choices are either (a) downcast or (b) construct a wrapper. Both yield the same code when optimized (a no-op), but (a) doesn't have the same risk of calling a copy constructor when unoptimized, nor does it require an extra class that brings several problems (e.g. special methods, hiding methods by default, the maintenance cost of more code). I much prefer (a) to (b).
If renaming Keylet to GenericKeylet is not worthwhile, then why was Keylet renamed to KeyletBase? That was the original unworthwhile sin in my opinion. Now KeyletBase is littered in many more places in the code than Keylet, and after everything is migrated, we'll be stuck with KeyletBase and no Keylet. Why? It's ugly in my opinion.
Obj suffix or namespace prefix are both means of disambiguating names by making them longer. In these scenarios, I prefer namespaces because they can be optionally removed or abbreviated with using-declarations.
Objection 1 does not require private inheritance. It can be addressed by making specific methods private in the derived classes with using-declarations. Or you can have an intermediate base class that makes them protected. Or make them protected in
STObject.
I suspect I'm not following your meaning. I'm reading this as though you are suggesting that things that are public in a base class should be hidden (through whatever means) in a publicly derived class? This seems particularly unwise to me. It's easy to subvert the hiding by simply upcasting. It means that the public inheritance does not model is-a. The ISO C++ FAQ also thinks this is a very bad idea: https://isocpp.org/wiki/faq/proper-inheritance#hiding-inherited-public. But I'm probably misunderstanding your meaning.
Another alternative you suggest is making some currently public methods in STObject protected. I don't think the wrappers are worth anywhere close to the cost of dealing with that breakage. I prefer to leave STObject alone if at all possible, at least with regard to this particular change.
Personally, I don't share this objection. I consider the primary benefit to be the convenience of the added strongly-typed specifically-named methods, an attractive carrot that sells itself, not the hiding of the generic methods.
I hear this. You and I simply have different experiences with human nature. The SField interfaces in STObject have been in use for over 10 years now. I don't think that simply providing a specifically-named method will overcome the habit of reaching immediately for an SField. In my opinion the wrappers need to both...
- Provide a good interface and
- Actively discourage using what we think is the less good interface.
Otherwise we'll see mixed usage and be even worse off than we were before the wrappers.
Just my opinion.
For objection 2, the choices are either (a) downcast or (b) construct a wrapper. Both yield the same code when optimized (a no-op), but (a) doesn't have the same risk of calling a copy constructor when unoptimized, nor does it require an extra class that brings several problems (e.g. special methods, hiding methods by default, the maintenance cost of more code). I much prefer (a) to (b).
My personal view is that publicly deriving the wrappers from STLedgerEntry will result in a mess. My reasons for that opinion are given above.
If renaming
KeylettoGenericKeyletis not worthwhile, then why wasKeyletrenamed toKeyletBase?
Keylet was not renamed to KeyletBase. There used to be one name: Keylet. We added one typed keylet: AccountRootKeylet. So we now have two names: Keylet (the generic) and AccountRootKeylet (specific). There are some characteristics that the generic Keylet and all of the typed keylet(s) will share. It makes sense to push those shared characteristics into a base class: KeyletBase.
That was the original unworthwhile sin in my opinion. Now
KeyletBaseis littered in many more places in the code thanKeylet, and after everything is migrated, we'll be stuck withKeyletBaseand noKeylet. Why? It's ugly in my opinion.
Did you count them? In the original pull request there are 59 instances of KeyletBase. There are 13 instances of AccountRootKeylet. There are 207 instances of just Keylet (without "Base" or "AccountRoot"). It sounds like you are suggesting that it would be advantageous to rename 207 instances of Keylet to GenericKeylet so we can replace the 59 instances of KeyletBase with Keylet.
For now I would prefer the leave the 207 instances of Keylet and 59 instances of KeyletBase. As more wrappers are introduced it may eventually get to the point where occurrences of KeyletBase outnumber occurrences of Keylet. I suggest we wait and consider the rename you are suggesting if that time comes.
I clearly don't evaluate "sin", "litter", and "ugly" the same way you do. I am not convinced those words encourage a productive conversation. But you do you.
Objsuffix or namespace prefix are both means of disambiguating names by making them longer. In these scenarios, I prefer namespaces because they can be optionally removed or abbreviated with using-declarations.
Sure. I'm fine with using namespaces to corral the wrappers. But it seems like that is the least of our disagreements.
You are correctly understanding my meaning. I don't treat the ISO C++ FAQ as dogma, but I partly agree with the sentiment, in that (as I said before) I don't see primary value in hiding the generic public STObject methods. That is your desire, not mine, and I was offering a means to accommodate it that you had not mentioned. Upcasting to circumvent access controls sounds like a very clear expression of intent to accept the risks and responsibilities of calling the generic methods.
Access controls are a game we choose to play with the compiler. With the wrapper implementation, "it's easy to subvert the hiding by simply" declaring a friend. It is as simple as upcasting. If you feel like the one is not worth worrying about, for whatever reason, then why not both?
I prefer to leave STObject alone as well. It is still an option worth stating.
There are very few people with the "habit of reaching immediately for an SField". I think we can name them all, and I am betting they will all prefer the specifically-named methods.
It sounds like you really want protected inheritance then. I don't know why you are worried "that removes the availability of the public interfaces that were moved from LedgerEntryWrapper into STLedgerEntry". They can just be moved into another intermediate class that protected inherits from STLedgerEntry.
Your other concern was it "makes it harder to return an STLedgerEntry from one of the wrappers". How? You can do it with a member method in the exact same way. Here is a Godbolt demonstrating both workarounds:
class STLedgerEntry {
public:
int get();
};
using SLE = STLedgerEntry;
class TypedEntry : protected STLedgerEntry {
public:
SLE* sle() {
return this;
}
};
class AccountRoot : public TypedEntry {
public:
int id() {
return get();
}
};
int main() {
AccountRoot a;
//a.get(); // error: inaccessible
a.id();
//SLE* s1 = &a; // error: inaccessible
SLE* s2 = a.sle();
s2->get();
return 0;
}
I have a few questions with respect to @thejohnfreeman 's proposed idea.
-
Since you are not using aggregation to store a
std::shared_ptr<SLE>, does your code have significant performance benefits? I'm not sure how prevalently these ledger objects are passed around in the codebase. -
I share @scottschurr's Objection-1. If we return a
std::shared_ptr<SLE>from thereadandpeekfunctions, what is the benefit of this Ledger Entry Wrapper? "I consider the primary benefit to be the convenience of the added strongly-typed specifically-named methods, an attractive carrot that sells itself, not the hiding of the generic methods." -> The named methods are identical inSTObjectand also the proposed Strongly-typed wrapper class. What do you mean by "specifically-named" methods?
The end goal of strongly-typed ledger objects appears to be points 1-3 (Source: https://gist.github.com/scottschurr/c637b5bd04ac09157763bfb6e74d28ac)
If we return a
std::shared_ptr from read and peek functions, I'm not sure how we can work towards these stated goals. I'm aware that the proposed design by Scott Schurr and Scott Determan doesn't yet meet all these goals, but I can see a path going forward. But I couldn't quite get that from your proposed approach.
-
I feel
GenericKeyletis just as informative asKeyletBase. I don't have a strong preference for either pair of proposed names. -
I didn't understand @scottschurr's Objection-2. What do you mean by upcasting or downcasting? With respect to the later comments, I think it's a bad design to hide/privatise functions in child classes when
STObjectitself publicly exposes many generic functions. I'd instead prefer to change access controls inSTObjectitself in a separate set of changes.
I like that you are deriving the different ledger objects from STLedgerEntry instead of handling the different ledger entry types at the Keylet level.
At this point I think it's appropriate to back up and reflect on where this pull request came from. At the time this code was initially developed, about June of 2021, the work was considered a priority. I felt like @seelabs and I had a good design. Then NFToken development took precedent, so this work was put aside.
In August of 2022 I resurrected the work, even though it was no longer a priority. I felt like the design was in a good place and it would be a shame to lose the effort.
Here we are in June of 2023. The change being introduced is still not a high priority. The design is being called into question. I think that, with effort, the design differences could be settled out. But, given that this work is not high priority, I think it's better to put that effort elsewhere, rather than into the redesign of this change.
So, given all that, I think the best thing to do is close this pull request. @intelliot, what do you think?
I concur with @scottschurr .
We still haven't factored in the design effort required for the directory node ledger objects. We need to adapt the design slightly in order to accomodate those objects.
Given the benefits to these changes aren't crystal clear, I'm happy to go with @scottschurr and @ckeshava 's recommendation. If you feel that we've learned enough for now and no code changes to rippled are necessary, then go ahead and close this PR.
Closing due to a mismatch between the apparent value of the change and the cost of resolving design disagreements.