opencbdc-tx
AoS optimization for full_tx
Right now, a full_tx is an SoA ("structure of arrays"). This means, at minimum, we include 8 extra bytes in its representation (the count of witnesses is superfluous because all valid transactions must have an identical number of witnesses to inputs). Moreover, this likely presents an optimization opportunity: when we access a particular witness, we are also likely to access its associated input, so greater cache locality can probably be achieved by storing them adjacent to one another.
To implement this, we would add a new structure that pairs a witness with its associated input, and update full_tx to hold a vector of these structs rather than two separate vectors.
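A minimal sketch of the proposed layout change might look like the following. All names and member types here are illustrative placeholders, not the actual opencbdc-tx definitions:

```cpp
#include <cstdint>
#include <vector>

// Current SoA-style layout: two parallel vectors, each vector carrying
// its own length, so the witness count is stored redundantly even
// though it must equal the input count in any valid transaction.
struct full_tx_soa {
    std::vector<uint64_t> m_inputs;    // stand-in for the real input type
    std::vector<uint64_t> m_witnesses; // stand-in for the witness bytes
};

// Proposed AoS-style layout: each input sits adjacent in memory to its
// witness, so code that touches both together may see better cache
// locality, and only one length is stored.
struct input_with_witness {
    uint64_t m_input;
    uint64_t m_witness;
};

struct full_tx_aos {
    std::vector<input_with_witness> m_entries;
};
```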
Note: this change might be rendered immaterial should we implement #6
Is this optimization in a part of the code that's been identified as a bottleneck via profiling? Also, what's the likelihood of this issue being rendered immaterial due to #6?
> Is this optimization in a part of the code that's been identified as a bottleneck via profiling?
Definitely not; it's just a particularly well-known optimization that might be applicable here. Anyone interested in implementing it should definitely do some basic profiling in advance to confirm that it will even help.
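As a starting point for that profiling, a throwaway micro-benchmark can at least show whether the access pattern matters on the target hardware. This sketch is independent of the real codebase (every name in it is illustrative): it times the same input-plus-witness hot loop over two parallel vectors versus a vector of pairs.

```cpp
#include <chrono>
#include <cstdint>
#include <utility>
#include <vector>

// Stand-in for the proposed paired layout.
struct pair_entry {
    uint64_t input;
    uint64_t witness;
};

// Touch an input and its witness together, as validation code would.
inline auto sum_soa(const std::vector<uint64_t>& inputs,
                    const std::vector<uint64_t>& witnesses) -> uint64_t {
    uint64_t acc = 0;
    for (size_t i = 0; i < inputs.size(); i++) {
        acc += inputs[i] ^ witnesses[i];
    }
    return acc;
}

inline auto sum_aos(const std::vector<pair_entry>& entries) -> uint64_t {
    uint64_t acc = 0;
    for (const auto& e : entries) {
        acc += e.input ^ e.witness;
    }
    return acc;
}

// Build n synthetic entries in both layouts and time one pass over
// each; returns {soa_nanoseconds, aos_nanoseconds}.
inline auto time_layouts(size_t n) -> std::pair<int64_t, int64_t> {
    std::vector<uint64_t> inputs(n);
    std::vector<uint64_t> witnesses(n);
    std::vector<pair_entry> entries(n);
    for (size_t i = 0; i < n; i++) {
        inputs[i] = i;
        witnesses[i] = i * 3;
        entries[i] = {i, i * 3};
    }
    auto t0 = std::chrono::steady_clock::now();
    volatile uint64_t a = sum_soa(inputs, witnesses);
    auto t1 = std::chrono::steady_clock::now();
    volatile uint64_t b = sum_aos(entries);
    auto t2 = std::chrono::steady_clock::now();
    (void)a;
    (void)b;
    using std::chrono::duration_cast;
    using std::chrono::nanoseconds;
    return {duration_cast<nanoseconds>(t1 - t0).count(),
            duration_cast<nanoseconds>(t2 - t1).count()};
}
```

A synthetic loop like this may not reflect the cache behavior of the real validation path, so profiling the actual transaction-processing code (e.g., with perf) would still be the deciding evidence.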
> Also, what's the likelihood of this issue being rendered immaterial due to #6?
I've not seen immediate interest from anyone in implementing #6, so it's difficult to say if that will be merged at some point.
@HalosGhost Thanks for the quick reply! I'm interested but, as you pointed out, profiling will be important and I haven't done it yet for this code. I'm wondering how to approach it since I'm guessing the bottlenecks will change depending on run configurations. Has anyone made a list of what configurations impact performance and what their sensitivities are in somewhat realistic scenarios? The white paper by Lovejoy et al. says the authors, "ran sweeps with increasing load over a number of system configurations," but as far as I can tell, it doesn't describe what configurations were considered or what the sweep results were.
@HalosGhost On a related note, I’m wondering if and how you'll measure changes to performance as part of the merge process. It looks like the Standards for Merging Contributions don't discuss this specifically, but given that performance is the first goal discussed in the white paper--and maybe the only goal described quantitatively--I'm thinking that you might find it important to measure how merges impact it. Am I on the right track or are you more interested in other parts of the project right now?
> Has anyone made a list of what configurations impact performance and what their sensitivities are in somewhat realistic scenarios?
Not to my knowledge. We have some ideas (discussed in the paper) about the scalability properties of the system, and therefore about what you might expect to happen by increasing or decreasing the number of instances of a given component. We don't, though, have a list of configurations in which you could run the system and how you might expect each configuration to perform.
> The white paper by Lovejoy et al. says the authors, "ran sweeps with increasing load over a number of system configurations," but as far as I can tell, it doesn't describe what configurations were considered or what the sweep results were.
@metalicjames can step in here to clarify the details, but the sweeps were used to find peak performance. So, the figures in the paper which show peak performance were discovered by those sweeps, and most of them show the most relevant parameters as performance increases (e.g., an increasing number of shards, sentinels, and coordinators per data point). Is there a particular figure you're interested in?
As for performance testing moving forward, it's definitely something we are trying to keep an eye on, but we are not using it as a measuring stick for merge. As large bodies of work are prepared for merge (for example, the tamper-detection PRs), we're doing some performance testing behind the scenes to see what the impact is (the venue for making that information public hasn't been fully fleshed out yet, though it is planned); but even if something decreases performance significantly, that doesn't necessarily mean it won't be merged. In some sense, merging in this repository is more a stance of "We think this is a useful contribution exploring trade-offs or potential solutions to some research aspect of CBDCs."