rippled
rippled copied to clipboard
Replace array & linear lookup with map
The RPC command handling code requires a thorough cleanup.
A good start would be to refactor the command map and the lookup code:
https://github.com/ripple/rippled/blob/develop/src/ripple/net/impl/RPCCall.cpp#L1154-L1249
Among the improvements, per the comment, the array chould be replaced with a static const std::map<std::string_view, Command>; at the same time the name field should be removed from Command.
This may or not be faster. Testing to determine whether lookups are consistently faster using a hash table should be done, and if they indicate a performance gain, the change should be implemented. Otherwise, the comment should be removed.
The map initialization could be similar to this:
https://github.com/ripple/rippled/blob/develop/src/ripple/protocol/impl/TER.cpp#L35-L40
can I work on this issue??
Yes, @Aena11, it doesn't look like anybody else is making forward progress on this at the moment. Your contribution would be welcome.
The links provided by @nbougalis are a little out of date. You would be changing this command map: https://github.com/ripple/rippled/blob/develop/src/ripple/net/impl/RPCCall.cpp#L1222-L1301
to look something vaguely like this: https://github.com/ripple/rippled/blob/develop/src/ripple/protocol/impl/TER.cpp#L40-L170
Note that @nbougalis indicates that the change needs testing.
Once the change is made we (you?) should time the new vs the old lookup times using a release build. If the map improves lookup times (or even leaves them the same) then we would want the new code. But if looking up through a map is, on average, slower than the linear lookup (which can happen for small tables like this) then we would not make the change. But we would remove the comment that suggests the change.
@scottschurr, thanks for the reply I will start working on this
@scottschurr, thanks for the reply I will start working on this Greetings, wanted to see if this was finished? If not id love to work on it
@brianfranco141, this work has not been done and I've seen no evidence of pull requests aimed at addressing it. There's been no news on this from @Aena11 for 6 months as far as I'm aware.
As this issue is open for a long time, I would like to work on it if currently, no one is working on it.
Hello,
I profiled the performance of the code with these four data structures:
Linear Search on constexpr array: MyTimer object -- conversions: 126976; avg ns: 2927; longest ns: 793916 ranges -- 0-9ns: 0; 10-99ns: 389; 100-999ns: 44156; 1000-9999ns: 79586; 10000-99999ns: 2827; 100000ns and over: 18
Binary Search on a constexpr sorted array: MyTimer object -- conversions: 127299; avg ns: 3949; longest ns: 6460959 ranges -- 0-9ns: 0; 10-99ns: 83; 100-999ns: 26735; 1000-9999ns: 98521; 10000-99999ns: 1810; 100000ns and over: 150
Std::map data structure: MyTimer object -- conversions: 127299; avg ns: 3998; longest ns: 2221792 ranges -- 0-9ns: 0; 10-99ns: 123; 100-999ns: 21839; 1000-9999ns: 103047; 10000-99999ns: 2119; 100000ns and over: 171
Std::unordered_map data structure: MyTimer object -- conversions: 127299; avg ns: 2666; longest ns: 2667042 ranges -- 0-9ns: 94; 10-99ns: 1679; 100-999ns: 43767; 1000-9999ns: 81085; 10000-99999ns: 653; 100000ns and over: 21
Since the performance of std::unordered_map and linear search through the array is nearly identical, I don't think it warrants a change. If the number of rpc-handler-commands increase in the future, there might be a scope for improvement.
Many thanks to Scott Schurr for providing the profiling code and debugging, brainstorming this issue.
[Observations as of July 2022]