rippled
Enable by default the compression support
High Level Overview of Change
Enable link compression by default.
Context of Change
The server can save bandwidth by compressing its p2p communications at the cost of greater CPU usage. Servers that have link compression enabled will automatically compress communications with peers that also have link compression enabled.
In rippled.cfg, you can enable compression with:
[compression]
true
Use false to disable compression. Prior to this PR, false is the default.
Restart the server software. After the restart, your server automatically uses link compression with other peers that also have link compression enabled.
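As a rough illustration of the stanza's semantics (a toy sketch, not rippled's actual config parser), a boolean `[compression]` section could be read like this, with the default flipping from `false` (pre-PR) to `true` (this PR):

```python
# Toy sketch of reading a boolean [compression] stanza from a rippled.cfg-style
# file. This is NOT rippled's real parser; it only illustrates the semantics.
def read_compression(cfg_text: str, default: bool = False) -> bool:
    lines = [ln.strip() for ln in cfg_text.splitlines() if ln.strip()]
    try:
        i = lines.index("[compression]")
    except ValueError:
        return default  # stanza absent: fall back to the default
    return i + 1 < len(lines) and lines[i + 1].lower() == "true"

enabled = read_compression("[compression]\ntrue\n")   # explicit true
absent_pre_pr = read_compression("", default=False)   # pre-PR default
absent_post_pr = read_compression("", default=True)   # default after this PR
```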
Link compression is not currently mentioned in rippled-example.cfg, but it should be.
Type of Change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Refactor (non-breaking change that only restructures code)
- [ ] Tests (You added tests for code that already exists, or your new feature included in this PR)
- [ ] Documentation Updates
- [ ] Release
Note: Protocol message compression support was added in https://github.com/XRPLF/rippled/pull/3287
Ping: @gregtatcam @nbougalis @seelabs @HowardHinnant
I'm curious: are there any stats available on the compression adoption rate?
I do not know about compression stats, sorry! Perhaps Ripple can check from their r.ripple clusters?
> are there any stats available on the compression adoption rate?
@xzhaous could you check into this?
I would prefer more testing before making this the default. There could be memory and CPU increases that may affect some nodes. I will enable it on a portion of the zaphod hubs and monitor (currently only a couple have it enabled). I recommend that Ripple also enable compression similarly, if not already done.
Have you seen any memory change, @alloynetworks and @intelliot? Making compression obligatory should help reduce memory usage, because currently the server must make two copies of every message if even one other peer has compression enabled.
Are there any updates, @alloynetworks and @intelliot?
I don’t see any performance issues on my nodes. But for reference these are all high end servers with a minimum of 128GB RAM. I don’t know if others also performed tests on different hardware specs.
This should be merged, or I should make a patch to remove the compression code.
I ran some analysis on how many nodes support compression. First, I crawled the network and found 702 nodes. Out of those 702, I could connect to 75 nodes via the peer protocol; the majority of the others either timed out or responded that the service is unavailable. 50 of the 75 connected nodes have compression enabled. The vast majority of the 75 nodes are hubs, i.e. they have a large number of connected peers. We can then assume that compression adoption by the hubs is about 66%. Given that adoption rate, I think it makes sense to turn compression on by default. It is of course natural for hubs to have compression enabled, since they stand to save the most bandwidth. But to have a significant effect on the network, the majority of nodes have to have compression enabled.
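The adoption arithmetic from the crawl above, spelled out:

```python
# Crawl figures quoted in the comment above.
crawled = 702          # nodes found by crawling the network
reachable = 75         # nodes that accepted a peer-protocol connection
with_compression = 50  # reachable nodes with compression enabled

hub_adoption = with_compression / reachable
print(f"hub adoption: {hub_adoption:.1%}")  # 50/75, about two thirds
```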
@gregtatcam How did you query the servers if they are using the compression support? Does rippled expose any APIs for such queries or are you using an indirect method to detect compression support?
@a-noni-mousse this looks good to me. I'm not able to build this branch locally, but that is due to a problem in my build system. The CI appears to be happy on github 👍
Question: Despite this change, if even one of the validators on my UNL has compression disabled, I will need to keep two copies (compressed and uncompressed) of every message, right?
For this change to reap benefits, all of the validators on my UNL need to have compression enabled. (My understanding is that a validator only communicates with the other peers on the UNL, hence those peers which are not on its UNL do not affect the communication complexity of a validator.)
> @gregtatcam How did you query the servers if they are using the compression support? Does rippled expose any APIs for such queries or are you using an indirect method to detect compression support?
I sent the protocol handshake with compression requested. The response indicates whether the connected peer supports compression.
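The detection described above can be sketched as a header check on the handshake response. The header name `X-Offer-Compression` is an assumption for illustration; the real field names are defined by the rippled peer protocol.

```python
# Sketch: decide from handshake response headers whether the peer offers
# compression. Header name and value are illustrative assumptions.
def peer_offers_compression(headers: dict) -> bool:
    # HTTP-style headers are case-insensitive, so normalize the keys
    lowered = {k.lower(): v for k, v in headers.items()}
    return lowered.get("x-offer-compression", "").strip().lower() == "lz4"

offers = peer_offers_compression({"X-Offer-Compression": "lz4"})
declines = peer_offers_compression({"Connection": "Upgrade"})
```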
Ok, thanks for the clarification.
note: link compression is documented here - https://xrpl.org/enable-link-compression.html
but it is not mentioned in rippled-example.cfg - https://github.com/XRPLF/rippled/blob/develop/cfg/rippled-example.cfg
I got some additional stats with compression enabled/disabled. I ran two mainnet-connected nodes in GCP, one with compression enabled and the other with compression disabled. Each node's configuration: c2-standard-16, 8 cores, 64GB RAM, 40GB SSD, Ubuntu 22.04. The nodes started/stopped at the same time and ran for one hour. The following stats were collected:
| metric | config | avg | sd |
|---|---|---|---|
| CPU% | compression-enabled | 167.713333 | 146.739502 |
| CPU% | compression-disabled | 127.081667 | 129.496871 |
| MEM% | compression-enabled | 26.060000 | 6.914372 |
| MEM% | compression-disabled | 27.218333 | 5.246096 |

Send (bytes/count), compression-enabled: compressed 63125236/126134, not-compressed 464399529/1780125, total 527524765/1906259, total uncompressed 616242813, saved 88718048 (14.40%)

| type | description | bytes | count |
|---|---|---|---|
| 41 | validation | 230376695 | 939120 |
| 31 | get_ledger | 223351065 | 238954 |
| 33 | propose_ledger | 77162125 | 396397 |
| 30 | transaction | 59656606 | 242666 |
| 32 | ledger_data | 19809189 | 1824 |

Send (bytes/count), compression-disabled: compressed 0/0, not-compressed 1197376396/4484622, total 1197376396/4484622, total uncompressed 1197376396, saved 0 (0.00%)

| type | description | bytes | count |
|---|---|---|---|
| 41 | validation | 583198110 | 2369076 |
| 31 | get_ledger | 226159844 | 315242 |
| 33 | propose_ledger | 192468072 | 984526 |
| 30 | transaction | 150997436 | 610506 |
| 32 | ledger_data | 25135890 | 2020 |

Receive (bytes/count), compression-enabled: compressed 14521059434/321787, not-compressed 7439465988/2184543, total 21960525422/2506330, total uncompressed 26931822005, saved 4971296583 (18.46%)

| type | description | bytes | count |
|---|---|---|---|
| 32 | ledger_data | 18921397421 | 225388 |
| 42 | get_objects | 7490592584 | 10779 |
| 41 | validation | 305606783 | 1271725 |
| 33 | propose_ledger | 105306290 | 555726 |
| 30 | transaction | 84257365 | 346090 |

Receive (bytes/count), compression-disabled: compressed 0/0, not-compressed 16798953795/2418596, total 16798953795/2418596, total uncompressed 16798953795, saved 0 (0.00%)

| type | description | bytes | count |
|---|---|---|---|
| 32 | ledger_data | 12412100317 | 149228 |
| 42 | get_objects | 3876726299 | 5636 |
| 41 | validation | 303663929 | 1263455 |
| 33 | propose_ledger | 105591736 | 557233 |
| 30 | transaction | 83763453 | 346943 |
Compression does require more CPU, but memory usage is about the same. The higher CPU might also be due to the higher volume of received messages: the compression-enabled node received ~21.9GB, while the compression-disabled node received ~16.8GB. The difference is mostly due to LEDGER_DATA messages. On the other hand, the compression-enabled node sent only ~1.9M messages, while the compression-disabled node sent ~4.5M messages. But sent messages don't contribute as much to bandwidth as received messages. Compression delivers sizable bandwidth savings: 14.40% on send and 18.46% on receive.
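The send-side "saved" figure quoted in the stats can be recomputed directly from the totals:

```python
# Send-side totals from the one-hour run above.
total_uncompressed = 616_242_813  # bytes before compression
total_on_wire = 527_524_765       # bytes actually sent

saved_bytes = total_uncompressed - total_on_wire
saved_pct = saved_bytes / total_uncompressed
print(saved_bytes, f"{saved_pct:.2%}")  # matches the 88718048 / 14.40% figures
```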
thanks for the stats @gregtatcam
@ckeshava - can you look at the code (and if possible, write a unit test) to confirm that even after the change in this PR, compression can still be fully disabled by adding the following to rippled.cfg?
[compression]
false
hello @intelliot ,
I discussed the possibility of unit tests with @gregtatcam . He has already written compatibility unit tests here: https://github.com/XRPLF/rippled/blob/a948203dae52093960e38583b1bd8347368a07d4/src/test/overlay/compression_test.cpp#L483. These handshake tests ensure that if two peers want to communicate with each other, they need to have identical compression settings.
Secondly, @gregtatcam has collected statistics about the bandwidth, CPU and memory consumption with the different compression settings. I can repeat the same experiment and verify the values.
I have a question for @manojsdoshi and his team. Do you have integration tests where the validators have different compression settings? I believe such tests are more reliable in ensuring that the code doesn't break. The ad-hoc experiments above serve as sanity checks; I'd prefer to include these tests in a regular pipeline.
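The negotiation behavior those handshake tests cover can be modeled as a simple truth table (a toy assumption, not rippled's code): per the PR description, a link is compressed only when both peers enable compression, and a mismatch still allows an uncompressed connection.

```python
# Toy model of link-compression negotiation between two peers.
def link_compressed(local_enabled: bool, peer_enabled: bool) -> bool:
    # Compression is applied on the link only if both sides opt in.
    return local_enabled and peer_enabled

# Enumerate all four setting combinations.
matrix = {(a, b): link_compressed(a, b)
          for a in (True, False) for b in (True, False)}
print(matrix)
```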
Internal tracker: RPFC-108
Note: currently blocked by perf sign off. Low priority.
We are trying to get to the low priority ones in the next two weeks. Thanks for the reminder Elliot.
The comparison testing was done against 2.2.0-rc3, with compression enabled by default versus without compression.
Without compression, transaction throughput is around 282.6 tps. With compression, it is around 264.46 tps, roughly a 6% reduction. I did not observe any CPU usage increase, but some memory increase when compression is enabled by default. Since there is a small dip, I would recommend not enabling compression by default.
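The roughly 6% figure from the comparison above, computed explicitly:

```python
# Throughput figures from the 2.2.0-rc3 comparison runs.
tps_without = 282.6   # compression disabled
tps_with = 264.46     # compression enabled by default

reduction = (tps_without - tps_with) / tps_without
print(f"{reduction:.1%}")  # a bit over 6%
```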
Was the measurement done over one run or over multiple runs?
Hi @gregtatcam, it was done over multiple one-hour runs. I did not observe a significant CPU increase, but around a 10% memory increase.
Looks like we can't merge this PR due to the impact on the network/validators. We only tested on an internal network with a limited number of nodes; the impact could be greater as the number of peers increases.
Closing due to:
> The comparison testing was done against 2.2.0-rc3, with compression enabled by default versus without compression. Without compression, transaction throughput is around 282.6 tps; with compression, around 264.46 tps, roughly a 6% reduction. I did not observe any CPU usage increase, but some memory increase when compression is enabled by default. Since there is a small dip, I would recommend not enabling compression by default.
Feel free to re-open (or open a new PR) in the future if anything changes.