Improve performance of ProtobufHttpMessageConverter
Currently, when a binary Protobuf message is received, ProtobufHttpMessageConverter creates the object through a Builder.
Every generated com.google.protobuf.Message class also provides a static parseFrom(java.io.InputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry) method.
If ProtobufHttpMessageConverter used this method, it could avoid creating an unnecessary Builder and running the merge step.
(I have also confirmed that grpc-java uses this method when creating com.google.protobuf.Message instances.)
After changing the object creation to the parseFrom method, I ran several simple performance tests and saw TPS increase by about 5%.
(I also observed that the heap grows more slowly.)
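A minimal illustration of the two code paths, using the well-known com.google.protobuf.Timestamp type as a stand-in for an application message (this is only a sketch of the idea, not the converter code itself):

import java.io.ByteArrayInputStream;
import java.io.InputStream;

import com.google.protobuf.ExtensionRegistry;
import com.google.protobuf.Timestamp;

public class ParseFromVsBuilder {

    public static void main(String[] args) throws Exception {
        ExtensionRegistry registry = ExtensionRegistry.newInstance();
        byte[] payload = Timestamp.newBuilder().setSeconds(1234567890L).build().toByteArray();

        // As-is: obtain a Builder, merge the stream into it, then build the message.
        InputStream asIs = new ByteArrayInputStream(payload);
        Timestamp viaBuilder = Timestamp.newBuilder().mergeFrom(asIs, registry).build();

        // To-be: call the generated static parseFrom(InputStream, ExtensionRegistryLite),
        // which skips the intermediate Builder and the merge call.
        InputStream toBe = new ByteArrayInputStream(payload);
        Timestamp viaParseFrom = Timestamp.parseFrom(toBe, registry);

        System.out.println(viaBuilder.equals(viaParseFrom)); // prints "true"
    }
}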
- as-is
h2load http://localhost:8080/user/proto -d ./proto.txt --header 'Content-Type: application/x-protobuf;charset=UTF-8' -n 100000 -c 1 -m 1
starting benchmark...
spawning thread #0: 1 total client(s). 100000 total requests
Application protocol: h2c
progress: 10% done
progress: 20% done
progress: 30% done
progress: 40% done
progress: 50% done
progress: 60% done
progress: 70% done
progress: 80% done
progress: 90% done
progress: 100% done
finished in 24.07s, 4155.29 req/s, 146.15KB/s
requests: 100000 total, 100000 started, 100000 done, 100000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 100000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 3.43MB (3601612) total, 488.93KB (500660) headers (space savings 96.26%), 390.63KB (400000) data
min max mean sd +/- sd
time for request: 187us 13.12ms 234us 105us 94.49%
time for connect: 301us 301us 301us 0us 100.00%
time to 1st byte: 2.11ms 2.11ms 2.11ms 0us 100.00%
req/s : 4155.31 4155.31 4155.31 0.00 100.00%
- to-be
h2load http://localhost:8080/user/proto -d ./proto.txt --header 'Content-Type: application/x-protobuf;charset=UTF-8' -n 100000 -c 1 -m 1
starting benchmark...
spawning thread #0: 1 total client(s). 100000 total requests
Application protocol: h2c
progress: 10% done
progress: 20% done
progress: 30% done
progress: 40% done
progress: 50% done
progress: 60% done
progress: 70% done
progress: 80% done
progress: 90% done
progress: 100% done
finished in 23.31s, 4289.18 req/s, 150.86KB/s
requests: 100000 total, 100000 started, 100000 done, 100000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 100000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 3.43MB (3601566) total, 488.88KB (500614) headers (space savings 96.26%), 390.63KB (400000) data
min max mean sd +/- sd
time for request: 177us 274.17ms 227us 876us 99.97%
time for connect: 331us 331us 331us 0us 100.00%
time to 1st byte: 254.55ms 254.55ms 254.55ms 0us 100.00%
req/s : 4289.20 4289.20 4289.20 0.00 100.00%
Rather than going through the Builder, calling parseFrom requires a reflective lookup and the method might not be present on a given class, so I resolve it once and cache the resolved method per message class in a map.
Please feel free to let me know if I misjudged anything. :)
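Roughly, the idea looks like this (a simplified sketch of the caching approach, not the exact code in this PR):

import java.io.InputStream;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import com.google.protobuf.ExtensionRegistry;
import com.google.protobuf.ExtensionRegistryLite;
import com.google.protobuf.Message;

// Simplified sketch of the per-class method cache; not the actual PR diff.
final class ParseFromMethodCache {

    private static final Map<Class<?>, Method> methodCache = new ConcurrentHashMap<>();

    static Message parse(Class<? extends Message> clazz, InputStream body, ExtensionRegistry registry)
            throws Exception {
        // Resolve the generated static parseFrom(InputStream, ExtensionRegistryLite) once per class.
        Method parseFrom = methodCache.computeIfAbsent(clazz, c -> {
            try {
                return c.getMethod("parseFrom", InputStream.class, ExtensionRegistryLite.class);
            }
            catch (NoSuchMethodException ex) {
                // Generated messages always declare this method; a hand-written Message implementation might not.
                throw new IllegalStateException("No parseFrom(InputStream, ExtensionRegistryLite) on " + c.getName(), ex);
            }
        });
        return (Message) parseFrom.invoke(null, body, registry);
    }
}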
@koo-taejin Please sign the Contributor License Agreement!
@koo-taejin Thank you for signing the Contributor License Agreement!
Thanks for this contribution, and sorry for taking so long to get back to you. I've introduced a JMH benchmark for the converter in 31a62ff8ba (covering both reading and writing operations). Maybe the runtime profile has changed since your PR, but I'm not seeing improvements with your change (it might actually have made things a bit worse).
Running with java -jar spring-web/build/libs/spring-web-6.1.0-SNAPSHOT-jmh.jar -t 30 -f 2 -prof gc ProtobufHttpMessageConverterBenchmark
Without this change:
Benchmark (messageCount) Mode Cnt Score Error Units
ProtobufHttpMessageConverterBenchmark.readMessages 40 thrpt 10 66492.262 ± 397.655 ops/s
ProtobufHttpMessageConverterBenchmark.readMessages:·gc.alloc.rate 40 thrpt 10 12079.125 ± 72.156 MB/sec
ProtobufHttpMessageConverterBenchmark.readMessages:·gc.alloc.rate.norm 40 thrpt 10 190556.022 ± 758.571 B/op
ProtobufHttpMessageConverterBenchmark.readMessages:·gc.count 40 thrpt 10 1645.000 counts
ProtobufHttpMessageConverterBenchmark.readMessages:·gc.time 40 thrpt 10 1412.000 ms
ProtobufHttpMessageConverterBenchmark.writeMessages 40 thrpt 10 48658.466 ± 715.484 ops/s
ProtobufHttpMessageConverterBenchmark.writeMessages:·gc.alloc.rate 40 thrpt 10 12076.030 ± 195.945 MB/sec
ProtobufHttpMessageConverterBenchmark.writeMessages:·gc.alloc.rate.norm 40 thrpt 10 260480.029 ± 0.001 B/op
ProtobufHttpMessageConverterBenchmark.writeMessages:·gc.count 40 thrpt 10 1791.000 counts
ProtobufHttpMessageConverterBenchmark.writeMessages:·gc.time 40 thrpt 10 1427.000 ms
With this change:
Benchmark (messageCount) Mode Cnt Score Error Units
ProtobufHttpMessageConverterBenchmark.readMessages 40 thrpt 10 64359.735 ± 1712.806 ops/s
ProtobufHttpMessageConverterBenchmark.readMessages:·gc.alloc.rate 40 thrpt 10 11660.432 ± 306.253 MB/sec
ProtobufHttpMessageConverterBenchmark.readMessages:·gc.alloc.rate.norm 40 thrpt 10 190072.022 ± 12.749 B/op
ProtobufHttpMessageConverterBenchmark.readMessages:·gc.count 40 thrpt 10 1996.000 counts
ProtobufHttpMessageConverterBenchmark.readMessages:·gc.time 40 thrpt 10 1664.000 ms
ProtobufHttpMessageConverterBenchmark.writeMessages 40 thrpt 10 48545.755 ± 362.536 ops/s
ProtobufHttpMessageConverterBenchmark.writeMessages:·gc.alloc.rate 40 thrpt 10 12052.676 ± 90.700 MB/sec
ProtobufHttpMessageConverterBenchmark.writeMessages:·gc.alloc.rate.norm 40 thrpt 10 260512.030 ± 0.001 B/op
ProtobufHttpMessageConverterBenchmark.writeMessages:·gc.count 40 thrpt 10 2296.000 counts
ProtobufHttpMessageConverterBenchmark.writeMessages:·gc.time 40 thrpt 10 1743.000 ms
In light of that, I'm declining this PR for now. We can always consider other changes if they're backed by benchmark results. Thanks!
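For reference, a benchmark exercising the converter's read and write paths could look roughly like the sketch below. This is a hypothetical example, not the actual ProtobufHttpMessageConverterBenchmark from 31a62ff8ba; it assumes JMH and spring-test's MockHttpInputMessage/MockHttpOutputMessage are on the classpath and uses com.google.protobuf.Timestamp as the payload.

import com.google.protobuf.Timestamp;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

import org.springframework.http.converter.protobuf.ProtobufHttpMessageConverter;
import org.springframework.mock.http.MockHttpInputMessage;
import org.springframework.mock.http.MockHttpOutputMessage;

// Hypothetical benchmark sketch; the real ProtobufHttpMessageConverterBenchmark may differ.
@State(Scope.Benchmark)
public class ProtobufConverterBenchmarkSketch {

    private final ProtobufHttpMessageConverter converter = new ProtobufHttpMessageConverter();

    private Timestamp message;

    private byte[] serialized;

    @Setup
    public void setup() {
        this.message = Timestamp.newBuilder().setSeconds(1234567890L).setNanos(42).build();
        this.serialized = this.message.toByteArray();
    }

    @Benchmark
    public Object readMessage() throws Exception {
        // Deserialize the payload through the converter; with no Content-Type header,
        // the converter falls back to application/x-protobuf.
        return this.converter.read(Timestamp.class, new MockHttpInputMessage(this.serialized));
    }

    @Benchmark
    public Object writeMessage() throws Exception {
        // Serialize the message through the converter.
        MockHttpOutputMessage outputMessage = new MockHttpOutputMessage();
        this.converter.write(this.message, ProtobufHttpMessageConverter.PROTOBUF, outputMessage);
        return outputMessage;
    }
}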