
[Performance Optimization] Multiple channels when getting shuffle data in client side

Open zuston opened this issue 3 years ago • 6 comments

Motivation

Currently, an executor uses only a single TCP connection to a given shuffle server, so when multiple tasks run concurrently they all share this channel. This may reduce overall throughput.

Do we have any plan to introduce an extra config that lets users create more channels on the client side?

We should probably run some performance tests to show that this improvement is effective. Updates will be posted in this ticket.
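To make the idea concrete, here is a minimal sketch of what "more channels on the client side" could look like: a fixed-size pool that hands out channels round-robin, so concurrent tasks are spread across several connections instead of one. `RoundRobinChannelPool`, its constructor arguments, and the factory callback are all hypothetical names for illustration and are not part of Uniffle; the pool size would come from whatever new config option is eventually chosen.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntFunction;

// Hypothetical helper (not a Uniffle class): keeps a fixed number of
// channels to one shuffle server and hands them out round-robin.
final class RoundRobinChannelPool<C> {
    private final List<C> channels;
    private final AtomicInteger cursor = new AtomicInteger();

    // size: assumed to come from a new client-side config option.
    // factory: creates the i-th channel (e.g. opens one connection).
    RoundRobinChannelPool(int size, IntFunction<C> factory) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<C> created = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            created.add(factory.apply(i));
        }
        this.channels = Collections.unmodifiableList(created);
    }

    // Thread-safe: each caller gets the next channel in rotation.
    // floorMod keeps the index valid even after the counter overflows.
    C next() {
        int index = Math.floorMod(cursor.getAndIncrement(), channels.size());
        return channels.get(index);
    }

    int size() {
        return channels.size();
    }
}
```

With a pool size of 1 this behaves like the current single shared channel, so a benchmark could compare sizes 1, 2, 4, and so on with the same code path.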

zuston avatar Jul 28 '22 03:07 zuston

@zuston The current implementation limits the number of connections because we don't want too many connections established between the client and the shuffle server. We also plan to improve this by using Netty instead of gRPC, which should not only help throughput but also reduce unnecessary serialization in gRPC.

colinmjj avatar Jul 28 '22 03:07 colinmjj

Glad to hear this. From the flame graph, the shuffle server side spends too much time on extra memory copies.

If Netty is used to manipulate the shuffle data ByteBuf directly off-heap, performance may improve.

Besides, do we need to write/read shuffle data to files using NIO to match Netty? Please let me know if I'm wrong.
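As a rough illustration of the NIO side of that question, the sketch below reads one segment of a file straight into an off-heap (direct) `ByteBuffer` through a `FileChannel`. It uses only the JDK; `DirectSegmentReader` and `readSegment` are hypothetical names, not Uniffle code, and whether this actually removes the copies seen in the flame graph would still need to be measured.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper (not a Uniffle class): reads a byte range of a
// shuffle data file into an off-heap buffer.
final class DirectSegmentReader {
    static ByteBuffer readSegment(Path file, long offset, int length) throws IOException {
        // Direct buffers live outside the Java heap.
        ByteBuffer buffer = ByteBuffer.allocateDirect(length);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long position = offset;
            // A single read may return fewer bytes than requested, so loop.
            while (buffer.hasRemaining()) {
                int read = channel.read(buffer, position);
                if (read < 0) {
                    break; // end of file reached before the buffer was full
                }
                position += read;
            }
        }
        buffer.flip(); // switch from filling the buffer to reading from it
        return buffer;
    }
}
```

A buffer obtained this way could then be handed to Netty without copying, for example by wrapping it with `Unpooled.wrappedBuffer`, assuming the server moves to Netty as discussed above.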

zuston avatar Jul 28 '22 03:07 zuston

> @zuston The current implementation limits the number of connections because we don't want too many connections established between the client and the shuffle server. We also plan to improve this by using Netty instead of gRPC, which should not only help throughput but also reduce unnecessary serialization in gRPC.

Could we put this into the 0.7 version plan?

jerqi avatar Jul 28 '22 03:07 jerqi

> > @zuston The current implementation limits the number of connections because we don't want too many connections established between the client and the shuffle server. We also plan to improve this by using Netty instead of gRPC, which should not only help throughput but also reduce unnecessary serialization in gRPC.
>
> Could we put this into the 0.7 version plan?

Yes, I think so.

colinmjj avatar Jul 28 '22 03:07 colinmjj

So do we have a roadmap of version releases/features on GitHub? @jerqi

zuston avatar Jul 28 '22 08:07 zuston

> So do we have a roadmap of version releases/features on GitHub? @jerqi

Not yet. The main feature of version 0.6 is K8S support.

jerqi avatar Jul 28 '22 09:07 jerqi