[FEATURE] Print out the task shuffle write time statistics in table format in the log
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Search before asking
- [X] I have searched in the issues and found no similar issues.
Describe the feature
Currently, we can't identify the slow shuffle server from the client-side log. This is not easy to troubleshoot, especially for a server with terrible GC.
I hope the per-shuffle-server metrics for one Spark task could be shown in the client-side log in a table format like the following.
| | Min | 25th percentile | Median | 75th percentile | Max |
|---|---|---|---|---|---|
| Shuffle Write Duration | 10s | 15s | 25s | 30s | 4min |
| Shuffle Write Size / Records | 20M / 10000 | 20M / 10000 | 20M / 10000 | 20M / 10000 | 20M / 10000 |
| Shuffle server list | 10.23.35.19-21001 | 10.23.35.134-21001 | 10.23.35.14-21001 | 10.23.315.14-21001 | 10.123.35.14-21001 |
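As a sketch of how the client could produce such a table, here is a minimal helper (hypothetical, not existing Uniffle code; the class and method names are made up for illustration) that computes the five-number summary over per-server shuffle write durations using nearest-rank percentiles and formats the duration and server rows:

```java
import java.util.*;

public class ShuffleWriteStats {
    // Nearest-rank percentile over a sorted list: p in [0, 100].
    static <T> T percentile(List<T> sorted, int p) {
        int idx = (int) Math.ceil(p / 100.0 * sorted.size()) - 1;
        return sorted.get(Math.max(0, Math.min(idx, sorted.size() - 1)));
    }

    static String formatTable(Map<String, Long> writeTimeMsByServer) {
        // Sort (server, duration) pairs by duration so each percentile column
        // carries both the duration and the server it belongs to.
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(writeTimeMsByServer.entrySet());
        sorted.sort(Map.Entry.comparingByValue());
        int[] ps = {0, 25, 50, 75, 100};
        StringBuilder durations = new StringBuilder("| Shuffle Write Duration |");
        StringBuilder servers = new StringBuilder("| Shuffle server list |");
        for (int p : ps) {
            Map.Entry<String, Long> e = percentile(sorted, p);
            durations.append(' ').append(e.getValue() / 1000).append("s |");
            servers.append(' ').append(e.getKey()).append(" |");
        }
        return "| | Min | 25th percentile | Median | 75th percentile | Max |\n"
                + durations + "\n" + servers;
    }

    public static void main(String[] args) {
        // Example per-server write times for one task (made-up values).
        Map<String, Long> stats = new LinkedHashMap<>();
        stats.put("10.23.35.19-21001", 10_000L);
        stats.put("10.23.35.134-21001", 15_000L);
        stats.put("10.23.35.14-21001", 25_000L);
        System.out.println(formatTable(stats));
    }
}
```

The same structure extends to the size/records row by carrying a small struct per server instead of a bare duration.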
Motivation
No response
Describe the solution
No response
Additional context
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Feel free to pick this up.
Are you interested in this? @myandpr
Fine, please assign it to me, thanks @zuston!
Should we use Spark's metrics instead of log?
This may also be valid for MR and Tez. BTW, task metrics for the slow shuffle server don't exist; I haven't seen such a metric.
This should be a metric instead of a log line. Logging it seems weird.
So how can we meet this requirement using metrics? @jerqi, can you give some ideas?
Spark's metrics system allows users to add extra metrics. You can refer to https://simhadri-g.medium.com/custom-metrics-source-in-apache-spark-ca30a3b362dd
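For context, the pattern from that article can be sketched as below. This is a simplified stand-in, not Uniffle code: a real source would implement Spark's `org.apache.spark.metrics.source.Source` (its `sourceName()` and `metricRegistry()` methods) and be registered with Spark's MetricsSystem, and the plain map below replaces Codahale's `MetricRegistry` only so the snippet compiles without Spark on the classpath. The class name and gauge keys are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Simplified stand-in for the custom-metrics-source pattern: Spark's real
// Source interface exposes sourceName() plus a Codahale MetricRegistry; here
// a plain map of gauge suppliers stands in for the registry.
public class TaskShuffleServerSource {
    private final Map<String, LongSupplier> gauges = new ConcurrentHashMap<>();

    // Same role as Source#sourceName(): the prefix metrics are reported under.
    public String sourceName() {
        return "uniffle.shuffleServer"; // hypothetical source name
    }

    // Same role as registering a Gauge: the supplier is read lazily each time
    // a metrics sink polls the source, so it always reflects the latest value.
    public void registerGauge(String name, LongSupplier gauge) {
        gauges.put(name, gauge);
    }

    public long read(String name) {
        return gauges.get(name).getAsLong();
    }

    public static void main(String[] args) {
        TaskShuffleServerSource source = new TaskShuffleServerSource();
        // Hypothetical per-server write-time gauge fed by the shuffle writer.
        source.registerGauge("10.23.35.19-21001.writeTimeMs", () -> 42_000L);
        System.out.println(source.sourceName() + ".10.23.35.19-21001.writeTimeMs="
                + source.read("10.23.35.19-21001.writeTimeMs"));
    }
}
```

With a real Source, each gauge would surface in whatever sink the job configures (JMX, Prometheus, the metrics servlet), which is what makes a metric preferable to a log line here.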
Looks good. Ping @myandpr: if you have any ideas, feel free to discuss.
Fine, I think I understand this feature. Let me look at the logic of the Spark metrics system.
Maybe we could introduce an extra tab in the Spark UI (both runtime and history server) to show more info about shuffle servers, like Kyuubi did: https://github.com/apache/kyuubi/tree/master/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui