[tez] Add init action to enable tez-aux-services
This issue requests an initialization-actions script that enables tez-aux-services.
Take a look at the configure_yarn_nodemanager function in spark-rapids.sh[1] for an example of how to set properties in yarn-site.xml. The same code can be used as an example of how to update tez-site.xml as described in the shuffle handler overview[2].
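As a starting point, here is a rough sketch of what the property updates might look like. It is not taken from spark-rapids.sh verbatim. The helper names (append_aux_service, set_xml_property, configure_tez_shuffle), the bdconfig path and flags, and the /etc/tez/conf/tez-site.xml location are assumptions on my part, and the property names are what I read in [2], so please verify them against that page before relying on this.

```shell
#!/usr/bin/env bash

# Append a service to a comma-separated aux-services list unless it is
# already present, so existing entries such as mapreduce_shuffle survive.
function append_aux_service() {
  local current="$1" service="$2"
  case ",${current}," in
    *",${service},"*) echo "${current}" ;;
    ",,") echo "${service}" ;;
    *) echo "${current},${service}" ;;
  esac
}

# Assumption: bdconfig lives at /usr/local/bin/bdconfig and accepts these
# flags. BDCONFIG can be overridden, e.g. BDCONFIG=echo for a dry run.
function set_xml_property() {
  local config_file="$1" name="$2" value="$3"
  "${BDCONFIG:-/usr/local/bin/bdconfig}" set_property \
    --configuration_file "${config_file}" \
    --name "${name}" --value "${value}" --clobber
}

# $1 is the current value of yarn.nodemanager.aux-services; reading it from
# yarn-site.xml is left out of this sketch.
function configure_tez_shuffle() {
  local current_aux_services="$1"
  # Assumption: default config file locations on a Dataproc node.
  local yarn_site="${YARN_SITE:-/etc/hadoop/conf/yarn-site.xml}"
  local tez_site="${TEZ_SITE:-/etc/tez/conf/tez-site.xml}"

  set_xml_property "${yarn_site}" 'yarn.nodemanager.aux-services' \
    "$(append_aux_service "${current_aux_services}" tez_shuffle)"
  set_xml_property "${yarn_site}" 'yarn.nodemanager.aux-services.tez_shuffle.class' \
    'org.apache.tez.auxservices.ShuffleHandler'
  set_xml_property "${tez_site}" 'tez.am.shuffle.auxiliary-service.id' 'tez_shuffle'
}
```

Running `BDCONFIG=echo configure_tez_shuffle 'mapreduce_shuffle'` prints the three commands instead of executing them, which is handy for checking the values before touching a real cluster.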
The shuffle handler overview[2] says that "The Tez Shuffle Handler jar artifact org.apache.org:tez-aux-services needs to be placed into the Node Manager classpath and restarted", but it does not say where to find that artifact. Thankfully, it was quick to look up: it is listed under the group ID org.apache.tez (not org.apache.org). https://mvnrepository.com/artifact/org.apache.tez/tez-aux-services
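The "place the jar and restart" step might look roughly like the sketch below. The function name install_tez_aux_services is made up for illustration, and both the /usr/lib/hadoop-yarn/lib directory and the hadoop-yarn-nodemanager.service unit name are assumptions that should be checked on an actual Dataproc node.

```shell
#!/usr/bin/env bash

# Download the tez-aux-services jar into a directory on the NodeManager
# classpath, then restart the NodeManager so it picks the jar up.
#   $1: URL of the jar
#   $2: target directory (assumption: /usr/lib/hadoop-yarn/lib is on the
#       NodeManager classpath)
function install_tez_aux_services() {
  local jar_url="$1"
  local lib_dir="${2:-/usr/lib/hadoop-yarn/lib}"
  # Keep the file name from the URL, e.g. tez-aux-services-0.10.2.jar
  local jar_path="${lib_dir}/${jar_url##*/}"

  curl -fsSL --retry 3 -o "${jar_path}" "${jar_url}" || return 1

  # Assumption: the NodeManager runs as this systemd unit. A real script
  # would skip the restart on nodes that do not run a NodeManager.
  systemctl restart hadoop-yarn-nodemanager.service
}
```

A real init action would probably also verify the jar's checksum after the download.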
Use the tez-aux-services version that matches your Dataproc image version:
Dataproc 2.0: tez-aux-services 0.9.2 [3]
Dataproc 2.1: tez-aux-services 0.10.2 [4]
Dataproc 2.2: tez-aux-services 0.10.2 [4]
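The version mapping above could be encoded in the script along these lines. This is only a sketch: the function names tez_aux_version and tez_aux_jar_url are hypothetical, and how the script obtains the image version is left open.

```shell
#!/usr/bin/env bash

# Map a Dataproc image version (e.g. "2.1" or "2.1.33-debian11") to the
# tez-aux-services version listed above.
function tez_aux_version() {
  local image_version="$1"
  case "${image_version}" in
    2.0|2.0.*|2.0-*) echo "0.9.2" ;;
    2.1|2.1.*|2.1-*|2.2|2.2.*|2.2-*) echo "0.10.2" ;;
    *)
      echo "unsupported Dataproc image version: ${image_version}" >&2
      return 1
      ;;
  esac
}

# Build the Maven Central download URL for the matching jar.
function tez_aux_jar_url() {
  local tez_version
  tez_version="$(tez_aux_version "$1")" || return 1
  echo "https://repo1.maven.org/maven2/org/apache/tez/tez-aux-services/${tez_version}/tez-aux-services-${tez_version}.jar"
}
```

For example, `tez_aux_jar_url 2.0` prints the URL given in [3], and any image version outside 2.0, 2.1, and 2.2 fails with a non-zero status instead of guessing.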
[1] https://github.com/GoogleCloudDataproc/initialization-actions/blob/master/spark-rapids/spark-rapids.sh#L568
[2] https://tez.apache.org/shuffle-handler.html
[3] https://repo1.maven.org/maven2/org/apache/tez/tez-aux-services/0.9.2/tez-aux-services-0.9.2.jar
[4] https://repo1.maven.org/maven2/org/apache/tez/tez-aux-services/0.10.2/tez-aux-services-0.10.2.jar
Anyone looking for an easy first contribution to the repository may want to consider taking this on.