Unable to configure logging of Spark app
Despite everything I have tried, I still cannot figure out how to configure logging for my Spark application. The log4j.properties file is always ignored.
log4j.properties in the resources
If I put my log4j.properties file in the resources directory of my project, it ends up on the classpath (I have checked that it is present in the assembly jar). When I run the application locally on my machine, my configuration is taken into account. However, when I submit the same jar with AZTK, my configuration is ignored.
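For reference, the file I put in src/main/resources is an ordinary log4j 1.2 configuration; a minimal sketch (the level and pattern here are illustrative, not my exact values) looks like:

```properties
# Minimal log4j.properties: log INFO and above to the console
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c - %m%n
```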
File linked at submission
Next, I tried to explicitly ship the log4j.properties file at submission using the --files option, and to specify which logging configuration file to use via a JVM option. The command line looks like:
aztk spark cluster submit --id my_cluster --name my_app --files="/path/to/log4j.properties" --driver-java-options="-Dlog4j.configuration=./log4j.properties" /path/to/my_app-assembly-0.1.jar
According to my research, this should work. However, the file is still ignored.
File on ADL
Another attempt was to put the log4j.properties file on Azure Data Lake, and to specify its location as a JVM option. The command then looks like:
aztk spark cluster submit --id my_cluster --name my_app --driver-java-options="-Dlog4j.configuration=adl://gazmetrop01.azuredatalakestore.net/path/to/log4j.properties" /path/to/my_app-assembly-0.1.jar
Again, my logging configuration is still ignored.
Console output
Whatever I try, I always get the same warnings from the Spark application:
log4j:WARN No appenders could be found for logger (com.my.app.Main$).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
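One way to see whether the file actually made it onto the driver's classpath is to check from inside the application. This is a standalone sketch (the class name Log4jCheck is made up for illustration):

```java
// Standalone sketch: prints where log4j.properties resolves on the classpath.
// If this prints null, log4j cannot find the file there either, which would
// explain the "No appenders could be found" warning.
public class Log4jCheck {
    public static void main(String[] args) {
        System.out.println(Log4jCheck.class.getResource("/log4j.properties"));
    }
}
```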
So, is this a bug, or am I doing something wrong?
I believe this is a bug. Thanks for pointing it out. In the meantime, here is a workaround:
Can you try to add the log4j.properties file through your .aztk/spark-defaults.conf?
This would be done by adding the file to your spark.driver.extraJavaOptions.
Here is an example:
spark.driver.extraJavaOptions -Dlog4j.configuration=file:log4j.properties
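If the executors also need the configuration (not just the driver), Spark's usual spark-defaults.conf pattern pairs this with spark.executor.extraJavaOptions and ships the file via spark.files. A sketch, assuming the file then lands in each process's working directory:

```properties
# Sketch of .aztk/spark-defaults.conf entries (path is illustrative)
spark.files                      /path/to/log4j.properties
spark.driver.extraJavaOptions    -Dlog4j.configuration=file:log4j.properties
spark.executor.extraJavaOptions  -Dlog4j.configuration=file:log4j.properties
```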
Also, as a side note, in the future we will be adding a --conf flag to aztk spark cluster submit that will work like the --conf flag for spark-submit. This should also let you pass arbitrary configuration values, such as a log4j configuration.
@JunkieLand were you able to solve this issue with the suggestion above?