How does source=history option work?

Open Shasidhar opened this issue 6 years ago • 4 comments

I am trying to run Sparklens on the event logs of my application.

I am using the following command:

./bin/spark-submit \
	--packages qubole:sparklens:0.2.0-s_2.11 \
	--master local[0] \
	--class com.qubole.sparklens.app.ReporterApp \
	qubole-dummy-arg file:///Users/shasidhar/interests/sparklens/eventlog.txt source=history

I see the following output in the console:

Ivy Default Cache set to: /Users/shasidhar/.ivy2/cache
The jars for the packages stored in: /Users/shasidhar/.ivy2/jars
:: loading settings :: url = jar:file:/Users/shasidhar/interests/spark/spark-2.3.0-bin-hadoop2.7/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
qubole#sparklens added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
	confs: [default]
	found qubole#sparklens;0.2.0-s_2.11 in spark-packages
:: resolution report :: resolve 177ms :: artifacts dl 5ms
	:: modules in use:
	qubole#sparklens;0.2.0-s_2.11 from spark-packages in [default]
	---------------------------------------------------------------------
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	---------------------------------------------------------------------
	|      default     |   1   |   0   |   0   |   0   ||   1   |   0   |
	---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
	confs: [default]
	0 artifacts copied, 1 already retrieved (0kB/6ms)
2019-01-03 15:46:11 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Warning: Local jar /Users/shasidhar/interests/spark/spark-2.3.0-bin-hadoop2.7/qubole-dummy-arg does not exist, skipping.

2019-01-03 15:46:52 INFO  ShutdownHookManager:54 - Shutdown hook called
2019-01-03 15:46:52 INFO  ShutdownHookManager:54 - Deleting directory /private/var/folders/3t/rfd2djjs1yg30mhmw8z_s7tw0000gp/T/spark-7a992110-6a4f-44f4-9473-1ddade11b53a

What exactly do I need to look at after this? Does it generate a Sparklens JSON file? If so, where can I find the output file?

Shasidhar, Jan 03 '19 14:01

Hi @Shasidhar,

I would expect this to print the usual Sparklens report on the console. We don't yet support converting an event history file to Sparklens JSON (we will be adding this soon). Here is how we generate sparklens.json from a running application:

--packages qubole:sparklens:0.2.0-s_2.11
--conf spark.extraListeners=com.qubole.sparklens.QuboleJobListener
--conf spark.sparklens.reporting.disabled=true
--conf spark.sparklens.data.dir=/dir/for/saving/sparklens.json
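
For readers who want to see how these options might sit in a full command, here is a minimal sketch. It only wraps the four options above in a helper that prints them one per line. The helper name `sparklens_listener_opts`, the application class `com.example.MyApp`, and the jar path `/path/to/my-app.jar` are hypothetical placeholders and do not come from this thread.

```shell
# Print, one per line, the spark-submit options quoted in the comment above.
# $1 = directory where sparklens.json should be saved.
# NOTE: the function name is illustrative, not part of Sparklens.
sparklens_listener_opts() {
  data_dir="$1"
  printf '%s\n' \
    --packages qubole:sparklens:0.2.0-s_2.11 \
    --conf spark.extraListeners=com.qubole.sparklens.QuboleJobListener \
    --conf spark.sparklens.reporting.disabled=true \
    --conf "spark.sparklens.data.dir=${data_dir}"
}

# Possible usage (placeholders: com.example.MyApp, /path/to/my-app.jar).
# The unquoted substitution relies on word splitting, so the data
# directory must not contain whitespace:
#
#   ./bin/spark-submit \
#     $(sparklens_listener_opts /dir/for/saving/sparklens.json) \
#     --class com.example.MyApp \
#     /path/to/my-app.jar
```

Keeping the options in one helper makes it easier to add the same listener settings to several submit scripts without retyping them.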

iamrohit, Jan 03 '19 19:01

@iamrohit Understood. In that case, for some reason I am not seeing the report.

Shasidhar, Jan 07 '19 12:01

@Shasidhar Maybe something is wrong with your event log file? Can you try running with this file [sparklens/src/test/event-history-test-files/local-1532512550423] and check whether you still don't get any results?
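
One rough way to sanity-check an event log before rerunning is sketched below. It assumes the log is uncompressed and uses Spark's usual layout of one compact JSON object per line with an `"Event"` field. The helper name `count_event` is hypothetical, and which events Sparklens actually needs in order to produce a report is not stated in this thread, so treat the counts as a hint rather than a verdict.

```shell
# Count how many lines of an uncompressed Spark event log record a given
# listener event. $1 = event log path, $2 = event name.
# NOTE: count_event is an illustrative helper, not part of Sparklens or Spark.
count_event() {
  grep -c "\"Event\":\"$2\"" "$1"
}

# Possible usage against the log from the original report:
#
#   count_event eventlog.txt SparkListenerApplicationStart
#   count_event eventlog.txt SparkListenerJobStart
#   count_event eventlog.txt SparkListenerApplicationEnd
#
# Rerunning the original command with the suggested test file, assuming the
# sparklens repository is checked out under /path/to (placeholder path):
#
#   ./bin/spark-submit \
#     --packages qubole:sparklens:0.2.0-s_2.11 \
#     --class com.qubole.sparklens.app.ReporterApp \
#     qubole-dummy-arg \
#     file:///path/to/sparklens/src/test/event-history-test-files/local-1532512550423 \
#     source=history
```

If the counts are all zero, the file may be compressed, truncated, or not an event log at all, which would be worth ruling out first.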

iamrohit, Jan 08 '19 05:01

@iamrohit Yes, it looks like an issue with my event logs. I will figure it out, thanks. Is there an issue I can follow for the feature that will generate the sparklens.json file from event logs?

Shasidhar, Jan 08 '19 12:01