pyconcrete
pyconcrete for submitting spark job
Hi,
I recently used pyconcrete to obfuscate some PySpark code. To run a Spark job on a cluster, we need to use the spark-submit command, so it would look like spark-submit job.py.
The problem is that spark-submit seems to accept only files with a .py extension. Since pyconcrete generates .pye files, I haven't found any way to run the encrypted files via spark-submit.
Is there a way to run encrypted files generated by pyconcrete with spark-submit?
Thank you.
pyconcrete needs its binary .so. Does spark-submit package your source code and upload it to the cloud to run? If so, you need to cross-compile pyconcrete.so for that platform first. Then you could use pyconcrete as a library: try building your code as an .egg. Spark seems to allow submitting .egg files, so it might work. Give it a shot.
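One way to read "run pyconcrete as a library" is a thin, unencrypted .py driver that imports pyconcrete and then hands off to the protected code. The sketch below is only an illustration under assumptions: run_entry and the module name job_main are hypothetical, and it is untested whether pyconcrete's import hook can find .pye files shipped to a Spark cluster (for example inside an .egg). It relies on the pyconcrete README's statement that importing the package enables loading of .pye modules.

```python
import importlib


def run_entry(module_name, func_name="main"):
    """Import a module by name and call its entry function.

    Hypothetical helper: the names here are illustrative, not part of
    pyconcrete or Spark.
    """
    try:
        # Per the pyconcrete README, importing the package installs the
        # import hook that lets Python load encrypted .pye modules.
        import pyconcrete  # noqa: F401
    except ImportError:
        # pyconcrete is not installed here, so only plain modules can load.
        pass
    module = importlib.import_module(module_name)
    return getattr(module, func_name)()


# In a real driver.py this would be something like:
#     run_entry("job_main")   # job_main is a hypothetical encrypted module
```

The driver itself stays readable, so only the logic inside the imported module would be protected.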
I already tried building the code as an .egg together with the driver program, but Spark couldn't find the main class.
It seems that .egg files are only used as dependencies; spark-submit still needs the driver code as a .py file. So it would look like this: spark-submit --py-files path/to/file.egg driver.py.
According to the doc itself,
For Python applications, simply pass a .py file in the place of <application-jar> instead of a JAR,
and add Python .zip, .egg or .py files to the search path with --py-files.
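To make the invocation form from the quoted doc concrete, here is a small sketch that assembles the spark-submit argument list. The helper name build_spark_submit is hypothetical; the only Spark facts it relies on are the ones stated in this thread: the application file is a .py file, and extra .zip, .egg or .py files go in --py-files, which takes a comma-separated list.

```python
def build_spark_submit(driver, py_files=()):
    """Return the argv list for a basic PySpark spark-submit call.

    Hypothetical helper for illustration only; it builds the command
    but does not run it.
    """
    if not driver.endswith(".py"):
        # As reported in this thread, the application file must be a .py file.
        raise ValueError("driver must be a .py file, got %r" % driver)
    cmd = ["spark-submit"]
    if py_files:
        # --py-files takes a single comma-separated list of dependencies.
        cmd += ["--py-files", ",".join(py_files)]
    cmd.append(driver)
    return cmd


print(" ".join(build_spark_submit("driver.py", ["path/to/file.egg"])))
# prints: spark-submit --py-files path/to/file.egg driver.py
```

Passing a .pye file as the driver raises ValueError here, which mirrors the limitation described above.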
Can you provide more information? Maybe it's a spark-submit issue, not a pyconcrete one.