docker-zeppelin

Python interpreter for parsing HTML/XML in Zeppelin

Open mkscala opened this issue 8 years ago • 4 comments

I want to use Python libraries such as BeautifulSoup. How can we use external Python packages along with pyspark?
http://omz-software.com/pythonista/docs/ios/beautifulsoup_guide.html
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-consider-HTML-files-in-Spark-td22017.html
https://pypi.python.org/pypi/beautifulsoup4

mkscala avatar Apr 13 '16 17:04 mkscala

%sh pip install beautifulsoup4

I don't know any details about actually using it effectively with Spark. Spark has a very active mailing list and an #apache-spark IRC channel on freenode, which I'm sure will yield better tips.

dylanmei avatar Apr 13 '16 17:04 dylanmei

I get the error below:
Process exited with an error: 127 (Exit value: 127)

mkscala avatar Apr 13 '16 19:04 mkscala
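
Exit status 127 is the shell's conventional code for "command not found", so one plausible reading (an assumption, nothing in the thread confirms it) is that `pip` is not on the PATH seen by `%sh`. A minimal sketch for checking this from Python follows. The candidate names and both helper functions are hypothetical, and `shutil.which` needs Python 3.3 or later.

```python
import shutil
import sys

# Candidate installer names are illustrative, not taken from the thread.
CANDIDATES = ("pip", "pip3", "easy_install")


def find_installers(names=CANDIDATES):
    """Map each command name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


def pip_install_command(package):
    """Build (but do not run) a pip invocation tied to the current interpreter."""
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    for name, path in find_installers().items():
        print(name, "->", path or "not found on PATH")
    # Printed only; run it by hand once an installer is known to exist.
    print(" ".join(pip_install_command("beautifulsoup4")))
```

If every entry comes back as not found, the container most likely has no pip, and it would have to be installed before `pip install beautifulsoup4` can work.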

How can I add the basic Python interpreter within this docker-zeppelin? Have you tried that?

mkscala avatar Apr 13 '16 19:04 mkscala

There is only the pyspark interpreter. Perhaps you'd get more mileage with this: https://github.com/jupyter/docker-stacks

dylanmei avatar Apr 13 '16 20:04 dylanmei
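
For readers who land here with the original question, here is a minimal sketch of HTML parsing that could run in a pyspark paragraph. It assumes Python 3 and uses BeautifulSoup when `beautifulsoup4` is importable, falling back to the standard library's `html.parser` otherwise. The function name `extract_links` and the sample HTML are made up for illustration, and nothing here has been tested inside docker-zeppelin.

```python
from html.parser import HTMLParser

try:
    from bs4 import BeautifulSoup  # third-party; requires beautifulsoup4
except ImportError:
    BeautifulSoup = None  # fall back to the standard library below


class _LinkCollector(HTMLParser):
    """Collects non-empty href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return the href of every <a> tag that has a non-empty one."""
    if BeautifulSoup is not None:
        soup = BeautifulSoup(html, "html.parser")
        return [a["href"] for a in soup.find_all("a") if a.get("href")]
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical sample input, not from the thread.
sample = (
    '<html><body><a href="https://spark.apache.org">Spark</a>'
    "<a>no href</a></body></html>"
)
print(extract_links(sample))

# With Spark, the same function could be applied per file, for example
#   sc.wholeTextFiles(path).mapValues(extract_links)
# where `sc` is the SparkContext and `path` is a placeholder. This assumes
# the function's dependencies are also available on the worker nodes.
```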