python-spark-tutorial

No examples on integrating pandas

Open tappoz opened this issue 7 years ago • 0 comments

These tutorials could be improved with some tips, examples, or workarounds on how to integrate Python code written by data scientists who are used to libraries like pandas and numpy. There is no example of how to include third-party libraries (from pip, conda, etc.) in PySpark, which runs on top of the JVM-based Spark engine, or of how to make the two different APIs work together.
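To make the request concrete, here is a rough sketch of the kind of example I have in mind. It assumes Spark 3.0 or later (for `applyInPandas`) and that pandas, numpy, and pyarrow are installed on the driver and on every worker. The function names, the column names `sensor` and `reading`, and the `local[2]` master are placeholders I made up for illustration, not anything from the tutorials.

```python
import numpy as np
import pandas as pd


def add_zscore(pdf: pd.DataFrame) -> pd.DataFrame:
    """Plain pandas/numpy logic: z-score of `reading` within one group."""
    values = pdf["reading"].to_numpy(dtype=float)
    std = values.std()  # population standard deviation (numpy default)
    # Avoid dividing by zero when every reading in the group is identical.
    z = (values - values.mean()) / std if std > 0 else np.zeros_like(values)
    return pdf.assign(zscore=z)


def run_on_spark(pdf: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: run add_zscore per group on a local Spark session."""
    # Imported lazily so add_zscore can be used and tested without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[2]").appName("pandas-demo").getOrCreate()
    )
    sdf = spark.createDataFrame(pdf)  # pandas DataFrame -> Spark DataFrame
    result = sdf.groupBy("sensor").applyInPandas(
        add_zscore, schema="sensor string, reading double, zscore double"
    )
    out = result.toPandas()  # Spark DataFrame -> pandas DataFrame
    spark.stop()
    return out


sample = pd.DataFrame(
    {"sensor": ["a", "a", "b", "b"], "reading": [1.0, 3.0, 10.0, 10.0]}
)

# The same function works on plain pandas data, without any Spark session.
group_a = add_zscore(sample[sample["sensor"] == "a"])
group_b = add_zscore(sample[sample["sensor"] == "b"])
```

Calling `run_on_spark(sample)` would apply the same function to each `sensor` group on the Spark side. Keeping the pandas logic in an ordinary function means it can be developed and unit-tested locally before it is handed to Spark.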

tappoz, Jun 19 '18 15:06