python-spark-streaming
Passing a range object to SparkContext.parallelize
Hello Matthew,
I see in the join demo that you pass a range object to SparkContext.parallelize
https://github.com/jleetutorial/python-spark-streaming/blob/master/3_advanced/1_Stream-Stream%20Join%20Demo.ipynb
In the Spark documentation it says that SparkContext.parallelize takes an iterable or a collection.
https://spark.apache.org/docs/2.1.1/programming-guide.html#parallelized-collections
I am not sure that a range object counts as either of those (an iterable or a collection), so I was surprised that the example works. In fact, it does not work for me: the resulting RDD seems to be empty.
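One way to check the first part outside Spark is the following minimal sketch. It assumes Python 3 (on Python 2, `range` returns a plain list instead) and uses only the standard library, so the variable names are just for illustration:

```python
from collections.abc import Iterable, Sequence

r = range(5)

# range is lazy, but it can still be tested against the standard
# iterable and sequence abstract base classes.
is_iterable = isinstance(r, Iterable)
is_sequence = isinstance(r, Sequence)

# Materializing the range shows the elements it would yield.
elements = list(r)

print(is_iterable, is_sequence, elements)
```

This only shows what Python itself reports about `range`. It does not explain the empty RDD, which may depend on the Spark and Python versions in use.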
Any comments on this would be appreciated.
Thanks
Jorge