
Passing a range object to SparkContext.parallelize

Open jazberna1 opened this issue 5 years ago • 0 comments

Hello Matthew,

I see in the join demo that you pass a range object to SparkContext.parallelize:

https://github.com/jleetutorial/python-spark-streaming/blob/master/3_advanced/1_Stream-Stream%20Join%20Demo.ipynb

The Spark documentation says that SparkContext.parallelize takes an iterable or a collection:

https://spark.apache.org/docs/2.1.1/programming-guide.html#parallelized-collections

I am not sure that a range object counts as one of those (an iterable or a collection), so I was surprised that the example works. Actually, it does not work for me: the resulting RDD seems to be empty.
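For reference, here is a quick check in plain Python 3, with no Spark involved. It is only a sketch to test the "is a range an iterable or a collection" part of the question; the variable names are made up for illustration, and it says nothing about how parallelize treats the object internally.

```python
from collections.abc import Iterable, Sequence

# Hypothetical stand-in for the range passed in the demo notebook.
r = range(5)

# In Python 3, range is a lazy sequence type: it can be iterated,
# and it also supports len() and indexing.
is_iterable = isinstance(r, Iterable)
is_sequence = isinstance(r, Sequence)

# Materializing the range gives an ordinary list with the same elements.
as_list = list(r)

print(is_iterable, is_sequence, as_list)
```

So at the Python level a range does qualify as an iterable. If the RDD still comes back empty, one way to narrow it down might be to pass `list(range(...))` to `sc.parallelize` instead (assuming `sc` is the SparkContext from the notebook) and compare the results.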

Any comment on this would be appreciated.

Thanks

Jorge

jazberna1 · Apr 14 '19 08:04