net.jgp.books.spark.ch09

Datasource example for Python

Open purna344 opened this issue 5 years ago • 4 comments

I am following the data source example and noticed that there is no code for Python.

Please let me know how to access the Scala data source by its short name from Python code.

https://github.com/jgperrin/net.jgp.books.spark.ch09/blob/master/src/main/python/lab400_photo_datasource/photoMetadataIngestionApp.py => this file is empty

purna344 avatar Oct 15 '20 21:10 purna344

Hey @purna344, thanks for getting in touch... The Python and Scala code is supported by @rambabu-posa, so I'll let Ram chime in on this one. A quick note though:

  1. You probably cannot write your data source in Python (but I am not 100% sure about that, as my Python is pretty rusty).
  2. You should be able to use the short name in the Python code, but you will depend on a JAR containing your Java/Scala code.
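To illustrate point 2, here is a minimal sketch of what the Python side could look like. I have not run it against this repository: the JAR path, the short name `exif`, the photo directory, and the `recursive` option are all assumptions for illustration. It also assumes the JAR registers the data source through a `META-INF/services/org.apache.spark.sql.sources.DataSourceRegister` file, which is how Spark resolves short names.

```python
# Sketch only: the JAR path, the short name, the input path, and the option
# names used below are hypothetical and not confirmed in this thread.


def create_session(jar_path):
    """Start a local SparkSession with the data source JAR on the classpath."""
    # Imported lazily so the helper below can be used without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName("Photo metadata ingestion (sketch)")
        .master("local[*]")
        # spark.jars takes a comma-separated list of JARs to ship to the JVM.
        .config("spark.jars", jar_path)
        .getOrCreate()
    )


def read_with_short_name(spark, short_name, path, **options):
    """Load `path` through a JVM data source registered under `short_name`."""
    return spark.read.format(short_name).options(**options).load(path)


# Example usage (all values hypothetical):
#   spark = create_session("target/my-datasource.jar")
#   df = read_with_short_name(spark, "exif", "data/photos", recursive="true")
#   df.printSchema()
#   spark.stop()
```

The Python code only passes the short name as a string; the actual data source still runs in the JVM, which is why the JAR has to be on the classpath before the session starts.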

@rambabu-posa - can you add to that and help @purna344 ?

jgperrin avatar Oct 16 '20 17:10 jgperrin

Thanks @jgperrin. Hi @purna344, implementing a custom data source in PySpark felt a bit complex to me, so I left that file empty. I will try to work on it again and update you.

rambabu-posa avatar Oct 17 '20 06:10 rambabu-posa

@rambabu-posa can you provide an update or close the issue?

jgperrin avatar Jun 13 '21 15:06 jgperrin

@jgperrin it's a bit tough to implement.

rambabu-posa avatar Jun 14 '21 09:06 rambabu-posa