asammdf - pyspark integration

Open Spratiher9 opened this issue 2 years ago • 4 comments

Bring support for pyspark integration with asammdf.

Spratiher9, Nov 10 '21 18:11

How would it work?

danielhrisca, Nov 16 '21 06:11

I would pass a list of paths to MDF files and run my Python logic on all of those files in parallel using Spark.

Spratiher9, Nov 16 '21 13:11

@Spratiher9 You are using Spark to process MDF files? Can you show me your code?

shangrilaer, Dec 07 '21 13:12

@shangrilaer For now, what I do is keep a Spark DataFrame with a single column that contains the paths of the MDF files, and then basically do:

def f(row):
  '''
  Some Python logic that uses asammdf to process
  the MDF file whose path is stored in this row.
  Spark calls it once per row, in parallel on
  the executors.
  '''

# map() is lazy: nothing runs until an action such as .collect() is called.
df.rdd.map(f)

Spratiher9, Dec 07 '21 18:12
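
For anyone who wants to try the pattern described above, here is a minimal sketch of it. It is only an illustration under a few assumptions, not something asammdf ships: the names `summarize_mdf` and `run_with_spark` are made up for this example, the per-file "logic" is just counting channel names, the file paths are assumed to be readable from every executor (for example on shared storage), and asammdf is assumed to be installed on the executors.

```python
def summarize_mdf(path):
    """Hypothetical per-file logic: open one MDF file and count its channel names.

    Returns (path, channel_count, error). Any failure is reported in the
    tuple instead of being raised, so one bad file does not fail the job.
    """
    try:
        # Imported here so the import happens on the executor that runs the task.
        from asammdf import MDF

        with MDF(path) as mdf:
            return (path, len(mdf.channels_db), None)
    except Exception as exc:  # missing file, unreadable file, asammdf not installed
        return (path, None, repr(exc))


def run_with_spark(paths):
    """Hypothetical driver: one-column DataFrame of paths, mapped over the executors."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("asammdf-sketch").getOrCreate()
    # Single string column named "path", as in the approach described in the thread.
    df = spark.createDataFrame([(p,) for p in paths], "path string")
    # map() is lazy; collect() is the action that actually runs the tasks.
    return df.rdd.map(lambda row: summarize_mdf(row["path"])).collect()
```

Calling `run_with_spark(["/data/a.mf4", "/data/b.mf4"])` (placeholder paths) would return one tuple per file. Returning small summaries rather than whole `MDF` objects keeps the data sent back to the driver small.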