
Any plans to support time series data?

Open amithadiraju1694 opened this issue 4 years ago • 0 comments

@dmmiller612 SparkFlow is a great API! Thanks for the super work. Are there any plans to develop a TimeSeriesGenerator wrapper similar to this one: TensorFlow TimeSeriesGenerator?

I have a huge DataFrame (>60 GB) in PySpark that I want to model with a time series approach, using an API similar to the one above. At that size I can't use .collect(), and none of PySpark's partitioning functions seem to partition data sequentially. I looked around for Spark-based time series libraries and found Spark-TS, but its Python docs are corrupted and there is no clear, easy way to transform data into time series form. It would be super helpful if SparkFlow could support time series data in the future. :)
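
To make the request concrete, here is a minimal sketch of the kind of windowing I mean, in plain Python. It is not SparkFlow or Spark-TS API: `sliding_windows` and its `length` parameter are hypothetical names used only for illustration, and it assumes the values arrive already sorted by time. It streams over an iterator, so nothing has to be collected into memory at once.

```python
from collections import deque
from typing import Iterable, Iterator, List, Tuple


def sliding_windows(
    rows: Iterable[float], length: int
) -> Iterator[Tuple[List[float], float]]:
    """Yield (window, target) pairs from time-ordered values.

    Hypothetical helper, not part of any library. Each window holds
    `length` consecutive values; the target is the value right after it.
    """
    if length < 1:
        raise ValueError("length must be >= 1")
    # A bounded deque drops the oldest value automatically on append.
    window = deque(maxlen=length)
    for value in rows:
        if len(window) == length:
            # The window is full, so the current value is its target.
            yield list(window), value
        window.append(value)


if __name__ == "__main__":
    for features, target in sliding_windows([1.0, 2.0, 3.0, 4.0, 5.0], length=3):
        print(features, target)
```

Something like this could in principle be passed to `rdd.mapPartitions` (with `length` bound) once each partition is sorted by time, but windows that cross partition boundaries would be lost, which is exactly the part I can't solve with the existing partition functions.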

amithadiraju1694 · Jun 04 '20 18:06