
Does Spark tf connect support in-memory transform?

Open: darouwan opened this issue 6 years ago • 1 comment

I have a lot of data in Parquet format on HDFS that I want to use with TensorFlow. Currently, I read the Parquet files into Spark and write the DataFrame out as temporary TFRecord files for TensorFlow to consume, but this costs both time and disk space. Is there a way to read Parquet files and make the data available to TensorFlow directly, without the intermediate files?

darouwan, Sep 25 '18 07:09

Nope, but I think it makes sense to add support for that to TensorFlow, so contributions are welcome. It should be implemented as a tf.data.Dataset. @mrry FYI

jhseu, Sep 25 '18 21:09
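
For illustration only, here is a minimal sketch of what feeding Parquet data to TensorFlow through a tf.data.Dataset might look like without writing TFRecord files. It is not the native support discussed above, just a Python-level workaround that assumes the pyarrow package is installed and uses tf.data.Dataset.from_generator. The helper names (columns_to_rows, parquet_row_generator, make_dataset) are hypothetical, and the sketch reads from a path that pyarrow can open directly; reading from HDFS would need additional filesystem setup that is not shown.

```python
from typing import Dict, Iterator, List, Sequence


def columns_to_rows(columns: Dict[str, Sequence]) -> Iterator[tuple]:
    """Turn a column-oriented batch into per-row tuples, in dict key order."""
    names = list(columns)
    for values in zip(*(columns[name] for name in names)):
        yield values


def parquet_row_generator(path: str, column_names: List[str],
                          batch_size: int = 1024) -> Iterator[tuple]:
    """Stream rows from one Parquet file, one record batch at a time."""
    # Assumption: pyarrow is available; imported lazily so that the pure
    # Python helper above can be used without it.
    import pyarrow.parquet as pq

    parquet_file = pq.ParquetFile(path)
    for batch in parquet_file.iter_batches(batch_size=batch_size,
                                           columns=column_names):
        cols = batch.to_pydict()
        # Re-key explicitly so each row tuple follows column_names order.
        yield from columns_to_rows({name: cols[name] for name in column_names})


def make_dataset(path: str, column_names: List[str], output_signature):
    """Wrap the row generator in a tf.data.Dataset (hypothetical helper).

    output_signature is a tuple of tf.TensorSpec, one per column, in the
    same order as column_names.
    """
    import tensorflow as tf

    return tf.data.Dataset.from_generator(
        lambda: parquet_row_generator(path, column_names),
        output_signature=output_signature,
    )
```

With a hypothetical file data.parquet holding an integer column x and a float column y, the call would be make_dataset("data.parquet", ["x", "y"], (tf.TensorSpec((), tf.int64), tf.TensorSpec((), tf.float32))), followed by the usual .batch(...) and .prefetch(...). Because the rows pass through a Python generator, this is likely to be slower than a native reader, so treat it as a stopgap rather than a replacement for the proposed support.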