ecosystem
Does the Spark TensorFlow connector support in-memory transformation?
I have a lot of data in Parquet format on HDFS that I want to use with TensorFlow. Currently, I read the Parquet files into Spark and write the DataFrame out as temporary TFRecord files for TensorFlow to read, but this costs both time and disk space. Is there a way to read the Parquet files and make them available to TensorFlow directly?
No, but I think it makes sense to add support for that to TensorFlow, so contributions are welcome. It should be exposed as a tf.data.Dataset. @mrry FYI