[spark-tensorflow-connector] Cannot read multiple TFRecord files
Using spark.read.format("tfrecord").load("path/to/one-file.tfrecord") works.
How do I read multiple directories with tfrecords in each?
I have tried:
spark.read.format("tfrecord").load(paths: _*), where paths is an array of paths.
spark.read.format("tfrecord").load(path), where path is a regex of tfrecords paths.
I have also tried using path as an option:
spark.read.format("tfrecord").option("path", path).load()
None of these work.
Is there a recommended way to do this?
The format name is "tfrecords" (plural), and both
spark.read.format("tfrecords").load("path/to/*file.tfrecord")
and
spark.read.format("tfrecords").load("path/to/one-file.tfrecord,path/to/another-file.tfrecord")
work for me.
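For several directories, the two forms above can be combined. This is only a rough sketch, assuming the connector accepts globs inside a comma-separated path string as it did for me. The helper names tfrecordGlobs and readTfrecordDirs are made up for illustration, and the ".tfrecord" suffix is an assumption about how the files are named.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Build one comma-separated string of globs, one glob per directory.
// Assumption: every file of interest ends in ".tfrecord".
def tfrecordGlobs(dirs: Seq[String]): String =
  dirs.map(d => s"${d.stripSuffix("/")}/*.tfrecord").mkString(",")

// Pass the joined string to load() as a single path argument,
// rather than as varargs (paths: _*).
def readTfrecordDirs(spark: SparkSession, dirs: Seq[String]): DataFrame =
  spark.read.format("tfrecords").load(tfrecordGlobs(dirs))
```

Called as readTfrecordDirs(spark, Seq("path/to/dir-a", "path/to/dir-b")), where both directory names are placeholders.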
I found the reason: the directories are not searched recursively.
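If the files sit in nested subdirectories, one possible workaround is to walk the tree yourself and hand the reader an explicit list. The sketch below uses the Hadoop FileSystem API for the recursive listing. The function names dirsWithTfrecords and readNested are hypothetical, the ".tfrecord" suffix is an assumption, and I have not checked this against every connector version.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.{DataFrame, SparkSession}
import scala.collection.mutable

// Walk `root` recursively and return the distinct directories that
// directly contain at least one file whose name ends in `suffix`.
def dirsWithTfrecords(root: String, conf: Configuration,
                      suffix: String = ".tfrecord"): Seq[String] = {
  val rootPath = new Path(root)
  val fs = rootPath.getFileSystem(conf)
  val files = fs.listFiles(rootPath, true) // true = recurse into subdirectories
  val dirs = mutable.LinkedHashSet.empty[String]
  while (files.hasNext) {
    val file = files.next().getPath
    if (file.getName.endsWith(suffix)) dirs += file.getParent.toString
  }
  dirs.toSeq
}

// Glob each discovered directory and load everything through one
// comma-separated path string.
def readNested(spark: SparkSession, root: String): DataFrame = {
  val dirs = dirsWithTfrecords(root, spark.sparkContext.hadoopConfiguration)
  spark.read.format("tfrecords").load(dirs.map(d => s"$d/*.tfrecord").mkString(","))
}
```

With a very large number of directories the joined path string gets long, so a multi-level glob such as "path/to/*/*.tfrecord" may be simpler when the nesting depth is fixed.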