tfio.IODataset.from_parquet fails to load array elements
import pandas as pd
import tensorflow_io as tfio

test_data = pd.DataFrame({'a': [[1, 2, 3], [4, 5, 6]], 'b': ['q', 'p']})
test_data.to_parquet('a.parquet')
another_data = tfio.experimental.IODataset.from_parquet('a.parquet').as_numpy_iterator()
next(another_data)
The result is:
OrderedDict([('a.list.item', 1), ('b', b'q')])
It seems that only the first element of the array is loaded.
I have the same need. When I use tfio.IODataset.from_parquet to load the same kind of parquet file with array elements, it always reports "tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 0 of dimension 0 out of bounds. [Op:StridedSlice] name: IOFromParquet/ParquetIODataset/strided_slice/"
Same problem.
import pandas as pd
import tensorflow_io as tfio

df = pd.DataFrame({
    "scores": [[.1, .2, .3], [.4, .5, .6], [.7, .8, .9]],
})
df.to_parquet("test.parquet")
ds = tfio.IODataset.from_parquet("test.parquet")
for el in ds:
    print(el)
print(ds.element_spec)
gives
OrderedDict([(b'scores.list.item', <tf.Tensor: shape=(), dtype=float64, numpy=0.1>)])
OrderedDict([(b'scores.list.item', <tf.Tensor: shape=(), dtype=float64, numpy=0.2>)])
OrderedDict([(b'scores.list.item', <tf.Tensor: shape=(), dtype=float64, numpy=0.3>)])
OrderedDict([(b'scores.list.item', TensorSpec(shape=(), dtype=tf.float64, name=None))])
The tfio version is 0.24.0.
+1 Same issue. Anyone found a resolution?
+1 same issue.
+1 same issue
+1 same issue
Having the same issue. I'd love to be using IODataset.from_parquet instead of putting together a custom generator and creating a dataset from that. Any idea if/when this issue will be picked up?
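For anyone who ends up going the custom-generator route mentioned above, here is a minimal sketch of what that might look like. It assumes pandas (with a Parquet engine such as pyarrow) and TensorFlow are installed. `row_dicts` and `parquet_to_dataset` are hypothetical helper names, not part of tfio, and the whole file is read into memory, so this only suits data that fits in RAM.

```python
def row_dicts(columns):
    """Turn {'col': [row0, row1, ...]} into one dict per row."""
    names = list(columns)
    for values in zip(*(columns[name] for name in names)):
        yield dict(zip(names, values))


def parquet_to_dataset(path, output_signature):
    """Hypothetical helper: expose a Parquet file as a tf.data.Dataset."""
    # Imported lazily so row_dicts stays usable without these packages.
    import pandas as pd
    import tensorflow as tf

    def gen():
        # Assumption: the file fits in memory and a Parquet engine is installed.
        df = pd.read_parquet(path)
        yield from row_dicts({c: df[c].tolist() for c in df.columns})

    return tf.data.Dataset.from_generator(gen, output_signature=output_signature)
```

With the test.parquet file from the earlier comment, a signature along the lines of {"scores": tf.TensorSpec(shape=(None,), dtype=tf.float64)} should give one whole list per dataset element rather than one scalar, though I have only sketched this and not run it against every tfio/TensorFlow version mentioned in this thread.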
+1 same issue. Has anyone found the right way to use it?
+1 same issue.
Since I am working in Databricks/PySpark, I will likely use the petastorm library to load the parquet files into a TensorFlow dataset using their make_petastorm_dataset() wrapped around their make_batch_reader() function. This does not really fix the problem with TensorFlow IO, but it could be an option for some of you. And I'll gladly be kept in the loop for a solution!
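A rough sketch of the petastorm route described above, assuming petastorm and TensorFlow are installed. `to_file_url` and `first_batch` are hypothetical helper names for illustration only. I have not checked how list columns come through make_batch_reader(), so treat this as a starting point rather than a confirmed fix.

```python
from pathlib import Path


def to_file_url(path):
    """Build a file:// URL from a local path, the form petastorm readers take."""
    return Path(path).resolve().as_uri()


def first_batch(path):
    """Hypothetical helper: read one batch from a Parquet file via petastorm."""
    # Imported lazily so to_file_url works without petastorm installed.
    from petastorm import make_batch_reader
    from petastorm.tf_utils import make_petastorm_dataset

    # The dataset is only valid while the reader is open, so consume it here.
    with make_batch_reader(to_file_url(path)) as reader:
        dataset = make_petastorm_dataset(reader)
        for batch in dataset.take(1):
            return batch
```

Note that the reader yields batches of rows (one per Parquet row group by default), not single rows, so downstream code may need an unbatch step.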
Hello, your email has been received.
It seems this problem is not going to be solved any time soon!
Also facing this issue on 0.26.0. Any update on when this will be fixed?
+1
Facing the same issue with 0.31.0.
+1
+1
+1
+1