data-validation
Planned support for Hive tables?
@zhaiyuyong TFDV uses Apache Beam to read input data. The Beam Python SDK does not currently support reading Hive tables out of the box, so there are two options for now:
- Export your Hive table as a CSV or TFRecord file and then run TFDV on it.
- Write a custom Beam transform that reads the Hive table and decodes it into TFDV's in-memory dictionary representation. Follow the instructions here to construct a pipeline with a custom decoder.
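
For the second option, here is a rough sketch of what the decode step could look like. It is not TFDV's own decoder: the names `decode_hive_row`, `to_numpy_example` and `run` are hypothetical, `beam.Create` is only a stand-in for whatever actually reads rows from Hive, and the target format (feature name mapped to a numpy array of values, or `None` when the value is missing) is an assumption about the dictionary representation mentioned above, so check it against the linked instructions for your TFDV version.

```python
from typing import Any, Dict, List, Optional


def decode_hive_row(row: Dict[str, Any]) -> Dict[str, Optional[List[Any]]]:
    """Map one row (column name -> scalar, None for NULL) to a feature dict.

    Each non-NULL value becomes a single-element list; NULL stays None.
    """
    return {
        name: (None if value is None else [value])
        for name, value in row.items()
    }


def to_numpy_example(row: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap the decoded values in numpy arrays (assumed target format)."""
    import numpy as np  # imported lazily; only needed inside the pipeline

    decoded = decode_hive_row(row)
    return {
        name: (None if values is None else np.array(values))
        for name, values in decoded.items()
    }


def run(rows: List[Dict[str, Any]]) -> None:
    """Run a small local pipeline over rows that were already fetched."""
    import apache_beam as beam  # imported lazily; requires apache-beam

    with beam.Pipeline() as pipeline:
        _ = (
            pipeline
            # Stand-in source: replace with your own Hive-reading transform.
            | "StandInForHiveRead" >> beam.Create(rows)
            | "DecodeRows" >> beam.Map(to_numpy_example)
            # The TFDV statistics step would follow here.
            | "PrintExamples" >> beam.Map(print)
        )
```

Calling `run([{"age": 31, "city": "Paris", "income": None}])` would push one hand-written row through the decode step. Keeping the row decoding in a plain function makes it easy to unit test without Beam or Hive available.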
@katsiapis @aaltay