
tfdv.validate_tensor_examples()?

Open schmidt-jake opened this issue 5 years ago • 4 comments

I think it would be nice to have a top-level function to check for anomalies in serving data. It could be integrated into serving_input_receiver_fn. It doesn't make sense to have to write serving data to tfrecords just to make use of tfdv.generate_statistics_from_tfrecord.

schmidt-jake avatar Jan 30 '20 18:01 schmidt-jake

@JakeTheWise, we can currently check whether serving data has anomalies using the code below:

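import tensorflow_data_validation as tfdv
# serving_data_path points to TFRecord files; schema is an existing TFDV schema
serving_stats = tfdv.generate_statistics_from_tfrecord(data_location=serving_data_path)
serving_anomalies = tfdv.validate_statistics(serving_stats, schema)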

For more information, please refer to this TFDV page.

Please let me know if this is what you are looking for, or if your proposal is different. Thanks!

rmothukuru avatar Jan 31 '20 06:01 rmothukuru

@rmothukuru Yes, I was aware of generate_statistics_from_tfrecord (and I'll update the original post). My question is why I need to write serving data to tfrecords just to check for anomalies.

schmidt-jake avatar Jan 31 '20 14:01 schmidt-jake

Hi Jake -- Are you looking to validate your serving examples one at a time?

caveness avatar Jan 31 '20 16:01 caveness

In the limit, yes. Serving examples could come in one at a time or in batches. Hence the idea of a tensor-based example validator: the serving environment (TensorFlow Model Server) or the model's export signature function receives examples as (dictionaries of) tensor objects.

schmidt-jake avatar Jan 31 '20 17:01 schmidt-jake
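
To make the request concrete, here is a minimal sketch of what validating an in-memory batch could look like, with no TFRecord round trip. It is plain Python and does not use any TFDV API: `FeatureSpec` and `validate_batch` are hypothetical names invented for illustration, and the batch is assumed to be a dictionary mapping feature names to equal-length sequences of already-decoded values (in a real serving function these would be tensors).

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Sequence


@dataclass(frozen=True)
class FeatureSpec:
    """Hypothetical per-feature expectation; NOT a TFDV schema object."""
    dtype: type
    required: bool = True
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    allowed: Optional[FrozenSet[str]] = None


def validate_batch(
    batch: Dict[str, Sequence],
    spec: Dict[str, FeatureSpec],
) -> Dict[str, List[str]]:
    """Return a mapping of feature name -> anomaly messages (empty if clean)."""
    anomalies: Dict[str, List[str]] = {}

    def report(name: str, message: str) -> None:
        anomalies.setdefault(name, []).append(message)

    for name, expected in spec.items():
        if name not in batch:
            if expected.required:
                report(name, "missing feature")
            continue
        for row, value in enumerate(batch[name]):
            # A type mismatch makes the range and domain checks meaningless.
            if not isinstance(value, expected.dtype):
                report(name, f"row {row}: expected {expected.dtype.__name__}, "
                             f"got {type(value).__name__}")
                continue
            if expected.min_value is not None and value < expected.min_value:
                report(name, f"row {row}: {value!r} below minimum {expected.min_value}")
            if expected.max_value is not None and value > expected.max_value:
                report(name, f"row {row}: {value!r} above maximum {expected.max_value}")
            if expected.allowed is not None and value not in expected.allowed:
                report(name, f"row {row}: {value!r} not in allowed values")

    # Features the spec does not know about are flagged as well.
    for name in batch:
        if name not in spec:
            report(name, "unexpected feature")

    return anomalies


# Illustrative spec and batch; all names and values are made up.
example_spec = {
    "age": FeatureSpec(int, min_value=0, max_value=120),
    "country": FeatureSpec(str, allowed=frozenset({"US", "CA"})),
}
example_batch = {"age": [34, -1], "country": ["US", "FR"], "extra": [1.0, 2.0]}
example_anomalies = validate_batch(example_batch, example_spec)
```

The same function handles a single example (sequences of length one) and a larger batch, which matches the "one at a time, or in batches" case described above. A real implementation would presumably check against a TFDV schema rather than a hand-written spec.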