data-validation
Library for exploring and validating machine learning data
Towards the goal of adding support for computing statistics over structured data (e.g., arbitrary protocol buffers, parquet data), we will populate [`path`](https://github.com/tensorflow/metadata/blob/master/tensorflow_metadata/proto/v0/statistics.proto#L105) for each feature instead of [`name`](https://github.com/tensorflow/metadata/blob/master/tensorflow_metadata/proto/v0/statistics.proto#L102) in the...
Does TFDV support reading tf.SequenceExample from TFRecords, inferring a schema over them, and computing statistics from them?
Hi Paul, as discussed on SO, please find below my feedback on what I think would be nice additions to this already great library: 1) possibility to merge generated stats:...
Hi, Problem: - I am unable to use tfdv with poetry due to dependencies not being resolved. For simplicity and debugging purposes, below are the steps to recreate the issues:...
The joblib package, in versions from 0 up to but not including 1.2.0, is vulnerable to Arbitrary Code Execution via the pre_dispatch flag in the Parallel() class due to the eval() statement. My PR: [https://github.com/tensorflow/data-validation/pull/225...
The joblib package, in versions from 0 up to but not including 1.2.0, is vulnerable to Arbitrary Code Execution via the pre_dispatch flag in the Parallel() class due to the eval() statement.
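To check whether an environment falls in the version range described in this report, something like the following minimal sketch could be used. It relies only on the Python standard library; the helper names (`parse_version`, `is_affected_joblib`, `installed_joblib_is_affected`) are hypothetical and not part of joblib or TFDV, and the `1.2.0` boundary is taken directly from the report above.

```python
import re
from importlib import metadata

# First release outside the affected range, as stated in the report above.
FIXED_VERSION = (1, 2, 0)


def parse_version(version: str) -> tuple:
    """Extract the leading numeric release components of a version string.

    For example, '1.1.0.post1' becomes (1, 1, 0). Pre-release suffixes such
    as 'rc1' are ignored, which is a simplification made for this sketch.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad short versions such as '1.2' to three components.
    return parts + (0,) * (3 - len(parts))


def is_affected_joblib(version: str) -> bool:
    """Return True if the given joblib version is older than 1.2.0."""
    return parse_version(version) < FIXED_VERSION


def installed_joblib_is_affected():
    """Check the installed joblib; return None if joblib is not installed."""
    try:
        return is_affected_joblib(metadata.version("joblib"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("joblib affected:", installed_joblib_is_affected())
```

The check only compares version numbers; it does not inspect whether `pre_dispatch` is ever passed untrusted input.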
Hello, I am trying to display the schema and statistics in a dashboard/UI using streamlit. I have incorporated the code in a .py file. But the visuals are coming in...
I've been looking through the detectable [anomalies](https://github.com/tensorflow/metadata/blob/master/tensorflow_metadata/proto/v0/anomalies.proto) and realized that I don't think there's a way to accomplish what I'd like to accomplish, which is to enforce a distribution constraint on...
Fixed the "Install NumPy" instructions link to point to "https://numpy.org/install/"