DataProfiler
What's in your data? Extract schema, statistics and entities from datasets
Investigate whether these instances of `# type: ignore` are needed. PRs should be small (at most a few files).

profilers/base_column_profilers.py
- https://github.com/capitalone/DataProfiler/blob/main/dataprofiler/profilers/base_column_profilers.py#L20
- https://github.com/capitalone/DataProfiler/blob/main/dataprofiler/profilers/base_column_profilers.py#L255

profilers/column_profile_compilers.py
- https://github.com/capitalone/DataProfiler/blob/main/dataprofiler/profilers/column_profile_compilers.py#L25
- https://github.com/capitalone/DataProfiler/blob/main/dataprofiler/profilers/column_profile_compilers.py#L31

profilers/data_labeler_column_profile.py...
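One way to get an overview before opening small PRs is to list every `# type: ignore` comment in a file. The sketch below is only an illustration, not part of DataProfiler: the helper name `find_type_ignores` and the sample source are hypothetical. It uses the standard-library `tokenize` module so that string literals that merely contain the text are not counted.

```python
import io
import tokenize


def find_type_ignores(source: str) -> list[tuple[int, str]]:
    """Return (line_number, comment_text) for each `# type: ignore` comment.

    Hypothetical helper for illustration; it only inspects comment tokens,
    so the same text inside a string literal is skipped.
    """
    hits = []
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type != tokenize.COMMENT:
            continue
        # Drop the leading '#' and spaces, then check the comment prefix.
        if tok.string.lstrip("# ").startswith("type: ignore"):
            hits.append((tok.start[0], tok.string))
    return hits


# Made-up sample source; it is tokenized, never executed.
sample = (
    "import foo  # type: ignore\n"
    "x = 1\n"
    's = "# type: ignore"\n'
    "y = foo.bar  # type: ignore[attr-defined]\n"
)

for line_no, comment in find_type_ignores(sample):
    print(line_no, comment)
```

Whether a given comment is still needed is a separate question; if the project type-checks with mypy, its `--warn-unused-ignores` flag is one way to surface ignores that no longer suppress anything.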
How about this as a starting point for adding some helpful printing of Data Profiler options?

```
import dataprofiler as dp

profile_options = dp.ProfilerOptions()
print(profile_options)
```
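As a rough sketch of what "helpful printing" could look like, the example below flattens a nested options object into dotted `name = value` lines. It does not use or describe DataProfiler's actual classes: `ToyOptions`, `ToyTextOptions`, their attributes, and `format_options` are all hypothetical stand-ins chosen for illustration.

```python
def format_options(obj, prefix=""):
    """Flatten an object's attributes into sorted 'dotted.path = value' lines.

    Hypothetical helper: any attribute that itself has a __dict__ is treated
    as a nested options group and expanded recursively.
    """
    lines = []
    for name, value in sorted(vars(obj).items()):
        path = f"{prefix}{name}"
        if hasattr(value, "__dict__"):
            lines.extend(format_options(value, prefix=path + "."))
        else:
            lines.append(f"{path} = {value!r}")
    return lines


class ToyTextOptions:
    """Stand-in for a nested options group (not a DataProfiler class)."""

    def __init__(self):
        self.is_enabled = True
        self.max_length = 255


class ToyOptions:
    """Stand-in for a top-level options object (not a DataProfiler class)."""

    def __init__(self):
        self.text = ToyTextOptions()
        self.sample_size = None

    def __str__(self):
        return "\n".join(format_options(self))


print(ToyOptions())
```

Sorting the attribute names keeps the output stable between runs, which makes printed options easier to diff.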
**General Information:**
- OS: `linux/x86_64`
- Python version: `3.10.14`
- Library version: `DataProfiler==0.10.8`

**Describe the bug:**
On line: https://github.com/capitalone/DataProfiler/blob/f8b3e5dbd4b76f0ecc291911ace9e8e21cf1ecb1/dataprofiler/labelers/labeler_utils.py#L360 I receive the error: `TypeError: Metric.add_weight() got multiple values for argument...
**General Information:**
- OS: `Ubuntu 22.04`
- Python version: `3.10.12`
- Library version: `0.10.9`

**Describe the bug:**
I have a parquet file column `org_number` that should be treated as `text`...
**General Information:**
- OS: Linux
- Python version: 3.9.18
- Library version: 0.10.9

**Describe the bug:**
Dask changed a couple of things after Feb 9, 2024. We have had to...
This is related to some of the discussion in #1098. In my testing, I have a single dataset. I am running this in a Docker container. I'm running with the...
I attempted to create my own custom labeler by using the transfer learning example in the documentation. I attempted to add three labels to the labeler: Name, Datetime (which is...
**General Information:**
- OS: Sonoma 14.5
- Python version: 3.9
- Library version: 0.12.0

**Describe the bug:**
The bug will produce the following in the final line of the output:...
Vulnerable Library - urllib3-1.26.18-py2.py3-none-any.whl

HTTP library with thread-safe connection pooling, file post, and more.

Library home page: https://files.pythonhosted.org/packages/b0/53/aa91e163dcfd1e5b82d8a890ecf13314e3e149c05270cc644581f77f17fd/urllib3-1.26.18-py2.py3-none-any.whl
Path to dependency file: /requirements.txt
Path to vulnerable library: /requirements.txt

## Vulnerabilities...