DataProfiler
Tensorflow error when trying to use labeler
General Information:
- OS: Windows 10 Enterprise
- Python version: 3.9.12
- Library version: downloaded 2 weeks ago
Describe the bug: I am getting a TensorFlow error when trying to use the labeler on a CSV file. This is the code:
    import dataprofiler as dp
    # load data and data labeler
    data = dp.Data("DATASET FOR SPARK 02022022.csv")
    data_labeler = dp.DataLabeler(labeler_type='structured')
    # make predictions and get labels per cell
    predictions = data_labeler.predict(data)
To Reproduce: I tried to just generate a report, and got this error:
/dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/profilers/profile_builder.py:595: RuntimeWarning:
!!! WARNING Partial Profiler Failure !!!
Profiling Type: data_labeler Exception: TypeError Message: bases must be types
For labeler errors, try installing the extra ml requirements via:
$ pip install dataprofiler[ml] --user
utils.warn_on_profile('data_labeler', e)
I tried uninstalling and reinstalling the profiler, but that did not work. I also ran the command mentioned in the error message, and it reported that the requirement was already installed.
Expected behavior:
Screenshots:
I can't attach screenshots, but this is the main error I get:
TypeError Traceback (most recent call last)
/tmp/ipykernel_2214706/1459032405.py in <module>
/dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/data_labelers.py in __new__(cls, labeler_type, dirpath, load_options, trainable)
92 data_labeler._default_model_loc)
93 return TrainableDataLabeler(dirpath, load_options)
---> 94 return data_labeler(dirpath, load_options)
95
96 @classmethod
/dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/base_data_labeler.py in __init__(self, dirpath, load_options)
45 dirpath = os.path.join(default_labeler_dir,
46 self._default_model_loc)
---> 47 self._load_data_labeler(dirpath, load_options)
48
49 def __eq__(self, other):
/dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/base_data_labeler.py in _load_data_labeler(self, dirpath, load_options)
537
538 # setup data labeler based on parameters
--> 539 self._load_model(model_params.get('class'), dirpath)
540 self._load_preprocessor(preprocessor_params.get('class'), dirpath)
541 self._load_postprocessor(postprocessor_params.get('class'), dirpath)
/dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/base_data_labeler.py in _load_model(self, model_class, dirpath)
471 """
472 if isinstance(model_class, str):
--> 473 model_class = BaseModel.get_class(model_class)
474 if not model_class:
475 raise ValueError('model_class, {}, was not set in load_options '
/dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/base_model.py in get_class(cls, class_name)
121 # Import possible internal models
122 from .regex_model import RegexModel
--> 123 from .character_level_cnn_model import CharacterLevelCnnModel
124
125 return cls._BaseModel__subclasses.get(class_name.lower(), None)
/dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/character_level_cnn_model.py in <module>
/dsdata/miniconda3/lib/python3.8/site-packages/tensorflow/__init__.py in <module>
/dsdata/miniconda3/lib/python3.8/site-packages/tensorflow/python/__init__.py in <module>
/dsdata/miniconda3/lib/python3.8/site-packages/tensorflow/python/eager/context.py in <module>
/dsdata/miniconda3/lib/python3.8/site-packages/tensorflow/core/framework/function_pb2.py in <module>
/dsdata/miniconda3/lib/python3.8/site-packages/google/protobuf/descriptor.py in <module>
TypeError: bases must be types
Additional context:
I was just running the labeler and the profiler on CSV files when this happened today. My coworker downloaded the package today and is receiving the same errors. My email is [email protected].
Any insight or help would be greatly appreciated - thank you!
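The traceback above ends inside the tensorflow import, in google/protobuf/descriptor.py, so one way to narrow things down is to import each layer on its own, outside of DataProfiler. The following is only a rough sketch of that check: the helper name try_import is made up for illustration, and the list of modules is an assumption taken from the paths in the traceback.

```python
import importlib


def try_import(module_name):
    """Import a module by name; return (succeeded, error description or None)."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # a broken install can raise more than ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, None


if __name__ == "__main__":
    # Module names taken from the traceback; adjust to your environment.
    for name in ("google.protobuf", "tensorflow", "dataprofiler"):
        ok, error = try_import(name)
        print(f"{name}: {'ok' if ok else error}")
```

If tensorflow fails here with the same TypeError, the problem most likely sits in the environment rather than in DataProfiler itself.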

@afolga Do you know what version of TF was installed when you ran the package? Thanks!
Is your coworker also using Windows?
Excuse my stupidity, but what is TF? And yes, he is.
No worries, TF is TensorFlow.
pip freeze should give you a readout of which version of TF is installed.
If you are using a notebook:
!pip freeze
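As an alternative to scanning the full pip freeze output, the versions can also be read from Python with the standard library's importlib.metadata (available from Python 3.8). This is a minimal sketch, not an official DataProfiler check: the helper name installed_versions is hypothetical, and protobuf is included in the list only because the traceback ends in google/protobuf/descriptor.py.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package list is an assumption based on this thread.
    found = installed_versions(["tensorflow", "protobuf", "dataprofiler"])
    for name, version in found.items():
        print(f"{name}: {version or 'not installed'}")
```

In a notebook, the same function can be called from a cell and the resulting dictionary pasted into the issue.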
Ok, thanks. When I run it, I get this at the tensorflow spot:
tensorflow @ file:///scratch/env/opence1.5.1/conda-bld/tensorflow-base_1642721082345/work/tensorflow_pkg/tensorflow-2.7.0-cp38-cp38-linux_ppc64le.whl
tensorflow-datasets==4.4.0
tensorflow-estimator @ file:///scratch/env/opence1.5.1/conda-bld/tensorflow-estimator_1642799856821/work/tensorflow_estimator_pkg/tensorflow_estimator-2.7.0-py2.py3-none-any.whl
tensorflow-hub @ file:///scratch/env/opence1.5.1/conda-bld/tensorflow-hub_1643062105665/work/tensorflow_hub_pkg/tensorflow_hub-0.12.0-py2.py3-none-any.whl
tensorflow-io-gcs-filesystem @ file:///scratch/env/opence1.5.1/conda-bld/tensorflow-io-gcs-filesystem_1643068858253/work/dist/tensorflow_io_gcs_filesystem-0.23.1-cp38-cp38-linux_ppc64le.whl
tensorflow-metadata==1.9.0
tensorflow-probability @ file:///home/conda/feedstock_root/build_artifacts/tensorflow-probability_1643225309959/work
Can you try installing TF version 2.5 and see if that resolves the issue?
Thanks, it seems like an issue with my environment at work. I appreciate it!!
@afolga Does this mean you were able to resolve the issue?
@afolga Just bumping to check whether your issue is resolved.
Inactivity for 2 months. Will close for now.