Tensorflow error when trying to use labeler

Open afolga opened this issue 3 years ago • 9 comments

General Information:

  • OS: Windows 10 Enterprise
  • Python version: 3.9.12
  • Library version: downloaded 2 weeks ago

Describe the bug: I am getting a TensorFlow error when trying to use the labeler on a CSV file. This is the code:

    import dataprofiler as dp

    # load data and data labeler
    data = dp.Data("DATASET FOR SPARK 02022022.csv")
    data_labeler = dp.DataLabeler(labeler_type='structured')

    # make predictions and get labels per cell
    predictions = data_labeler.predict(data)

To Reproduce: I tried to just generate a report, and get this error:

/dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/profilers/profile_builder.py:595: RuntimeWarning:

!!! WARNING Partial Profiler Failure !!!

Profiling Type: data_labeler
Exception: TypeError
Message: bases must be types

For labeler errors, try installing the extra ml requirements via:

$ pip install dataprofiler[ml] --user

utils.warn_on_profile('data_labeler', e)

I tried to uninstall and reinstall the profiler, but that did not work. I also ran the command mentioned in the error message, and it reported that the requirement was already installed.

Expected behavior:

Screenshots: I can't attach screenshots, but this is the main error I get:

    TypeError                                 Traceback (most recent call last)
    /tmp/ipykernel_2214706/1459032405.py in <module>
          3 # load data and data labeler
          4 data = dp.Data("DATASET FOR SPARK 02022022.csv")
    ----> 5 data_labeler = dp.DataLabeler(labeler_type='structured')
          6
          7 # make predictions and get labels per cell

    /dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/data_labelers.py in __new__(cls, labeler_type, dirpath, load_options, trainable)
         92 data_labeler._default_model_loc)
         93 return TrainableDataLabeler(dirpath, load_options)
    ---> 94 return data_labeler(dirpath, load_options)
         95
         96 @classmethod

    /dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/base_data_labeler.py in __init__(self, dirpath, load_options)
         45 dirpath = os.path.join(default_labeler_dir,
         46 self._default_model_loc)
    ---> 47 self._load_data_labeler(dirpath, load_options)
         48
         49 def __eq__(self, other):

    /dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/base_data_labeler.py in _load_data_labeler(self, dirpath, load_options)
        537
        538 # setup data labeler based on parameters
    --> 539 self._load_model(model_params.get('class'), dirpath)
        540 self._load_preprocessor(preprocessor_params.get('class'), dirpath)
        541 self._load_postprocessor(postprocessor_params.get('class'), dirpath)

    /dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/base_data_labeler.py in _load_model(self, model_class, dirpath)
        471 """
        472 if isinstance(model_class, str):
    --> 473 model_class = BaseModel.get_class(model_class)
        474 if not model_class:
        475 raise ValueError('model_class, {}, was not set in load_options '

    /dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/base_model.py in get_class(cls, class_name)
        121 # Import possible internal models
        122 from .regex_model import RegexModel
    --> 123 from .character_level_cnn_model import CharacterLevelCnnModel
        124
        125 return cls._BaseModel__subclasses.get(class_name.lower(), None)

    /dsdata/miniconda3/lib/python3.8/site-packages/dataprofiler/labelers/character_level_cnn_model.py in <module>
          7 from collections import defaultdict
          8
    ----> 9 import tensorflow as tf
         10 import numpy as np
         11 from sklearn import decomposition

    /dsdata/miniconda3/lib/python3.8/site-packages/tensorflow/__init__.py in <module>
         39 import sys as _sys
         40
    ---> 41 from tensorflow.python.tools import module_util as _module_util
         42 from tensorflow.python.util.lazy_loader import LazyLoader as _LazyLoader
         43

    /dsdata/miniconda3/lib/python3.8/site-packages/tensorflow/python/__init__.py in <module>
         39
         40 from tensorflow.python import pywrap_tensorflow as _pywrap_tensorflow
    ---> 41 from tensorflow.python.eager import context
         42
         43 # pylint: enable=wildcard-import

    /dsdata/miniconda3/lib/python3.8/site-packages/tensorflow/python/eager/context.py in <module>
         31 import six
         32
    ---> 33 from tensorflow.core.framework import function_pb2
         34 from tensorflow.core.protobuf import config_pb2
         35 from tensorflow.core.protobuf import rewriter_config_pb2

    /dsdata/miniconda3/lib/python3.8/site-packages/tensorflow/core/framework/function_pb2.py in <module>
          3 # source: tensorflow/core/framework/function.proto
          4 """Generated protocol buffer code."""
    ----> 5 from google.protobuf import descriptor as _descriptor
          6 from google.protobuf import message as _message
          7 from google.protobuf import reflection as _reflection

    /dsdata/miniconda3/lib/python3.8/site-packages/google/protobuf/descriptor.py in <module>
         45 import binascii
         46 import os
    ---> 47 from google.protobuf.pyext import _message
         48 _USE_C_DESCRIPTORS = True
         49

    TypeError: bases must be types
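The last frame of the traceback is inside google.protobuf rather than dataprofiler, so one way to narrow this down might be to import each module in the chain on its own, innermost first. The following is only a rough sketch: the helper name first_failing_import is made up for illustration, and the module list is taken from the frames in the traceback.

```python
import importlib


def first_failing_import(module_names):
    """Import each module in order; return (name, exception) for the
    first one that fails, or None if every import succeeds.

    Hypothetical helper for illustration, not part of dataprofiler.
    """
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:
            # Broad on purpose: the failure reported here is a TypeError,
            # not an ImportError.
            return name, exc
    return None


# Innermost module from the traceback first, then the packages that import it.
chain = ["google.protobuf.descriptor", "tensorflow", "dataprofiler"]
result = first_failing_import(chain)
if result is None:
    print("all imports succeeded")
else:
    failed_name, error = result
    print(f"{failed_name} failed: {type(error).__name__}: {error}")
```

If the first entry already fails, the problem most likely sits in the protobuf installation of that environment rather than in the labeler code itself.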

Additional context:

I was just running the labeler and the profiler on CSV files when this happened today. My coworker downloaded the package today and is receiving the same errors. My email is [email protected]. Any insight or help would be greatly appreciated. Thank you!

afolga • Jun 22 '22 20:06

@afolga Do you know what version of TF was installed when you ran the package? Thanks!

JGSweets • Jun 22 '22 20:06

Is your coworker also using Windows?

JGSweets • Jun 22 '22 20:06

Excuse my stupidity, but what is TF? And yes, he is.

afolga • Jun 22 '22 20:06

No worries, TF is TensorFlow. Running pip freeze should give you a readout of which version of TF is installed. If you are using a notebook: !pip freeze
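If the full pip freeze output is too noisy, a small sketch like the one below might be easier to read. It uses the standard library's importlib.metadata (Python 3.8+) to look up versions without importing the packages, so it should still work while import tensorflow itself is failing. The helper name installed_versions is hypothetical, and including protobuf in the list is an assumption based on where the traceback ends.

```python
from importlib import metadata


def installed_versions(dist_names):
    """Map each distribution name to its installed version string,
    or None if the distribution is not installed.

    Hypothetical helper for illustration only.
    """
    versions = {}
    for name in dist_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Reads package metadata only; none of these packages are imported.
for name, version in installed_versions(
    ["tensorflow", "protobuf", "DataProfiler"]
).items():
    print(f"{name}: {version or 'not installed'}")
```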

JGSweets • Jun 22 '22 21:06

Ok, thanks. When I run it, I get this for the TensorFlow entries:

    tensorflow @ file:///scratch/env/opence1.5.1/conda-bld/tensorflow-base_1642721082345/work/tensorflow_pkg/tensorflow-2.7.0-cp38-cp38-linux_ppc64le.whl
    tensorflow-datasets==4.4.0
    tensorflow-estimator @ file:///scratch/env/opence1.5.1/conda-bld/tensorflow-estimator_1642799856821/work/tensorflow_estimator_pkg/tensorflow_estimator-2.7.0-py2.py3-none-any.whl
    tensorflow-hub @ file:///scratch/env/opence1.5.1/conda-bld/tensorflow-hub_1643062105665/work/tensorflow_hub_pkg/tensorflow_hub-0.12.0-py2.py3-none-any.whl
    tensorflow-io-gcs-filesystem @ file:///scratch/env/opence1.5.1/conda-bld/tensorflow-io-gcs-filesystem_1643068858253/work/dist/tensorflow_io_gcs_filesystem-0.23.1-cp38-cp38-linux_ppc64le.whl
    tensorflow-metadata==1.9.0
    tensorflow-probability @ file:///home/conda/feedstock_root/build_artifacts/tensorflow-probability_1643225309959/work

afolga • Jun 22 '22 21:06

Can you try installing TF version 2.5 and see if that resolves the issue?

JGSweets • Jun 22 '22 21:06

Thanks, it seems like an issue with my environment at work. I appreciate it!!

afolga • Jun 23 '22 14:06

@afolga Does this mean you were able to resolve the issue?

JGSweets • Jun 23 '22 14:06

@afolga Just bumping this to check whether your issue is resolved.

taylorfturner • Aug 11 '22 12:08

Inactivity for 2 months. Will close for now.

JGSweets • Aug 22 '22 19:08