
fix colab notebook for tfx - downgrade apache-beam

Open mygithubid1 opened this issue 2 years ago • 1 comments

System information

  • Have I specified the code to reproduce the issue (Yes, No):

The code for this issue comes from the sample notebook in this document page.

  • Environment in which the code is executed (e.g., Local(Linux/MacOS/Windows), Interactive Notebook, Google Cloud, etc):

Google Colab

  • TensorFlow version:

2.8.2

  • TFX Version:

1.8.0

  • Python version:

3.7.13

  • Python dependencies (from pip freeze output):
absl-py==1.1.0
alabaster==0.7.12
albumentations==0.1.12
altair==4.2.0
apache-beam==2.40.0
appdirs==1.4.4
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
arviz==0.12.1
astor==0.8.1
astropy==4.3.1
astunparse==1.6.3
atari-py==0.2.9
atomicwrites==1.4.0
attrs==20.3.0
audioread==2.1.9
autograd==1.4
Babel==2.10.2
backcall==0.2.0
beautifulsoup4==4.6.3
bleach==5.0.0
blis==0.7.7
bokeh==2.3.3
branca==0.5.0
bs4==0.0.1
CacheControl==0.12.11
cached-property==1.5.2
cachetools==4.2.4
catalogue==2.0.7
certifi==2022.6.15
cffi==1.15.0
cftime==1.6.0
chardet==3.0.4
charset-normalizer==2.1.0
click==7.1.2
cloudpickle==2.1.0
cmake==3.22.5
cmdstanpy==0.9.5
colorcet==3.0.0
colorlover==0.3.0
community==1.0.0b1
contextlib2==0.5.5
convertdate==2.4.0
coverage==3.7.1
coveralls==0.5
crcmod==1.7
cufflinks==0.17.3
cupy-cuda111==9.4.0
cvxopt==1.2.7
cvxpy==1.0.31
cycler==0.11.0
cymem==2.0.6
Cython==0.29.30
daft==0.0.4
dask==2.12.0
datascience==0.10.6
debugpy==1.0.0
decorator==4.4.2
defusedxml==0.7.1
descartes==1.1.0
dill==0.3.1.1
distributed==1.25.3
dlib==19.18.0+zzzcolab20220513001918
dm-tree==0.1.7
docker==4.4.4
docopt==0.6.2
docutils==0.17.1
dopamine-rl==1.0.5
earthengine-api==0.1.315
easydict==1.9
ecos==2.0.10
editdistance==0.5.3
en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.3.0/en_core_web_sm-3.3.0-py3-none-any.whl
entrypoints==0.4
ephem==4.1.3
et-xmlfile==1.1.0
fa2==0.3.5
fastai==2.6.3
fastavro==1.5.2
fastcore==1.4.4
fastdownload==0.0.6
fastdtw==0.3.4
fasteners==0.17.3
fastjsonschema==2.15.3
fastprogress==1.0.2
fastrlock==0.8
fbprophet==0.7.1
feather-format==0.4.1
filelock==3.7.1
firebase-admin==4.4.0
fix-yahoo-finance==0.0.22
Flask==1.1.4
flatbuffers==2.0
folium==0.8.3
future==0.16.0
gast==0.5.3
GDAL==2.2.2
gdown==4.4.0
gensim==3.6.0
geographiclib==1.52
geopy==1.17.0
gin-config==0.5.0
glob2==0.7
google==2.0.3
google-api-core==1.31.6
google-api-python-client==1.12.11
google-apitools==0.5.31
google-auth==1.35.0
google-auth-httplib2==0.1.0
google-auth-oauthlib==0.4.6
google-cloud-aiplatform==1.15.0
google-cloud-bigquery==2.34.4
google-cloud-bigquery-storage==2.13.2
google-cloud-bigtable==1.7.2
google-cloud-core==1.7.2
google-cloud-datastore==1.8.0
google-cloud-dlp==3.7.1
google-cloud-firestore==1.7.0
google-cloud-language==1.3.2
google-cloud-pubsub==2.13.1
google-cloud-pubsublite==1.4.2
google-cloud-recommendations-ai==0.2.0
google-cloud-resource-manager==1.5.1
google-cloud-spanner==1.19.3
google-cloud-storage==2.2.1
google-cloud-translate==1.5.0
google-cloud-videointelligence==1.16.3
google-cloud-vision==1.0.2
google-colab @ file:///colabtools/dist/google-colab-1.0.0.tar.gz
google-crc32c==1.3.0
google-pasta==0.2.0
google-resumable-media==2.3.3
googleapis-common-protos==1.56.2
googledrivedownloader==0.4
graphviz==0.10.1
greenlet==1.1.2
grpc-google-iam-v1==0.12.4
grpcio==1.47.0
grpcio-gcp==0.2.2
grpcio-status==1.47.0
gspread==3.4.2
gspread-dataframe==3.0.8
gym==0.17.3
h5py==3.1.0
hdfs==2.7.0
HeapDict==1.0.1
hijri-converter==2.2.4
holidays==0.10.5.2
holoviews==1.14.9
html5lib==1.0.1
httpimport==0.5.18
httplib2==0.17.4
httplib2shim==0.0.3
humanize==0.5.1
hyperopt==0.1.2
ideep4py==2.0.0.post3
idna==2.10
imageio==2.4.1
imagesize==1.3.0
imbalanced-learn==0.8.1
imblearn==0.0
imgaug==0.2.9
importlib-metadata==4.11.4
importlib-resources==5.7.1
imutils==0.5.4
inflect==2.1.0
iniconfig==1.1.1
intel-openmp==2022.1.0
intervaltree==2.1.0
ipykernel==4.10.1
ipython==7.34.0
ipython-genutils==0.2.0
ipython-sql==0.3.9
ipywidgets==7.7.0
itsdangerous==1.1.0
jax==0.3.8
jaxlib @ https://storage.googleapis.com/jax-releases/cuda11/jaxlib-0.3.7+cuda11.cudnn805-cp37-none-manylinux2014_x86_64.whl
jedi==0.18.1
jieba==0.42.1
Jinja2==2.11.3
joblib==0.14.1
jpeg4py==0.1.4
jsonschema==4.3.3
jupyter==1.0.0
jupyter-client==5.3.5
jupyter-console==5.2.0
jupyter-core==4.10.0
jupyterlab-pygments==0.2.2
jupyterlab-widgets==1.1.0
kaggle==1.5.12
kapre==0.3.7
keras==2.8.0
Keras-Preprocessing==1.1.2
keras-tuner==1.1.2
keras-vis==0.4.1
kiwisolver==1.4.3
korean-lunar-calendar==0.2.1
kt-legacy==1.0.4
kubernetes==12.0.1
langcodes==3.3.0
libclang==14.0.1
librosa==0.8.1
lightgbm==2.2.3
llvmlite==0.34.0
lmdb==0.99
LunarCalendar==0.0.9
lxml==4.2.6
Markdown==3.3.7
MarkupSafe==2.0.1
matplotlib==3.2.2
matplotlib-inline==0.1.3
matplotlib-venn==0.11.7
missingno==0.5.1
mistune==0.8.4
mizani==0.6.0
mkl==2019.0
ml-metadata==1.8.0
ml-pipelines-sdk==1.8.0
mlxtend==0.14.0
more-itertools==8.13.0
moviepy==0.2.3.5
mpmath==1.2.1
msgpack==1.0.4
multiprocess==0.70.13
multitasking==0.0.10
murmurhash==1.0.7
music21==5.5.0
natsort==5.5.0
nbclient==0.6.6
nbconvert==5.6.1
nbformat==5.4.0
nest-asyncio==1.5.5
netCDF4==1.5.8
networkx==2.6.3
nibabel==3.0.2
nltk==3.7
notebook==5.3.1
numba==0.51.2
numexpr==2.8.1
numpy==1.21.6
oauth2client==4.1.3
oauthlib==3.2.0
okgrade==0.4.3
opencv-contrib-python==4.1.2.30
opencv-python==4.1.2.30
openpyxl==3.0.10
opt-einsum==3.3.0
orjson==3.7.7
osqp==0.6.2.post0
overrides==6.1.0
packaging==20.9
palettable==3.3.0
pandas==1.3.5
pandas-datareader==0.9.0
pandas-gbq==0.13.3
pandas-profiling==1.4.1
pandocfilters==1.5.0
panel==0.12.1
param==1.12.1
parso==0.8.3
pathlib==1.0.1
pathy==0.6.1
patsy==0.5.2
pep517==0.12.0
pexpect==4.8.0
pickleshare==0.7.5
Pillow==7.1.2
pip-tools==6.2.0
plotly==5.5.0
plotnine==0.6.0
pluggy==0.7.1
pooch==1.6.0
portpicker==1.3.9
prefetch-generator==1.0.1
preshed==3.0.6
prettytable==3.3.0
progressbar2==3.38.0
prometheus-client==0.14.1
promise==2.3
prompt-toolkit==3.0.30
proto-plus==1.20.6
protobuf==3.19.4
psutil==5.4.8
psycopg2==2.7.6.1
ptyprocess==0.7.0
py==1.11.0
pyarrow==5.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycocotools==2.0.4
pycparser==2.21
pyct==0.4.8
pydantic==1.8.2
pydata-google-auth==1.4.0
pydot==1.3.0
pydot-ng==2.0.0
pydotplus==2.0.2
PyDrive==1.3.1
pyemd==0.5.1
pyerfa==2.0.0.1
pyfarmhash==0.3.2
pyglet==1.5.0
Pygments==2.6.1
pygobject==3.26.1
pymc3==3.11.4
PyMeeus==0.5.11
pymongo==3.12.3
pymystem3==0.2.0
PyOpenGL==3.1.6
pyparsing==3.0.9
pyrsistent==0.18.1
pysndfile==1.3.8
PySocks==1.7.1
pystan==2.19.1.1
pytest==3.6.4
python-apt==0.0.0
python-chess==0.23.11
python-dateutil==2.8.2
python-louvain==0.16
python-slugify==6.1.2
python-utils==3.3.3
pytz==2022.1
pyviz-comms==2.2.0
PyWavelets==1.3.0
PyYAML==3.13
pyzmq==23.1.0
qdldl==0.1.5.post2
qtconsole==5.3.1
QtPy==2.1.0
regex==2022.6.2
requests==2.28.1
requests-oauthlib==1.3.1
resampy==0.2.2
rpy2==3.4.5
rsa==4.8
scikit-image==0.18.3
scikit-learn==1.0.2
scipy==1.4.1
screen-resolution-extra==0.0.0
scs==3.2.0
seaborn==0.11.2
semver==2.13.0
Send2Trash==1.8.0
setuptools-git==1.2
Shapely==1.8.2
simplegeneric==0.8.1
six==1.15.0
sklearn==0.0
sklearn-pandas==1.8.0
smart-open==5.2.1
snowballstemmer==2.2.0
sortedcontainers==2.4.0
SoundFile==0.10.3.post1
soupsieve==2.3.2.post1
spacy==3.3.1
spacy-legacy==3.0.9
spacy-loggers==1.0.2
Sphinx==1.8.6
sphinxcontrib-serializinghtml==1.1.5
sphinxcontrib-websupport==1.2.4
SQLAlchemy==1.4.37
sqlparse==0.4.2
srsly==2.4.3
statsmodels==0.10.2
sympy==1.7.1
tables==3.7.0
tabulate==0.8.9
tblib==1.7.0
tenacity==8.0.1
tensorboard==2.8.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.8.2+zzzcolab20220527125636
tensorflow-data-validation==1.8.0
tensorflow-datasets==4.0.1
tensorflow-estimator==2.8.0
tensorflow-gcs-config==2.8.0
tensorflow-hub==0.12.0
tensorflow-io-gcs-filesystem==0.26.0
tensorflow-metadata==1.8.0
tensorflow-model-analysis==0.39.0
tensorflow-probability==0.16.0
tensorflow-serving-api==2.8.2
tensorflow-transform==1.8.0
termcolor==1.1.0
terminado==0.13.3
testpath==0.6.0
text-unidecode==1.3
textblob==0.15.3
tfx==1.8.0
tfx-bsl==1.8.0
Theano-PyMC==1.1.2
thinc==8.0.17
threadpoolctl==3.1.0
tifffile==2021.11.2
tinycss2==1.1.1
tomli==2.0.1
toolz==0.11.2
torch @ https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp37-cp37m-linux_x86_64.whl
torchaudio @ https://download.pytorch.org/whl/cu113/torchaudio-0.11.0%2Bcu113-cp37-cp37m-linux_x86_64.whl
torchsummary==1.5.1
torchtext==0.12.0
torchvision @ https://download.pytorch.org/whl/cu113/torchvision-0.12.0%2Bcu113-cp37-cp37m-linux_x86_64.whl
tornado==5.1.1
tqdm==4.64.0
traitlets==5.1.1
tweepy==3.10.0
typeguard==2.7.1
typer==0.4.1
typing-extensions==4.1.1
typing-utils==0.1.0
tzlocal==1.5.1
uritemplate==3.0.1
urllib3==1.24.3
vega-datasets==0.9.0
wasabi==0.9.1
wcwidth==0.2.5
webencodings==0.5.1
websocket-client==1.3.3
Werkzeug==1.0.1
widgetsnbextension==3.6.0
wordcloud==1.5.0
wrapt==1.14.1
xarray==0.20.2
xarray-einstats==0.2.2
xgboost==0.90
xkit==0.0.0
xlrd==1.1.0
xlwt==1.3.0
yellowbrick==1.4
zict==2.2.0
zipp==3.8.0

Describe the current behavior

Running the statistics generator currently produces an error. Generating statistics should work without generating error(s).

Describe the expected behavior

No error should be generated, and statistics should be returned when the context runs the statistics generator.

Standalone code to reproduce the issue

statistics_gen = tfx.components.StatisticsGen(
    examples=example_gen.outputs['examples'])
context.run(statistics_gen, enable_cache=True)

Providing a bare minimum test case or step(s) to reproduce the problem will greatly help us to debug the issue. If possible, please share a link to Colab/Jupyter/any notebook.

Notebook: https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/tfx/components_keras.ipynb

Name of your Organization (Optional)

Other info / logs

Please install this version to fix this issue: !pip install -U apache-beam==2.39.0
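The workaround above can be sketched as a small notebook helper. This is only an illustration, not part of the original report: the helper names (`installed_version`, `pin_command`, `ensure_beam_pin`) are hypothetical, and it assumes that 2.40.0 (the version in the pip freeze output) is the affected apache-beam release and that 2.39.0 is the version to pin.

```python
import subprocess
import sys

try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Backport package, needed on Python 3.7 (listed in the pip freeze output).
    from importlib_metadata import PackageNotFoundError, version

# Assumption: versions taken from this report, not from any official advisory.
AFFECTED_BEAM = "2.40.0"
PINNED_BEAM = "2.39.0"


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def pin_command(package, target):
    """Build the pip command that installs exactly `target` of `package`."""
    return [sys.executable, "-m", "pip", "install", "-U",
            "{}=={}".format(package, target)]


def ensure_beam_pin(run=subprocess.check_call):
    """Downgrade apache-beam only if the assumed-affected version is installed.

    Returns True when a downgrade command was issued, False otherwise.
    """
    if installed_version("apache-beam") == AFFECTED_BEAM:
        run(pin_command("apache-beam", PINNED_BEAM))
        return True
    return False
```

Calling `ensure_beam_pin()` in a cell before the TFX imports would issue the same pip command as the one-liner above, but only when 2.40.0 is present. If apache-beam was already imported in the session, the runtime may need a restart for the downgrade to take effect.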

Here's a related ticket: https://github.com/tensorflow/data-validation/issues/217

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

mygithubid1, Jul 08 '22 16:07

Thank you so much for creating this issue! The issue was actually resolved in the latest versions of tensorflow-data-validation and tfx (1.9.0), which were released a few days ago. I've confirmed that the Colab notebook works well, because it always installs the latest version.

jiyongjung0, Jul 21 '22 04:07

@mygithubid1,

Can you please take a look at the above comment by @jiyongjung0 and let us know whether your issue is resolved? Thank you!

singhniraj08, Aug 18 '22 07:08

Things work now. Thanks.

mygithubid1, Aug 18 '22 10:08
