tfx
tfx copied to clipboard
fix colab notebook for tfx - downgrade apache-beam
System information
- Have I specified the code to reproduce the issue (Yes, No):
The code for this issue comes from the sample notebook in this document page.
- Environment in which the code is executed (e.g., Local(Linux/MacOS/Windows), Interactive Notebook, Google Cloud, etc):
Google colab
- TensorFlow version:
2.8.2
- TFX Version:
1.8.0
- Python version:
3.7.13
- Python dependencies (from
pip freeze
output):
absl-py==1.1.0
alabaster==0.7.12
albumentations==0.1.12
altair==4.2.0
apache-beam==2.40.0
appdirs==1.4.4
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
arviz==0.12.1
astor==0.8.1
astropy==4.3.1
astunparse==1.6.3
atari-py==0.2.9
atomicwrites==1.4.0
attrs==20.3.0
audioread==2.1.9
autograd==1.4
Babel==2.10.2
backcall==0.2.0
beautifulsoup4==4.6.3
bleach==5.0.0
blis==0.7.7
bokeh==2.3.3
branca==0.5.0
bs4==0.0.1
CacheControl==0.12.11
cached-property==1.5.2
cachetools==4.2.4
catalogue==2.0.7
certifi==2022.6.15
cffi==1.15.0
cftime==1.6.0
chardet==3.0.4
charset-normalizer==2.1.0
click==7.1.2
cloudpickle==2.1.0
cmake==3.22.5
cmdstanpy==0.9.5
colorcet==3.0.0
colorlover==0.3.0
community==1.0.0b1
contextlib2==0.5.5
convertdate==2.4.0
coverage==3.7.1
coveralls==0.5
crcmod==1.7
cufflinks==0.17.3
cupy-cuda111==9.4.0
cvxopt==1.2.7
cvxpy==1.0.31
cycler==0.11.0
cymem==2.0.6
Cython==0.29.30
daft==0.0.4
dask==2.12.0
datascience==0.10.6
debugpy==1.0.0
decorator==4.4.2
defusedxml==0.7.1
descartes==1.1.0
dill==0.3.1.1
distributed==1.25.3
dlib==19.18.0+zzzcolab20220513001918
dm-tree==0.1.7
docker==4.4.4
docopt==0.6.2
docutils==0.17.1
dopamine-rl==1.0.5
earthengine-api==0.1.315
easydict==1.9
ecos==2.0.10
editdistance==0.5.3
en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.3.0/en_core_web_sm-3.3.0-py3-none-any.whl
entrypoints==0.4
ephem==4.1.3
et-xmlfile==1.1.0
fa2==0.3.5
fastai==2.6.3
fastavro==1.5.2
fastcore==1.4.4
fastdownload==0.0.6
fastdtw==0.3.4
fasteners==0.17.3
fastjsonschema==2.15.3
fastprogress==1.0.2
fastrlock==0.8
fbprophet==0.7.1
feather-format==0.4.1
filelock==3.7.1
firebase-admin==4.4.0
fix-yahoo-finance==0.0.22
Flask==1.1.4
flatbuffers==2.0
folium==0.8.3
future==0.16.0
gast==0.5.3
GDAL==2.2.2
gdown==4.4.0
gensim==3.6.0
geographiclib==1.52
geopy==1.17.0
gin-config==0.5.0
glob2==0.7
google==2.0.3
google-api-core==1.31.6
google-api-python-client==1.12.11
google-apitools==0.5.31
google-auth==1.35.0
google-auth-httplib2==0.1.0
google-auth-oauthlib==0.4.6
google-cloud-aiplatform==1.15.0
google-cloud-bigquery==2.34.4
google-cloud-bigquery-storage==2.13.2
google-cloud-bigtable==1.7.2
google-cloud-core==1.7.2
google-cloud-datastore==1.8.0
google-cloud-dlp==3.7.1
google-cloud-firestore==1.7.0
google-cloud-language==1.3.2
google-cloud-pubsub==2.13.1
google-cloud-pubsublite==1.4.2
google-cloud-recommendations-ai==0.2.0
google-cloud-resource-manager==1.5.1
google-cloud-spanner==1.19.3
google-cloud-storage==2.2.1
google-cloud-translate==1.5.0
google-cloud-videointelligence==1.16.3
google-cloud-vision==1.0.2
google-colab @ file:///colabtools/dist/google-colab-1.0.0.tar.gz
google-crc32c==1.3.0
google-pasta==0.2.0
google-resumable-media==2.3.3
googleapis-common-protos==1.56.2
googledrivedownloader==0.4
graphviz==0.10.1
greenlet==1.1.2
grpc-google-iam-v1==0.12.4
grpcio==1.47.0
grpcio-gcp==0.2.2
grpcio-status==1.47.0
gspread==3.4.2
gspread-dataframe==3.0.8
gym==0.17.3
h5py==3.1.0
hdfs==2.7.0
HeapDict==1.0.1
hijri-converter==2.2.4
holidays==0.10.5.2
holoviews==1.14.9
html5lib==1.0.1
httpimport==0.5.18
httplib2==0.17.4
httplib2shim==0.0.3
humanize==0.5.1
hyperopt==0.1.2
ideep4py==2.0.0.post3
idna==2.10
imageio==2.4.1
imagesize==1.3.0
imbalanced-learn==0.8.1
imblearn==0.0
imgaug==0.2.9
importlib-metadata==4.11.4
importlib-resources==5.7.1
imutils==0.5.4
inflect==2.1.0
iniconfig==1.1.1
intel-openmp==2022.1.0
intervaltree==2.1.0
ipykernel==4.10.1
ipython==7.34.0
ipython-genutils==0.2.0
ipython-sql==0.3.9
ipywidgets==7.7.0
itsdangerous==1.1.0
jax==0.3.8
jaxlib @ https://storage.googleapis.com/jax-releases/cuda11/jaxlib-0.3.7+cuda11.cudnn805-cp37-none-manylinux2014_x86_64.whl
jedi==0.18.1
jieba==0.42.1
Jinja2==2.11.3
joblib==0.14.1
jpeg4py==0.1.4
jsonschema==4.3.3
jupyter==1.0.0
jupyter-client==5.3.5
jupyter-console==5.2.0
jupyter-core==4.10.0
jupyterlab-pygments==0.2.2
jupyterlab-widgets==1.1.0
kaggle==1.5.12
kapre==0.3.7
keras==2.8.0
Keras-Preprocessing==1.1.2
keras-tuner==1.1.2
keras-vis==0.4.1
kiwisolver==1.4.3
korean-lunar-calendar==0.2.1
kt-legacy==1.0.4
kubernetes==12.0.1
langcodes==3.3.0
libclang==14.0.1
librosa==0.8.1
lightgbm==2.2.3
llvmlite==0.34.0
lmdb==0.99
LunarCalendar==0.0.9
lxml==4.2.6
Markdown==3.3.7
MarkupSafe==2.0.1
matplotlib==3.2.2
matplotlib-inline==0.1.3
matplotlib-venn==0.11.7
missingno==0.5.1
mistune==0.8.4
mizani==0.6.0
mkl==2019.0
ml-metadata==1.8.0
ml-pipelines-sdk==1.8.0
mlxtend==0.14.0
more-itertools==8.13.0
moviepy==0.2.3.5
mpmath==1.2.1
msgpack==1.0.4
multiprocess==0.70.13
multitasking==0.0.10
murmurhash==1.0.7
music21==5.5.0
natsort==5.5.0
nbclient==0.6.6
nbconvert==5.6.1
nbformat==5.4.0
nest-asyncio==1.5.5
netCDF4==1.5.8
networkx==2.6.3
nibabel==3.0.2
nltk==3.7
notebook==5.3.1
numba==0.51.2
numexpr==2.8.1
numpy==1.21.6
oauth2client==4.1.3
oauthlib==3.2.0
okgrade==0.4.3
opencv-contrib-python==4.1.2.30
opencv-python==4.1.2.30
openpyxl==3.0.10
opt-einsum==3.3.0
orjson==3.7.7
osqp==0.6.2.post0
overrides==6.1.0
packaging==20.9
palettable==3.3.0
pandas==1.3.5
pandas-datareader==0.9.0
pandas-gbq==0.13.3
pandas-profiling==1.4.1
pandocfilters==1.5.0
panel==0.12.1
param==1.12.1
parso==0.8.3
pathlib==1.0.1
pathy==0.6.1
patsy==0.5.2
pep517==0.12.0
pexpect==4.8.0
pickleshare==0.7.5
Pillow==7.1.2
pip-tools==6.2.0
plotly==5.5.0
plotnine==0.6.0
pluggy==0.7.1
pooch==1.6.0
portpicker==1.3.9
prefetch-generator==1.0.1
preshed==3.0.6
prettytable==3.3.0
progressbar2==3.38.0
prometheus-client==0.14.1
promise==2.3
prompt-toolkit==3.0.30
proto-plus==1.20.6
protobuf==3.19.4
psutil==5.4.8
psycopg2==2.7.6.1
ptyprocess==0.7.0
py==1.11.0
pyarrow==5.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycocotools==2.0.4
pycparser==2.21
pyct==0.4.8
pydantic==1.8.2
pydata-google-auth==1.4.0
pydot==1.3.0
pydot-ng==2.0.0
pydotplus==2.0.2
PyDrive==1.3.1
pyemd==0.5.1
pyerfa==2.0.0.1
pyfarmhash==0.3.2
pyglet==1.5.0
Pygments==2.6.1
pygobject==3.26.1
pymc3==3.11.4
PyMeeus==0.5.11
pymongo==3.12.3
pymystem3==0.2.0
PyOpenGL==3.1.6
pyparsing==3.0.9
pyrsistent==0.18.1
pysndfile==1.3.8
PySocks==1.7.1
pystan==2.19.1.1
pytest==3.6.4
python-apt==0.0.0
python-chess==0.23.11
python-dateutil==2.8.2
python-louvain==0.16
python-slugify==6.1.2
python-utils==3.3.3
pytz==2022.1
pyviz-comms==2.2.0
PyWavelets==1.3.0
PyYAML==3.13
pyzmq==23.1.0
qdldl==0.1.5.post2
qtconsole==5.3.1
QtPy==2.1.0
regex==2022.6.2
requests==2.28.1
requests-oauthlib==1.3.1
resampy==0.2.2
rpy2==3.4.5
rsa==4.8
scikit-image==0.18.3
scikit-learn==1.0.2
scipy==1.4.1
screen-resolution-extra==0.0.0
scs==3.2.0
seaborn==0.11.2
semver==2.13.0
Send2Trash==1.8.0
setuptools-git==1.2
Shapely==1.8.2
simplegeneric==0.8.1
six==1.15.0
sklearn==0.0
sklearn-pandas==1.8.0
smart-open==5.2.1
snowballstemmer==2.2.0
sortedcontainers==2.4.0
SoundFile==0.10.3.post1
soupsieve==2.3.2.post1
spacy==3.3.1
spacy-legacy==3.0.9
spacy-loggers==1.0.2
Sphinx==1.8.6
sphinxcontrib-serializinghtml==1.1.5
sphinxcontrib-websupport==1.2.4
SQLAlchemy==1.4.37
sqlparse==0.4.2
srsly==2.4.3
statsmodels==0.10.2
sympy==1.7.1
tables==3.7.0
tabulate==0.8.9
tblib==1.7.0
tenacity==8.0.1
tensorboard==2.8.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.8.2+zzzcolab20220527125636
tensorflow-data-validation==1.8.0
tensorflow-datasets==4.0.1
tensorflow-estimator==2.8.0
tensorflow-gcs-config==2.8.0
tensorflow-hub==0.12.0
tensorflow-io-gcs-filesystem==0.26.0
tensorflow-metadata==1.8.0
tensorflow-model-analysis==0.39.0
tensorflow-probability==0.16.0
tensorflow-serving-api==2.8.2
tensorflow-transform==1.8.0
termcolor==1.1.0
terminado==0.13.3
testpath==0.6.0
text-unidecode==1.3
textblob==0.15.3
tfx==1.8.0
tfx-bsl==1.8.0
Theano-PyMC==1.1.2
thinc==8.0.17
threadpoolctl==3.1.0
tifffile==2021.11.2
tinycss2==1.1.1
tomli==2.0.1
toolz==0.11.2
torch @ https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp37-cp37m-linux_x86_64.whl
torchaudio @ https://download.pytorch.org/whl/cu113/torchaudio-0.11.0%2Bcu113-cp37-cp37m-linux_x86_64.whl
torchsummary==1.5.1
torchtext==0.12.0
torchvision @ https://download.pytorch.org/whl/cu113/torchvision-0.12.0%2Bcu113-cp37-cp37m-linux_x86_64.whl
tornado==5.1.1
tqdm==4.64.0
traitlets==5.1.1
tweepy==3.10.0
typeguard==2.7.1
typer==0.4.1
typing-extensions==4.1.1
typing-utils==0.1.0
tzlocal==1.5.1
uritemplate==3.0.1
urllib3==1.24.3
vega-datasets==0.9.0
wasabi==0.9.1
wcwidth==0.2.5
webencodings==0.5.1
websocket-client==1.3.3
Werkzeug==1.0.1
widgetsnbextension==3.6.0
wordcloud==1.5.0
wrapt==1.14.1
xarray==0.20.2
xarray-einstats==0.2.2
xgboost==0.90
xkit==0.0.0
xlrd==1.1.0
xlwt==1.3.0
yellowbrick==1.4
zict==2.2.0
zipp==3.8.0
Describe the current behavior
Genrating statistics should work without generating error(s).
Describe the expected behavior
No errror should be generated and statistics should be returned when context runs statistics generator.
Standalone code to reproduce the issue
statistics_gen = tfx.components.StatisticsGen(
examples=example_gen.outputs['examples'])
context.run(statistics_gen, enable_cache=True)
Providing a bare minimum test case or step(s) to reproduce the problem will greatly help us to debug the issue. If possible, please share a link to Colab/Jupyter/any notebook.
Notebook: https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/tfx/components_keras.ipynb
Name of your Organization (Optional)
Other info / logs
Please install this version to fix this issue: !pip install -U apache-beam==2.39.0
Here's a related ticket: https://github.com/tensorflow/data-validation/issues/217
Include any logs or source code that would be helpful to diagnose the problem.
If including tracebacks, please include the full traceback. Large logs and files
should be attached.
Thank you so much for creating this issue! Actually the issue was resolved in the latest version of tensorflow-data-validation and tfx(1.9.0) that released a few days ago. I've confirmed that colab works well because it always installs the latest version.
@mygithubid1,
Can you please take a look at the above comment by @jiyongjung0 and let us know if your issue is resolved. Thank you!
Things work now. Thanks.