
TFMA not rendering in JupyterLab

Open ConverJens opened this issue 4 years ago • 75 comments

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow Model Analysis): Yes, minor
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04 and image python:3.7-slim
  • TensorFlow Model Analysis installed from (source or binary): pip install
  • TensorFlow Model Analysis version (use command below): 0.27.0
  • Python version: 3.7
  • Jupyter Notebook version: jupyterlab 2.2.9
  • Exact command to reproduce: see sample notebook

Describe the problem

tfma.view.render_slicing_metrics shows no output.

Source code / logs

Slim docker image to reproduce the issue:

FROM python:3.7-slim

ENV DEBIAN_FRONTEND=noninteractive

# This is used because our k8s cluster can only access our internal pypi
#COPY pip.conf /etc/pip.conf

# TFMA is installed in the notebook because pip complained otherwise
RUN python3.7 -m pip install --no-cache-dir jupyterlab==2.2.9

# Install Node (for jupyter lab extensions)
RUN apt update && \
    apt -y install nano curl dirmngr apt-transport-https lsb-release ca-certificates && \
    curl -L https://deb.nodesource.com/setup_15.x | bash - && \
    apt update && apt install -y nodejs && \
    node -v

RUN jupyter labextension install tensorflow_model_analysis@0.27.0 && \
    jupyter labextension install @jupyter-widgets/jupyterlab-manager@2

RUN jupyter lab build

ENV DEBIAN_FRONTEND=

ENV NB_PREFIX /
ENV SHELL=/bin/bash

# Standard KubeFlow jupyter startup command
CMD ["/bin/bash","-c", "jupyter lab --notebook-dir=/home/jovyan --ip=0.0.0.0 --no-browser --allow-root --port=8888 --NotebookApp.token='' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=${NB_PREFIX}"]

Below are an evaluation artifact from a small TFX pipeline and a minimal notebook to reproduce the issue. The notebook unzips the artifact, installs TFMA, loads the eval result and tries to display it. Attachments: 3053.zip, tfma-render-issue.ipynb.zip

Edit: I've also run the above with node 12 instead of 15 with the exact same result.

ConverJens avatar Feb 19 '21 13:02 ConverJens

cc: @atn832

fhuanming avatar Feb 19 '21 19:02 fhuanming

Thanks for sharing the steps. I haven't tried reproducing yet, but judging from the error you posted at https://github.com/tensorflow/model-analysis/issues/56#issuecomment-780372449, it seems like the TFMA extension is downloading vulcanized_tfma.js from the wrong URL. As a result, instead of actual Javascript, it gets some kind of markup, probably something like:

<!doctype html>
<html>
  ...
  file not found
  ...
</html>

It reads the first < and returns this error:

Uncaught SyntaxError: Unexpected token '<' vulcanized_tfma.js:1

In the Network tab of the Chrome debugger tool, can you check the url of the request for vulcanized_tfma.js and share it here?
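
The failure mode described above can be checked without a browser by looking at the first characters of whatever the server returned for vulcanized_tfma.js. The following is a minimal sketch, not part of TFMA; the function names are made up for illustration, and the check only looks for an HTML prologue at the start of the body.

```python
def looks_like_html(body: str) -> bool:
    """Return True if a response body starts like an HTML page, not a script."""
    head = body.lstrip().lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


def explain_script_response(body: str) -> str:
    """Give a one-line diagnosis for the body returned for vulcanized_tfma.js."""
    if looks_like_html(body):
        # The browser would fail on the leading '<' with "Unexpected token '<'".
        return "HTML page returned: the URL is probably wrong or being rewritten"
    return "body does not look like HTML: likely the real JavaScript bundle"
```

Pasting the text from the Response tab into `explain_script_response` gives the same answer as reading it by eye, which can help when the body is a long minified page.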

atn832 avatar Feb 22 '21 08:02 atn832

@atn832 If I understand correctly, the url is: :31380/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js

Is TFMA trying to download anything? We are running in an on-prem cluster with no external access so if TFMA needs external access that would be an issue.

ConverJens avatar Feb 22 '21 08:02 ConverJens

I'm getting the same issue in https://github.com/kubeflow/pipelines/issues/5194. We use the technique from https://github.com/tensorflow/model-analysis/issues/10#issuecomment-595618035 to render TFMA as HTML and then embed that HTML elsewhere in an iframe.

The problem is similar to this reported issue that vulcanized_tfma.js doesn't load. I verified that TFMA 0.26.0 still works for us, so the regression happens between the two versions.

Bobgy avatar Feb 25 '21 11:02 Bobgy

@Bobgy Is the trick with embedding the html required? This is not something I usually use, for instance when rendering stats from TFDV.

ConverJens avatar Feb 25 '21 11:02 ConverJens

My use case

I generate the HTML in a pipeline step, and the pipeline UI shows the HTML in an iframe. But that's not related to this issue.

Bobgy avatar Feb 26 '21 09:02 Bobgy

Hi @ConverJens, besides what @atn832 asked, I have run the Dockerfile with the notebook and data that you shared above (using docker run -p 8888:8888 {image_name}). I can successfully load the TFMA UI in JupyterLab.

In terms of

Is TFMA trying to download anything?

I don't know the answer. Maybe @atn832 can say. I only know that running jupyter labextension install tensorflow_model_analysis@0.27.0 will download the TFMA JS packages from NPM.

fhuanming avatar Mar 01 '21 18:03 fhuanming

Upon running render_slicing_metrics in a cell, Chrome will download vulcanized_tfma.js. This is hosted by Jupyter Lab once you install the TFMA extension, so it should work even without internet.

image

@ConverJens, can you share what you see in the Response tab for the vulcanized_tfma.js request? I expect it'll show some HTML with an error message that might tell us what is wrong.

On my machine, where it worked, I can see a bunch of JavaScript comments followed by JavaScript code like this: image

atn832 avatar Mar 02 '21 04:03 atn832

@atn832 You are completely right! The response I get is basically "Please enable JavaScript to view this website.", which is very weird since my Chrome settings say JavaScript is allowed for all sites, which is the recommended setting. I'm using the latest version of Chrome (88). Full response below. I get the same response when using Safari.

Any idea how to proceed?

@Bobgy You mentioned that you had this issue in KubeFlow as well. Is there some interaction within KF that leads the browser to believe that JavaScript is disabled, or that may otherwise be causing this issue?

<!doctype html><html lang="en"><head><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width,minimum-scale=1,initial-scale=1,user-scalable=yes"><meta name="description" content="Kubeflow Central Dashboard"><meta name="theme-color" content="#3f51b5"><title>Kubeflow Central Dashboard</title><link rel="shortcut icon" href="/assets/favicon.ico"><link rel="icon" href="/assets/favicon-32x32.png" sizes="32x32"><link rel="icon" href="/assets/favicon-57x57.png" sizes="57x57"><link rel="icon" href="/assets/favicon-76x76.png" sizes="76x76"><link rel="icon" href="/assets/favicon-96x96.png" sizes="96x96"><link rel="icon" href="/assets/favicon-128x128.png" sizes="128x128"><link rel="icon" href="/assets/favicon-192x192.png" sizes="192x192"><link rel="apple-touch-icon" href="/assets/favicon-152x152.png" sizes="152x152"><link rel="apple-touch-icon" href="/assets/favicon-180x180.png" sizes="180x180"><base href="/"><script src="webcomponentsjs/webcomponents-loader.js"></script><script src="webcomponentsjs/custom-elements-es5-adapter.js"></script><link href="app.css" rel="stylesheet"></head><body><main-page></main-page><noscript>Please enable JavaScript to view this website.</noscript><script src="vendor.bundle.js" defer="defer"></script><script src="app.bundle.js" defer="defer"></script><script src="dashboard_lib.bundle.js" defer="defer"></script></body></html>
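
When the body returned for a script request turns out to be HTML like the response above, the page title usually says which service actually answered. A small sketch using only the Python standard library follows; the class and function names are illustrative and not part of TFMA or KubeFlow.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is recorded; later ones are ignored.
        if tag == "title" and self.title is None:
            self._in_title = True
            self.title = ""

    def handle_data(self, data):
        if self._in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def page_title(body: str):
    """Return the stripped <title> text of an HTML body, or None if absent."""
    parser = TitleParser()
    parser.feed(body)
    parser.close()
    return parser.title.strip() if parser.title is not None else None
```

For the response above this reports the dashboard's own title, which suggests the request was answered by the KubeFlow dashboard rather than by the notebook server.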

ConverJens avatar Mar 02 '21 09:03 ConverJens

@fhuanming @Bobgy Note that rendering TFMA works for me locally as well when running the image and data I supplied, but the exact same rendering fails in my KubeFlow hosted notebook. I don't know if this issue is caused by something in KubeFlow or by the fact that our k8s cluster has no external access.

@atn832 Do you know if TFMA needs to be able to reach NPM at runtime?

@Bobgy Any ideas if KubeFlow itself is blocking something?

ConverJens avatar Mar 02 '21 11:03 ConverJens

@atn832 Hi, I'm also running into a similar issue with a hosted notebook solution (not Kubeflow). I'm also using the latest version of Chrome (88), and the notebook lives in a k8s cluster with no external access. However, in my case, no response is returned.

When I try to render, this is what happens. Network request: Screen Shot 2021-03-09 at 11 59 06 AM

404 response: Screen Shot 2021-03-09 at 12 27 55 PM

I'd appreciate your help.

mwakaba2 avatar Mar 09 '21 19:03 mwakaba2

@mwakaba2 the fact that you're getting a 404 makes me wonder about where it's trying to load vulcanized_tfma.js from. Could you hover over the filename in the network request and give us the whole URL?

When I load that, it loads from:

https://localhost:8080/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js

which makes me wonder if this is a port problem, or a path problem, file permissions, or maybe an HTTPS problem (but I don't think that would be a 404). My guess is either a path or file permissions problem.

rcrowe-google avatar Mar 10 '21 19:03 rcrowe-google

@rcrowe-google Yeah that's the path.

/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js

I think the issue is that our hosted notebook solution uses jupyter_server as the backend. By default, jupyter_server doesn't serve the /nbextensions path, so to enable it we have to enable the nbclassic server extension.

I still get a 404 after that, so I think I also need to enable the extensions via nbextension install to add the "tensorflow_model_analysis/vulcanized_tfma.js" path.

mwakaba2 avatar Mar 10 '21 21:03 mwakaba2

OK, I got it working in JupyterLab by installing and enabling the nbextensions.

 $ jupyter nbextension enable --py widgetsnbextension
 $ jupyter nbextension enable --py tensorflow_model_analysis

You need that in order to access the /nbextensions path.
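
Before probing the /nbextensions URL, it can help to confirm that the file is on disk where the notebook server would look for it. The sketch below assumes the usual layout in which nbextensions are installed under `<jupyter data dir>/nbextensions/`; the data directories themselves are inputs (for example, the "data" entries printed by `jupyter --paths`), and the function name is made up for illustration.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_nbextension_file(data_dirs: Iterable[str], relative_path: str) -> Optional[Path]:
    """Return the first <data_dir>/nbextensions/<relative_path> that exists.

    data_dirs are Jupyter data directories supplied by the caller; nothing
    here queries Jupyter itself.
    """
    for data_dir in data_dirs:
        candidate = Path(data_dir) / "nbextensions" / relative_path
        if candidate.is_file():
            return candidate
    return None
```

For example, `find_nbextension_file(["/usr/local/share/jupyter"], "tensorflow_model_analysis/vulcanized_tfma.js")` returns `None` when the nbextension has not been installed into that directory; the directory shown is only an example and varies by environment.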

mwakaba2 avatar Mar 10 '21 22:03 mwakaba2

Glad to hear you solved it Mariko! It seems so simple once you know the answer ... 😁

rcrowe-google avatar Mar 11 '21 02:03 rcrowe-google

@rcrowe-google there's still an upstream problem here FYI.

in the Jupyter ecosystem, there are two primary types of frontend extensions (this doc may be helpful):

  • nbextensions - Jupyter Classic Notebook extensions - which support the classic Notebook UI.
  • labextensions - Jupyter Lab Extensions - which support the modern Jupyter Lab UI.

nbextensions are effectively deprecated in favor of labextensions, because all modern notebook runtimes (incl AI Platform Notebooks etc) are fronted with Jupyter Lab. so in order to enable nbextensions (Jupyter Classic Notebook extensions) on top of Jupyter Server (which replaces the Jupyter Notebook server) to support serving of this Javascript, we had to install a compatibility layer that otherwise wouldn't need to exist (called nbclassic).

so tl;dr: the problem with TFMA's JupyterLab support is that it's not a pure JupyterLab extension. It's a labextension that depends on a nbextension to work. what we've employed here is a workaround vs a proper fix. the proper fix would involve making the labextension carry its own dependencies vs the existing hybrid model (which is unprecedented btw).

this gap is likely where the other folks like @ConverJens are running into issues making this work - as it's non-obvious vs the way any other lab extension works.

kwlzn avatar Mar 12 '21 00:03 kwlzn

@mwakaba2 @rcrowe-google I tried the fix that you mentioned but there was no difference for me, still no output. Albeit, the error I had was different from the one @mwakaba2 experienced.

@kwlzn I tried installing nbclassic but made no progress. I normally start my server by running jupyter lab ... but should I start it with another command for this to take effect?

ConverJens avatar Mar 12 '21 13:03 ConverJens

@ConverJens You don't need nbclassic because jupyterlab 2.2.9 relies on jupyter/notebook as the backend by default. If you can access /tree, then there's no need for nbclassic. Did you try installing the nbextensions in the docker image right after installing the lab extension?

jupyter nbextension install --py widgetsnbextension
jupyter nbextension enable --py widgetsnbextension
jupyter nbextension install --py tensorflow_model_analysis
jupyter nbextension enable --py tensorflow_model_analysis

My tfma setup is probably different from yours. I used these instructions to install the tfma npm package. https://github.com/tensorflow/model-analysis/issues/56#issuecomment-780722092

mwakaba2 avatar Mar 12 '21 19:03 mwakaba2

@mwakaba2 That was my understanding but I wanted to verify.

These are the commands I'm currently using:

RUN jupyter contrib nbextension install && \
    jupyter labextension install [email protected] && \
    jupyter labextension install @jupyter-widgets/jupyterlab-manager@2  && \
    jupyter nbextension install --py --sys-prefix widgetsnbextension && \
    jupyter nbextension enable --py --sys-prefix widgetsnbextension && \
    jupyter nbextension install --py --sys-prefix tensorflow_model_analysis && \
    jupyter nbextension enable --py --sys-prefix tensorflow_model_analysis

along with pip installing the corresponding TFX version first. I've tested 0.26.0, 0.27.0 and 0.28.0 with exactly the same result. I still cannot load the vulcanized_tfma.js file.

@kwlzn @rcrowe-google I'm not certain the issue that you are mentioning is actually what's causing my problem. I can run my image, notebook and data locally successfully but once I host it, it fails. I can run it locally without the jupyter nbextension steps.

ConverJens avatar Mar 15 '21 14:03 ConverJens

@atn832 @fhuanming @mwakaba2 @kwlzn @rcrowe-google So I think I found what's causing this issue: URL rewriting in KubeFlow is causing the vulcanized_tfma path to point to the wrong place.

If I check the Sources tab in the developer console, right-click vulcanized_tfma.js and choose "open in new tab", I'm directed to: <kubeflow host>/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js, which is incorrect since the notebook lives at: <kubeflow host>/notebook/admin/<notebook server name>. I'm also greeted by KubeFlow's 'this page doesn't exist' page.

If I manually concatenate these urls I get the actual js file: <kubeflow host>/notebook/admin/<notebook server name>/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js

So it seems that this is actually a path issue. This also perfectly explains why it works when running the image locally but not with the re-written url.
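
The difference between the two URLs can be reproduced with standard URL resolution: a path starting with `/` is resolved against the host root, so the notebook's prefix is lost unless it is added explicitly. The sketch below uses a placeholder host and notebook server name (`kubeflow.example`, `my-server`) and a hypothetical helper; it illustrates the path problem only and is not taken from the TFMA fix.

```python
from urllib.parse import urljoin


def resolve_from_page(page_url: str, script_path: str) -> str:
    """Resolve a script path the way a browser would for a page at page_url."""
    return urljoin(page_url, script_path)


def with_base_url(base_url: str, script_path: str) -> str:
    """Prefix a server-relative path with the notebook server's base URL."""
    return base_url.rstrip("/") + "/" + script_path.lstrip("/")


# Placeholder values standing in for <kubeflow host> and <notebook server name>.
page = "https://kubeflow.example/notebook/admin/my-server/lab"
script = "/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js"

# Root-relative path: the /notebook/admin/my-server prefix is dropped.
broken = resolve_from_page(page, script)

# Same path with the base URL prepended: the prefix is kept.
fixed = resolve_from_page(page, with_base_url("/notebook/admin/my-server/", script))
```

With a base URL of `/`, as in the local docker run, `with_base_url` returns the path unchanged, which matches the observation that the problem only appears behind the rewritten URL.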

How do we proceed with this?

ConverJens avatar Mar 15 '21 14:03 ConverJens

Thank you for the details! We were able to reproduce the loading issue in our own environment and are working on a fix.

atn832 avatar Mar 18 '21 03:03 atn832

@atn832

Out of curiosity, how is vulcanized_tfma.js produced? I was not able to work it out by grepping through the project.

It's ideal to drop the requirement of nbclassic if a user is using JupyterLab since a user should be able to use this plugin with just JupyterLab.

jhamet93 avatar Mar 18 '21 22:03 jhamet93

@atn832 Fantastic, thank you!

@jhamet93 TFMA does not have nbclassic as a requirement; it was just proposed as an optional fix if you're running jupyter server as your backend instead of jupyter notebook.

ConverJens avatar Mar 19 '21 06:03 ConverJens

@ConverJens The latest JupyterLab major version (which uses Jupyter Server as the backend) is planning on dropping the nbclassic extension. Currently, there exists a shim to help ease users who are transitioning but this will cease to exist in the future. Thus, it makes sense for this to evolve to work out of the box with just JupyterLab since this seems like an antipattern. The nbclassic plugin is only needed for the JupyterLab extension to serve a static file which should be able to be hosted by a different mechanism such as a server extension.

jhamet93 avatar Mar 19 '21 13:03 jhamet93

@atn832 Any update on this?

ConverJens avatar Apr 19 '21 09:04 ConverJens

@atn832 Any update? I'll keep pinging :)

ConverJens avatar Apr 29 '21 07:04 ConverJens

Yes, my fix was merged a few days ago (https://github.com/tensorflow/model-analysis/commit/cc7d75c1bf588123795511572e1b4445d9a52191) and this issue should be resolved on the next release.

atn832 avatar Apr 29 '21 08:04 atn832

Awesome! So it will be part of the 0.31.0 release (or 1.0.0, if that comes next) then?

ConverJens avatar Apr 29 '21 09:04 ConverJens

@ConverJens according to the release notes @atn832 added to, it's in 0.29

rclough avatar May 13 '21 13:05 rclough

Ah yes, I misplaced this release note! It should be part of 0.31.0 instead.

atn832 avatar May 13 '21 13:05 atn832