
No widget displays when using LIT in SageMaker (Notebook Instance)

superhez opened this issue 3 years ago · 10 comments

Hi! I am trying the demo below in both my local JupyterLab and AWS SageMaker (Notebook Instance). Everything works in my local Jupyter, but in SageMaker no widget displays and the cell's output shows a server connection timeout error. I have confirmed that the versions of the related packages are almost the same in the two environments. Could anyone give me some advice on how to deal with this problem?

DEMO: https://colab.research.google.com/github/PAIR-code/lit/blob/main/lit_nlp/examples/notebooks/LIT_sentiment_classifier.ipynb

Browser: Microsoft Edge 101.0.1210.53

■SageMaker notebook instance: Python==3.7.12, jupyterlab==1.2.21, tensorflow==2.6.2, lit_nlp==0.4.1, tfds-nightly==4.5.2.dev202202230044, transformers==4.1.1, tensorflow-datasets==4.5.2

※No VPC

■Kernel: conda_python3, Python==3.6.13, jupyterlab==1.2.21, tensorflow==2.6.2, lit_nlp==0.4.1, tfds-nightly==4.5.2.dev202202230044, transformers==4.1.1, tensorflow-datasets==4.5.2

■My local JupyterLab (for reference): Python==3.7.12, jupyterlab==1.2.6 or 3.3.2 (both OK), tensorflow==2.6.2, lit_nlp==0.4.1, tfds-nightly==4.5.2.dev202202230044, transformers==4.1.1, tensorflow-datasets==4.5.2

superhez · May 19 '22

We have not tried LIT in SageMaker before, so it's not surprising that there might be some issues. Perhaps it is related to a proxy server used by the notebook instance. The LitWidget class accepts a proxy URL as a constructor argument for environments where this is the case. That was an issue for using LIT in Google Vertex AI Workbench notebooks, and in that case setting the proxy URL to "/proxy/%PORT%/" solved the issue. Let me know if you are able to make progress.
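
For reference, here's a minimal sketch of what that would look like with the objects from the demo notebook (assuming the constructor keyword for the proxy URL is proxy_url; please check the LitWidget signature in your installed lit_nlp version):

from lit_nlp import notebook

# 'models' and 'datasets' are the dicts built earlier in the demo notebook.
# The proxy_url keyword name is an assumption here; verify it against your lit_nlp version.
widget = notebook.LitWidget(models, datasets, proxy_url='/proxy/%PORT%/')
widget.render()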

jameswex · May 19 '22

Hello, jameswex. Thanks so much for your reply! Following your advice, I tried "/proxy/%PORT%/" and LIT's GUI can now be displayed!

However, a new issue has come up: only the data list is shown in the GUI; there are no classification results and realtime prediction does not work. An error saying "Uncaught error: 'NoneType' object is not subscriptable (and 3 other errors)" appears at the bottom of the GUI. When I pressed F12 to check the browser's network activity, the failed requests named "get_preds?..." (URL: https://MY_SAGEMAKER_URL/PROXY/PORT/get_preds?model=sst_tiny...) returned status code "500 Internal Server Error".

Meanwhile, in my local JupyterLab everything works when the same operation connects to "http://localhost:PORT/get_preds?...". So it seems that when the GUI tries to fetch the prediction results it accesses a wrong URL, but I am not sure about that. Could you kindly give me some further advice?

superhez · May 20 '22

It's not simple for me to get an AWS instance to test with. Would you be willing to try this in your notebook:

import pandas as pd
# Run the model's predict() directly on the first two datapoints, bypassing the LIT server.
pd.DataFrame(list(models['sst_tiny'].predict(datasets['sst_dev']._examples[0:2])))

This will verify whether the model can successfully run predict on the dataset, outside the scope of the app making calls to the backend. On success, you will see a DataFrame with two rows containing the model's results on the first two datapoints.

jameswex · May 20 '22

Thanks for your kind reply. I ran the code, and the first two datapoints are listed successfully, with a series of columns named "cls_emb", "input_embs", "layer_0/avg_emb"... "probas"... So that means there is no problem with reading the data or calling predict, and the problem is likely in the front end with the GUI, right? I thought everything was OK once the GUI displayed successfully, but prediction does not work there.

superhez · May 23 '22

An error says "Uncaught error: 'NoneType' object is not subscriptable (and 3 other errors)" at the bottom of GUI.

If you click that error text at the bottom of the screen, does the dialog box show a more detailed error? If so, please paste here. Thanks.

jameswex · May 23 '22

Here are the"Error Details" below. Thank you.

Error: Uncaught error: 'NoneType' object is not subscriptable

Details:

Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/lib/wsgi_app.py", line 191, in __call__
    return self._ServeCustomHandler(request, clean_path, environ)(
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/lib/wsgi_app.py", line 176, in _ServeCustomHandler
    return self._handlers[clean_path](self, request, environ)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/app.py", line 385, in _handler
    outputs = fn(data, **kw)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/app.py", line 182, in _get_preds
    preds = self._predict(data['inputs'], model, dataset_name)
TypeError: 'NoneType' object is not subscriptable

Error: Uncaught error: 'NoneType' object is not subscriptable

Details:

Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/lib/wsgi_app.py", line 191, in __call__
    return self._ServeCustomHandler(request, clean_path, environ)(
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/lib/wsgi_app.py", line 176, in _ServeCustomHandler
    return self._handlers[clean_path](self, request, environ)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/app.py", line 385, in _handler
    outputs = fn(data, **kw)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/lit_nlp/app.py", line 305, in _get_interpretations
    model_outputs = self._predict(data['inputs'], model, dataset_name)
TypeError: 'NoneType' object is not subscriptable

(The remaining errors in the dialog repeat these same two tracebacks, from _get_preds and _get_interpretations.)

superhez · May 24 '22

Thanks! That seems to suggest that the get_preds HTTP POST request from the front-end to the backend, when the server is launched inside SageMaker, is losing its request payload (data), which should be a dict containing the datapoint IDs to get predictions for. I wonder if something about SageMaker's security model is interfering with the LIT webserver running inside it.
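
One way to check that hypothesis without touching LIT at all would be to run a tiny echo server in the instance and POST to it both directly and through the SageMaker proxy path, to see whether the request body survives the proxy. A rough sketch (port 8123 and the /echo comparison below are arbitrary choices for this test, not anything LIT-specific):

from wsgiref.simple_server import make_server

def echo_app(environ, start_response):
    # Echo back the size and start of the POST body so we can tell
    # whether the proxy delivered it intact.
    length = int(environ.get('CONTENT_LENGTH') or 0)
    body = environ['wsgi.input'].read(length)
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'received %d bytes: %s' % (len(body), body[:200])]

# This call blocks; run it in its own notebook cell or a terminal.
make_server('', 8123, echo_app).serve_forever()

Then compare a POST to http://localhost:8123/ (from a terminal on the instance) with one to https://MY_SAGEMAKER_URL/proxy/8123/ from the browser. If the body only goes missing on the proxied path, the problem is in SageMaker's proxy rather than in LIT.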

jameswex · May 26 '22

Thank you so much for identifying the problem. With this information, I am now contacting AWS support to check whether some SageMaker setting can help solve the issue.

superhez · May 30 '22

Keep me updated on anything they say back. Hopefully there's some change we can make in LIT to support SageMaker, if we've identified the true problem here.

jameswex · May 31 '22

Hi @jameswex @superhez, were you able to run the demo in a SageMaker notebook instance or a SageMaker Studio instance?

pri2si17-1997 · Nov 27 '22