elasticsearch-labs
Increase Timeout in demo?
I noticed that on my underpowered laptop, when I run ES in Docker and run flask create-index, I get a timeout the first time, but it works the second time:
(.venv) ➜ chatbot-rag-app git:(main) ✗ flask create-index
".elser_model_2" model not available, downloading it now
Model downloaded, starting deployment
Traceback (most recent call last):
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/bin/flask", line 8, in <module>
sys.exit(main())
^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/flask/cli.py", line 1064, in main
cli.main()
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/flask/cli.py", line 358, in decorator
return __ctx.invoke(f, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/api/app.py", line 36, in create_index
index_data.main()
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/api/../data/index_data.py", line 60, in main
install_elser()
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/api/../data/index_data.py", line 54, in install_elser
elasticsearch_client.ml.start_trained_model_deployment(
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elasticsearch/_sync/client/utils.py", line 446, in wrapped
return api(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elasticsearch/_sync/client/ml.py", line 3814, in start_trained_model_deployment
return self.perform_request( # type: ignore[return-value]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elasticsearch/_sync/client/_base.py", line 389, in perform_request
return self._client.perform_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elasticsearch/_sync/client/_base.py", line 285, in perform_request
meta, resp_body = self.transport.perform_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elastic_transport/_transport.py", line 329, in perform_request
meta, raw_data = node.perform_request(
^^^^^^^^^^^^^^^^^^^^^
File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elastic_transport/_node/_http_urllib3.py", line 199, in perform_request
raise err from None
elastic_transport.ConnectionTimeout: Connection timed out
I'm not really sure a timeout increase will help. I've found that sometimes the ML node does not seem to respond immediately after deploying a model, and regardless of how long you wait the first request always times out and then everything starts to work.
I could not really find a solution. I don't see this happening very often, but it does happen to me sometimes. Do you have a suggestion for a timeout that works reliably for you?
I just assumed a longer timeout would work... Though maybe what we really need is a retry ;-)
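For what it's worth, here is a minimal sketch of what such a retry could look like. It is not what the repo does; call_with_retry and its parameters are hypothetical names made up for illustration. It retries on the built-in TimeoutError by default so the snippet runs without any dependencies; in the app you would presumably pass the elastic_transport.ConnectionTimeout class from the traceback above via retry_on.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
    attempts: int = 3,
    initial_delay: float = 1.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying with exponential backoff on the given exceptions.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            # Out of attempts: let the last exception propagate to the caller.
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff
    raise AssertionError("unreachable")
```

In index_data.py the call to start_trained_model_deployment could then be wrapped in a lambda and passed as func. Since the first request seems to time out no matter how long you wait, a couple of attempts with a short delay may be more useful than one long timeout.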
I've noticed that even after the ELSER install succeeds, later requests can time out for several minutes, depending on your Docker config. Once it is all done, things work fine. So folks using this example should probably retry until it works before showing it to anyone, and then keep that ES configuration hot.
It would be nice to have a solution that shortens this. Some possibly easier-said-than-done thoughts:
- an ES image that bakes in ELSER, so you don't need to download it
- faster provisioning time, especially in a single-node dev setup
@serenachou is this something your team can help with?
I started looking into this. The timeout occurs due to saturation of ML tasks in Elasticsearch. It may be possible to rewrite the initialization so that it blocks until those tasks are complete. I have some ideas and want to try to solve this before too many people hit this same problem.
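As a rough illustration of the "block until ready" idea (a sketch under assumptions, not necessarily what the eventual fix does): poll the trained-model stats until the deployment reports that it has started. The helper names below are hypothetical. The response shape (trained_model_stats[0].deployment_stats.state) and the "started" and "failed" state values are my reading of the trained models stats API and should be checked against the Elasticsearch version in use.

```python
import time
from typing import Any, Callable, Mapping


def deployment_state(stats: Mapping[str, Any]) -> str:
    """Return the deployment state from a stats response, or "missing".

    Assumes the shape {"trained_model_stats": [{"deployment_stats": {"state": ...}}]}.
    """
    models = stats.get("trained_model_stats") or []
    if not models:
        return "missing"
    return (models[0].get("deployment_stats") or {}).get("state", "missing")


def wait_until_started(
    fetch_stats: Callable[[], Mapping[str, Any]],
    timeout: float = 300.0,
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Block until the deployment state is "started" or the timeout expires."""
    deadline = clock() + timeout
    while True:
        state = deployment_state(fetch_stats())
        if state == "started":
            return
        if state == "failed":  # assumed terminal state, no point in waiting
            raise RuntimeError("model deployment failed")
        if clock() >= deadline:
            raise TimeoutError(f"deployment still '{state}' after {timeout}s")
        sleep(poll_interval)
```

With the Python client, fetch_stats could be something like lambda: elasticsearch_client.ml.get_trained_models_stats(model_id=".elser_model_2").body. Note that a "started" deployment may still answer the first inference request slowly, so this only covers the deployment step itself.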
https://github.com/elastic/elasticsearch-labs/pull/397 should fix the user experience, though the time until ready could still be better. Since this issue was raised about the user experience, that PR should close it.
Thanks for the update!
PS: we've reproduced that the worst of the timeouts happen on non-x86 machines, which sadly means anyone on a recent MacBook.