
Increase Timeout in demo?

Open epugh opened this issue 1 year ago • 2 comments

I noticed that on my underpowered laptop, when I run ES in Docker and then run flask create-index, I get a timeout the first time, but it works the second time:

(.venv) ➜  chatbot-rag-app git:(main) ✗ flask create-index
".elser_model_2" model not available, downloading it now
Model downloaded, starting deployment
Traceback (most recent call last):
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/bin/flask", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/flask/cli.py", line 1064, in main
    cli.main()
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/flask/cli.py", line 358, in decorator
    return __ctx.invoke(f, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/api/app.py", line 36, in create_index
    index_data.main()
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/api/../data/index_data.py", line 60, in main
    install_elser()
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/api/../data/index_data.py", line 54, in install_elser
    elasticsearch_client.ml.start_trained_model_deployment(
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elasticsearch/_sync/client/utils.py", line 446, in wrapped
    return api(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elasticsearch/_sync/client/ml.py", line 3814, in start_trained_model_deployment
    return self.perform_request(  # type: ignore[return-value]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elasticsearch/_sync/client/_base.py", line 389, in perform_request
    return self._client.perform_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elasticsearch/_sync/client/_base.py", line 285, in perform_request
    meta, resp_body = self.transport.perform_request(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elastic_transport/_transport.py", line 329, in perform_request
    meta, raw_data = node.perform_request(
                     ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/epugh/Documents/clients/OSC/rag/elasticsearch-labs/example-apps/chatbot-rag-app/.venv/lib/python3.11/site-packages/elastic_transport/_node/_http_urllib3.py", line 199, in perform_request
    raise err from None
elastic_transport.ConnectionTimeout: Connection timed out

epugh avatar Aug 06 '24 12:08 epugh

I'm not really sure a timeout increase will help. I've found that sometimes the ML node does not seem to respond immediately after deploying a model, and regardless of how long you wait the first request always times out and then everything starts to work.

I could not really find a solution. I don't see this happening very often, but it does happen to me sometimes. Do you have a suggestion of a timeout to use that works reliably for you?

miguelgrinberg avatar Aug 06 '24 15:08 miguelgrinberg

I just assumed a longer timeout would work... Though maybe what we really need is a retry ;-)
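A minimal sketch of the retry idea, for illustration only. It is not the code from the example app: `call_with_retry` and all of its parameters are hypothetical names, and the operation passed in stands in for whatever call times out (in the traceback above, that is `elasticsearch_client.ml.start_trained_model_deployment`).

```python
import time


def call_with_retry(operation, attempts=3, delay_seconds=1.0, backoff=2.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call operation() and retry it when it raises one of retry_on.

    Hypothetical helper: waits delay_seconds before the second attempt and
    multiplies the wait by backoff after each further failure. The last
    failure is re-raised so the caller still sees the original error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    wait = delay_seconds
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts, surface the error
            sleep(wait)
            wait *= backoff
```

With the Elasticsearch client, one might pass `retry_on=(elastic_transport.ConnectionTimeout,)` and wrap the deployment call in a lambda. Whether a retry alone is enough depends on why the first request times out, which this thread has not settled at this point.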

epugh avatar Aug 26 '24 21:08 epugh

I've noticed that even once the ELSER install succeeds, later requests can still time out for several minutes, depending on your Docker configuration. Once everything has finished, things work fine. So folks using this example should probably retry until it works before showing it to anyone, and then keep that ES configuration hot.

It would be nice to have a solution that shortens this. Some thoughts, possibly easier said than done:

  • ES image that bakes in elser, so you don't need to download it
  • faster provisioning time, especially in single-node dev setup

codefromthecrypt avatar Jan 04 '25 01:01 codefromthecrypt

@serenachou is this something your team can help with?

daniela-elastic avatar Jan 27 '25 11:01 daniela-elastic

I started looking into this. The timeout occurs due to saturation of ML tasks in Elasticsearch. It may be possible to rewrite the initialization so that it blocks until they are complete. I have some ideas and want to try to solve this before too many people hit this same problem.
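A rough sketch of what "blocks until they are complete" could look like, under stated assumptions. `wait_until` and its parameters are hypothetical, and the readiness check is left as a caller-supplied function because the thread does not say which Elasticsearch signal the actual fix relies on.

```python
import time


def wait_until(is_ready, timeout_seconds=300.0, poll_seconds=5.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or the timeout elapses.

    Hypothetical helper: returns True as soon as is_ready() is truthy and
    False once timeout_seconds have passed without that happening.
    is_ready is assumed to be a cheap, side-effect-free status check.
    """
    deadline = clock() + timeout_seconds
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

In the example app, `is_ready` might wrap a status query against the cluster (for instance, checking that the ELSER deployment reports a started state); that mapping is an assumption here, not something confirmed in this thread.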

codefromthecrypt avatar Feb 19 '25 06:02 codefromthecrypt

https://github.com/elastic/elasticsearch-labs/pull/397 should fix the user experience, though the time until ready could be better. In any case, this issue was raised about the user experience, so the PR should close it.

codefromthecrypt avatar Feb 19 '25 09:02 codefromthecrypt

Thanks for the update!

epugh avatar Feb 19 '25 12:02 epugh

PS: we've reproduced that the worst timeouts happen on non-x86 machines, which sadly means anyone on a recent MacBook.

codefromthecrypt avatar Feb 20 '25 07:02 codefromthecrypt