elasticsearch_url crash in HEAD

Open rc9000 opened this issue 2 years ago • 1 comments

When following the quick start instructions and trying to index a document after running ./run-with-docker-compose.sh, the worker crashes:

docsgpt-worker-1    | [2023-10-01 09:36:43,450: INFO/MainProcess] Task application.api.user.tasks.ingest[1eabb828-445e-4bb7-828c-10125e77a741] received
docsgpt-worker-1    | [2023-10-01 09:36:43,451: WARNING/ForkPoolWorker-6] inputs/local/everything.zip
docsgpt-worker-1    | [2023-10-01 09:36:43,484: WARNING/ForkPoolWorker-6] <Response [200]>
docsgpt-worker-1    | [2023-10-01 09:36:51,673: WARNING/ForkPoolWorker-6] Grouping small documents
docsgpt-worker-1    | [2023-10-01 09:36:53,123: WARNING/ForkPoolWorker-6] Separating large documents
docsgpt-worker-1    | [2023-10-01 09:36:53,839: ERROR/ForkPoolWorker-6] Task application.api.user.tasks.ingest[1eabb828-445e-4bb7-828c-10125e77a741] raised unexpected: ValueError('Please provide either elasticsearch_url or cloud_id.')
docsgpt-worker-1    | Traceback (most recent call last):
docsgpt-worker-1    |   File "/usr/local/lib/python3.10/site-packages/celery/app/trace.py", line 451, in trace_task
docsgpt-worker-1    |     R = retval = fun(*args, **kwargs)
docsgpt-worker-1    |   File "/usr/local/lib/python3.10/site-packages/celery/app/trace.py", line 734, in __protected_call__
docsgpt-worker-1    |     return self.run(*args, **kwargs)
docsgpt-worker-1    |   File "/app/application/api/user/tasks.py", line 6, in ingest
docsgpt-worker-1    |     resp = ingest_worker(self, directory, formats, name_job, filename, user)
docsgpt-worker-1    |   File "/app/application/worker.py", line 78, in ingest_worker
docsgpt-worker-1    |     call_openai_api(docs, full_path, self)
docsgpt-worker-1    |   File "/app/application/parser/open_ai_func.py", line 48, in call_openai_api
docsgpt-worker-1    |     store = VectorCreator.create_vectorstore(
docsgpt-worker-1    |   File "/app/application/vectorstore/vector_creator.py", line 16, in create_vectorstore
docsgpt-worker-1    |     return vectorstore_class(*args, **kwargs)
docsgpt-worker-1    |   File "/app/application/vectorstore/elasticsearch.py", line 35, in __init__
docsgpt-worker-1    |     raise ValueError("Please provide either elasticsearch_url or cloud_id.")
docsgpt-worker-1    | ValueError: Please provide either elasticsearch_url or cloud_id.
docsgpt-backend-1   | [2023-10-01 09:36:58 +0000] [8] [ERROR] Error handling request /api/task_status?task_id=1eabb828-445e-4bb7-828c-10125e77a741

The progress bar then stays frozen at 1% and nothing is indexed.

On a hunch I tried VECTOR_STORE=faiss in .env but that didn't help. I went back a few commits and c1c54f4 still works fine.

Very cool otherwise, I was experimenting with a CLI RAG application and this makes it so much nicer!

rc9000 • Oct 01 '23 10:10

Would like to work on this if no one already is.

bablookr • Oct 01 '23 17:10

Seems like an env variable issue; maybe you have some env vars already set in your shell. But please do try it again. If you want to use Elasticsearch, you will need to fill in the other variables in core/settings.
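
One way to check the shell-variable theory is to compare .env against the current shell environment. The script below is only a rough sketch: it assumes the settings loader lets variables exported in the shell take priority over the values in .env, and parse_dotenv and find_shadowed are hypothetical helper names, not part of DocsGPT. It prints only the variable names, so API keys are not echoed.

```python
import os
from typing import Mapping


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_shadowed(
    dotenv: Mapping[str, str], environ: Mapping[str, str]
) -> dict[str, tuple[str, str]]:
    """Return {name: (dotenv_value, shell_value)} for names whose values differ."""
    return {
        name: (value, environ[name])
        for name, value in dotenv.items()
        if name in environ and environ[name] != value
    }


if __name__ == "__main__":
    path = ".env"  # assumption: run from the directory that holds .env
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            shadowed = find_shadowed(parse_dotenv(fh.read()), os.environ)
        for name in sorted(shadowed):
            print(f"{name} is set in the shell with a different value than in .env")
```

If VECTOR_STORE shows up in the output, a value exported in the shell may be shadowing the one in .env, which would be consistent with VECTOR_STORE=faiss appearing to have no effect.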

dartpain • Sep 12 '24 09:09