๐Ÿ› [Bug]: Errors in Server When Uploading Large LLMs File

Open phill-bramble opened this issue 3 months ago • 6 comments

Archon Version

Main Branch - Pulled 5th Sept

Bug Severity

🟠 High - Blocks important features

Bug Description

The upload function in the Knowledge Base is non-functional.

Steps to Reproduce

Upload a large LLMs file (the test file was 4.9 MB)

Expected Behavior

Document Processed

Actual Behavior

Errors as below

Error Details (if any)

2025-09-05 09:57:33 | src.server.models.progress_models | ERROR | Failed to create UploadProgressResponse: 1 validation error for UploadProgressResponse

status

  Input should be 'starting', 'reading', 'extracting', 'chunking', 'creating_source', 'summarizing', 'storing', 'completed', 'failed' or 'cancelled' [type=literal_error, input_value='document_storage', input_type=str]

    For further information visit https://errors.pydantic.dev/2.11/v/literal_error

Traceback (most recent call last):

  File "/app/src/server/models/progress_models.py", line 239, in create_progress_response

    return model_class(**progress_data)

           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/venv/lib/python3.12/site-packages/pydantic/main.py", line 253, in __init__

    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)

                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

pydantic_core._pydantic_core.ValidationError: 1 validation error for UploadProgressResponse

status

  Input should be 'starting', 'reading', 'extracting', 'chunking', 'creating_source', 'summarizing', 'storing', 'completed', 'failed' or 'cancelled' [type=literal_error, input_value='document_storage', input_type=str]
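The rejection above can be reproduced in isolation with the standard library alone. The following is a minimal sketch, not the project's code: the allowed values are copied from the error message, while `UploadStatus`, `STATUS_ALIASES`, and `normalize_status` are hypothetical names, and mapping `document_storage` to `storing` is an assumption about what the status was meant to be.

```python
from typing import Literal, get_args

# Allowed values copied from the validation error message in this issue.
UploadStatus = Literal[
    "starting", "reading", "extracting", "chunking", "creating_source",
    "summarizing", "storing", "completed", "failed", "cancelled",
]
ALLOWED_STATUSES = frozenset(get_args(UploadStatus))

# Hypothetical alias table from internal stage names to allowed statuses.
# Treating 'document_storage' as 'storing' is an assumption, not a confirmed fix.
STATUS_ALIASES = {"document_storage": "storing"}


def normalize_status(raw: str) -> str:
    """Return a status the response model would accept, or raise ValueError."""
    status = STATUS_ALIASES.get(raw, raw)
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unsupported upload status: {raw!r}")
    return status


if __name__ == "__main__":
    # 'document_storage' is not in the allowed set, which is why validation fails.
    print("document_storage" in ALLOWED_STATUSES)
    print(normalize_status("document_storage"))
```

Normalizing the status before the response model is built is only one option; adding `document_storage` to the model's allowed values would address the same mismatch.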

Affected Component

🔍 Knowledge Base / RAG

Browser & OS

N/A

Additional Context

No response

Service Status (check all that are working)

  • [x] 🖥️ Frontend UI (http://localhost:3737)
  • [x] ⚙️ Main Server (http://localhost:8181)
  • [x] 🔗 MCP Service (localhost:8051)
  • [x] 🤖 Agents Service (http://localhost:8052)
  • [x] 💾 Supabase Database (connected)

phill-bramble avatar Sep 05 '25 09:09 phill-bramble

@phill-bramble Hey, have you restarted Docker?

Also, is this stopping the entire process or just logging the error?

This should not be a blocking error; it's just a missing validation in progress tracking. I'm not able to reproduce this issue.

Wirasm avatar Sep 05 '25 10:09 Wirasm

Hi!

Yes, I've restarted. It appears to be stopping the progress from being reported to the UI, but I am getting some other logs in between the flood of errors, so I believe it is still running:

Input should be 'starting', 'reading', 'extracting', 'chunking', 'creating_source', 'summarizing', 'storing', 'completed', 'failed' or 'cancelled' [type=literal_error, input_value='document_storage', input_type=str]

    For further information visit https://errors.pydantic.dev/2.11/v/literal_error

2025-09-05 10:00:03 | search | INFO | Batch 30: Generated 1/15 contextual embeddings using batch API (sub-batch size: 50)

2025-09-05 10:00:03 | src.server.services.llm_provider_service | INFO | Creating LLM client for provider: openai

2025-09-05 10:00:03 | src.server.services.llm_provider_service | INFO | OpenAI client created successfully

10:00:03.281 Getting progress for operation | operation_id=6ec9b5e9-d3b2-40f9-882d-a6859ff0a72e

2025-09-05 10:00:03 | src.server.models.progress_models | ERROR | Failed to create UploadProgressResponse: 1 validation error for UploadProgressResponse

status

  Input should be 'starting', 'reading', 'extracting', 'chunking', 'creating_source', 'summarizing', 'storing', 'completed', 'failed' or 'cancelled' [type=literal_error, input_value='document_storage', input_type=str]

...

Input should be 'starting', 'reading', 'extracting', 'chunking', 'creating_source', 'summarizing', 'storing', 'completed', 'failed' or 'cancelled' [type=literal_error, input_value='document_storage', input_type=str]

    For further information visit https://errors.pydantic.dev/2.11/v/literal_error

2025-09-05 09:59:47 | search | INFO | Batch 28: Generated 1/15 contextual embeddings using batch API (sub-batch size: 50)

2025-09-05 09:59:47 | src.server.services.llm_provider_service | INFO | Creating LLM client for provider: openai

2025-09-05 09:59:47 | src.server.services.llm_provider_service | INFO | OpenAI client created successfully

09:59:48.027 Getting progress for operation | operation_id=6ec9b5e9-d3b2-40f9-882d-a6859ff0a72e

I'd say it is running, but no progress is reported when uploading a file.

phill-bramble avatar Sep 05 '25 10:09 phill-bramble

Just tested again after rebuilding all containers and clearing the cache. I get the same errors while uploading a file, but not when I upload the file to a web server and use the HTTP method.

phill-bramble avatar Sep 05 '25 10:09 phill-bramble

Also, errors only appear when the Chrome tab is active and polling, otherwise processing appears to be continuing as OpenAI clients are being created for each batch.

phill-bramble avatar Sep 05 '25 10:09 phill-bramble
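The behaviour described in the comment above (errors only while the tab is polling, processing continuing regardless) is what one would expect if the status is validated only when progress is read. A minimal sketch under that assumption follows; `progress_store`, `record_progress`, and `poll_progress` are hypothetical names and do not come from the Archon code base.

```python
import logging
from typing import Optional

logger = logging.getLogger("progress_sketch")

# Allowed values copied from the validation error message in this issue.
ALLOWED_STATUSES = {
    "starting", "reading", "extracting", "chunking", "creating_source",
    "summarizing", "storing", "completed", "failed", "cancelled",
}

# Hypothetical in-memory store shared by the worker and the polling handler.
progress_store: dict[str, dict] = {}


def record_progress(operation_id: str, status: str, batch: int) -> None:
    """Worker side: stores whatever status it is given, with no validation."""
    progress_store[operation_id] = {"status": status, "batch": batch}


def poll_progress(operation_id: str) -> Optional[dict]:
    """Polling side: validates on read; logs and returns None on failure."""
    data = progress_store.get(operation_id)
    if data is None:
        return None
    if data["status"] not in ALLOWED_STATUSES:
        logger.error("Failed to create progress response: status=%r", data["status"])
        return None
    return dict(data)


if __name__ == "__main__":
    logging.basicConfig(level=logging.ERROR)
    record_progress("op-1", "document_storage", batch=28)  # worker is unaffected
    print(poll_progress("op-1"))  # the poll fails, so the UI would see no progress
    record_progress("op-1", "document_storage", batch=30)  # work still advances
    print(progress_store["op-1"]["batch"])
```

In this sketch the worker never raises, so batches keep advancing, while every poll logs an error and returns nothing to the caller.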

@phill-bramble So it only appears on large file uploads, not on URL crawling?

Wirasm avatar Sep 05 '25 12:09 Wirasm

@Wirasm That's correct.

The manual upload also doesn't seem to follow the same processing steps. It doesn't generate code examples for instance, even though the files I wish to manually upload have plenty throughout. I get far better results by uploading my files to sites and then using the crawling method.

phill-bramble avatar Sep 05 '25 12:09 phill-bramble