Archon 🐛 [Bug]: Errors in Server When Uploading Large LLMs File

Archon Version

Main Branch - Pulled 5th Sept

Bug Severity

🟠 High - Blocks important features

Bug Description

The upload function in the Knowledge Base is non-functional.

Steps to Reproduce

Upload a large LLMs file (the test file was 4.9 MB).

Expected Behavior

Document Processed

Actual Behavior

Errors as below

Error Details (if any)

2025-09-05 09:57:33 | src.server.models.progress_models | ERROR | Failed to create UploadProgressResponse: 1 validation error for UploadProgressResponse

status

  Input should be 'starting', 'reading', 'extracting', 'chunking', 'creating_source', 'summarizing', 'storing', 'completed', 'failed' or 'cancelled' [type=literal_error, input_value='document_storage', input_type=str]

    For further information visit https://errors.pydantic.dev/2.11/v/literal_error

Traceback (most recent call last):

  File "/app/src/server/models/progress_models.py", line 239, in create_progress_response

    return model_class(**progress_data)

           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/venv/lib/python3.12/site-packages/pydantic/main.py", line 253, in __init__

    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)

                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

pydantic_core._pydantic_core.ValidationError: 1 validation error for UploadProgressResponse

status

  Input should be 'starting', 'reading', 'extracting', 'chunking', 'creating_source', 'summarizing', 'storing', 'completed', 'failed' or 'cancelled' [type=literal_error, input_value='document_storage', input_type=str]
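To make the mismatch in the traceback concrete, here is a minimal stdlib-only sketch (no pydantic, not Archon's actual code). The allowed values are copied from the error message above. `normalize_status` and the alias table are hypothetical, and mapping `'document_storage'` to `'storing'` is an assumption about what the status means, not a confirmed fix.

```python
from typing import Literal, get_args

# Allowed values copied from the validation error message in the log above.
UploadStatus = Literal[
    "starting", "reading", "extracting", "chunking", "creating_source",
    "summarizing", "storing", "completed", "failed", "cancelled",
]
ALLOWED_STATUSES = get_args(UploadStatus)

# Hypothetical alias table: treating 'document_storage' as 'storing'
# is an assumption made for illustration only.
STATUS_ALIASES = {"document_storage": "storing"}


def normalize_status(raw: str) -> str:
    """Return a status the response model would accept, or raise ValueError."""
    status = STATUS_ALIASES.get(raw, raw)
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unsupported upload status: {raw!r}")
    return status


# The value from the log is not in the allowed set, which is why validation fails.
print("document_storage" in ALLOWED_STATUSES)
print(normalize_status("document_storage"))
```

Either adding the missing value to the allowed set or normalising it before the response is built would avoid the error; which one is appropriate depends on how the upload pipeline names its stages.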

Affected Component

🔍 Knowledge Base / RAG

Browser & OS

N/A

Additional Context

No response

Service Status (check all that are working)

[x] 🖥️ Frontend UI (http://localhost:3737)
[x] ⚙️ Main Server (http://localhost:8181)
[x] 🔗 MCP Service (localhost:8051)
[x] 🤖 Agents Service (http://localhost:8052)
[x] 💾 Supabase Database (connected)

Sep 05 '25 09:09 phill-bramble

@phill-bramble Hey, have you restarted Docker?

Also, is this stopping the entire process, or is it just logging the error?

This should not be a blocking error; it's just a missing validation in progress tracking. I'm not able to reproduce this issue.

Sep 05 '25 10:09 Wirasm

Hi!

Yes, I've restarted. It appears to stop progress from being reported to the UI, but I am getting some other logs in between the flood of errors, so I believe it is still running:

Input should be 'starting', 'reading', 'extracting', 'chunking', 'creating_source', 'summarizing', 'storing', 'completed', 'failed' or 'cancelled' [type=literal_error, input_value='document_storage', input_type=str]

    For further information visit https://errors.pydantic.dev/2.11/v/literal_error

2025-09-05 10:00:03 | search | INFO | Batch 30: Generated 1/15 contextual embeddings using batch API (sub-batch size: 50)

2025-09-05 10:00:03 | src.server.services.llm_provider_service | INFO | Creating LLM client for provider: openai

2025-09-05 10:00:03 | src.server.services.llm_provider_service | INFO | OpenAI client created successfully

10:00:03.281 Getting progress for operation | operation_id=6ec9b5e9-d3b2-40f9-882d-a6859ff0a72e

2025-09-05 10:00:03 | src.server.models.progress_models | ERROR | Failed to create UploadProgressResponse: 1 validation error for UploadProgressResponse

status

  Input should be 'starting', 'reading', 'extracting', 'chunking', 'creating_source', 'summarizing', 'storing', 'completed', 'failed' or 'cancelled' [type=literal_error, input_value='document_storage', input_type=str]

...

Input should be 'starting', 'reading', 'extracting', 'chunking', 'creating_source', 'summarizing', 'storing', 'completed', 'failed' or 'cancelled' [type=literal_error, input_value='document_storage', input_type=str]

    For further information visit https://errors.pydantic.dev/2.11/v/literal_error

2025-09-05 09:59:47 | search | INFO | Batch 28: Generated 1/15 contextual embeddings using batch API (sub-batch size: 50)

2025-09-05 09:59:47 | src.server.services.llm_provider_service | INFO | Creating LLM client for provider: openai

2025-09-05 09:59:47 | src.server.services.llm_provider_service | INFO | OpenAI client created successfully

09:59:48.027 Getting progress for operation | operation_id=6ec9b5e9-d3b2-40f9-882d-a6859ff0a72e

I'd say it is running, but no progress is shown when uploading a file.

Sep 05 '25 10:09 phill-bramble

Just tested again after rebuilding all containers and clearing the cache. I get the same errors when uploading a file directly, but not when I upload the file to a web server and use the HTTP method.

Sep 05 '25 10:09 phill-bramble

Also, the errors only appear when the Chrome tab is active and polling; otherwise, processing appears to continue, as OpenAI clients are still being created for each batch.
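That pattern would fit a setup where the worker stores its status unchecked and validation only runs when a client polls. A minimal sketch of that idea, assuming a hypothetical in-memory store and hypothetical `report_progress` / `get_progress` helpers (none of these names come from the Archon code):

```python
# Allowed values copied from the validation error in the logs.
ALLOWED_STATUSES = {
    "starting", "reading", "extracting", "chunking", "creating_source",
    "summarizing", "storing", "completed", "failed", "cancelled",
}

# Hypothetical in-memory progress store keyed by operation ID.
progress_store: dict[str, dict[str, str]] = {}


def report_progress(operation_id: str, status: str) -> None:
    """Writer side (background processing): stores the status unchecked."""
    progress_store[operation_id] = {"status": status}


def get_progress(operation_id: str) -> dict[str, str]:
    """Reader side (polling): the only place where the status is validated."""
    data = progress_store[operation_id]
    if data["status"] not in ALLOWED_STATUSES:
        raise ValueError(f"invalid status: {data['status']!r}")
    return data


# Processing keeps going: writing the unexpected status does not raise.
report_progress("op-1", "document_storage")

# The error surfaces only when something polls for progress.
try:
    get_progress("op-1")
except ValueError as exc:
    print(f"poll failed: {exc}")
```

Under that assumption, closing the tab stops the polling and therefore the errors, while the upload itself carries on.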

Sep 05 '25 10:09 phill-bramble

@phill-bramble So it only appears on large file uploads, not on URL crawling?

Sep 05 '25 12:09 Wirasm

@Wirasm That's correct.

The manual upload also doesn't seem to follow the same processing steps. For instance, it doesn't generate code examples, even though the files I wish to upload manually have plenty throughout. I get far better results by uploading my files to websites and then using the crawling method.

Sep 05 '25 12:09 phill-bramble