RAG endpoint returns 500 Timeout (15 s) when called with a new user’s API key
Describe the bug
Calling the SciPhi RAG endpoint (POST https://api.sciphi.ai/v3/retrieval/rag) with a new user’s API key returns
500 Internal RAG Error – Request timed out after 15.0 seconds instead of the expected RAG result.
To Reproduce
R2R Explorer
- Open the R2R Explorer.
- Enter a valid query, your collection_id, and the new user’s API key.
- Click Send Request.
- Observe the 500 response.
Screenshot (R2R Explorer): https://r2r-docs.sciphi.ai/api-and-sdks/retrieval/rag-app/~explorer
Screenshot: the https://api.sciphi.ai/v3/users/me response shows the expected collection ID attached to the new user.
The request below is from my Python server, where I first received the error:
import pprint

import requests

rag_payload = {
    "query": "your query here",
    "search_mode": "advanced",
    "search_settings": {
        "use_hybrid_search": False,
        "use_semantic_search": True,
        "limit": 50,
        "chunk_settings": {"index_measure": "cosine_distance", "enabled": True},
        "graph_settings": {"enabled": True},
        "filters": {"collection_id": "<YOUR_ROOT_COLLECTION_ID>"},
    },
    "rag_generation_config": {
        "model": "anthropic/claude-3-7-sonnet-20250219",
        "max_tokens_to_sample": 16000,
        "stream": False,
        "extended_thinking": True,
        "thinking_budget": 4096,
        "temperature": 1,
        "top_p": None,
    },
    "include_title_if_available": True,
    "include_web_search": False,
}

resp = requests.post(
    "https://api.sciphi.ai/v3/retrieval/rag",
    headers={
        "x-api-key": "NEW_USER_API_KEY",
        "Content-Type": "application/json",
    },
    json=rag_payload,
    timeout=(5, 60),  # (connect, read) in seconds
)
print("Status code:", resp.status_code)
pprint.pprint(resp.json())
Expected
A 200 OK response containing the usual RAG structure:
{
"retrieval_results": [...],
"generation_result": {...}
}
Actual
{
"detail": {
"message": "An error '500: Internal RAG Error - Request timed out after 15.0 seconds' occurred during rag_app",
"error": "500: Internal RAG Error - Request timed out after 15.0 seconds",
"error_type": "HTTPException"
}
}
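Note that the client above allows a 60-second read timeout, while the error reports 15.0 seconds, which suggests the limit is enforced on the server side. A minimal sketch for pulling that value out of the error body, assuming the body always has the shape shown under "Actual" (the helper name `server_timeout_seconds` is hypothetical, not part of any SDK):

```python
import re
from typing import Optional

# Sample body copied from the "Actual" response in this report.
SAMPLE_ERROR = {
    "detail": {
        "message": "An error '500: Internal RAG Error - Request timed out after 15.0 seconds' occurred during rag_app",
        "error": "500: Internal RAG Error - Request timed out after 15.0 seconds",
        "error_type": "HTTPException",
    }
}


def server_timeout_seconds(body: dict) -> Optional[float]:
    """Return the timeout (in seconds) named in the error body, or None.

    Hypothetical helper: it only inspects the "detail.error" string and
    assumes the wording "timed out after N seconds" seen in this report.
    """
    detail = body.get("detail")
    if not isinstance(detail, dict):
        return None
    match = re.search(r"timed out after (\d+(?:\.\d+)?) seconds", str(detail.get("error", "")))
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    print(server_timeout_seconds(SAMPLE_ERROR))
```

If the reported value is lower than the client's read timeout, raising the client timeout alone is unlikely to change the outcome.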
Screenshots / Logs
Environment
Not applicable
Smartphone
Not applicable
Additional context
- A new user was added to my collection, and that user's API key was used for these requests. I can't test with the older admin key because it has reached its monthly limit due to the RAG API issue over the past few days.
- The error occurs consistently even for trivial queries and empty collections.
- No evidence of rate‑limit or quota issues (429) for the new user.
- The behaviour started on 2025‑04‑21 (Asia/Kolkata, IST) and persists.
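To make the distinction between quota errors and timeouts explicit when triaging responses, here is a rough sketch. The function name and category labels are hypothetical and chosen only for illustration; the status codes are the ones mentioned in this report (429, 500, 504) with their standard HTTP meanings.

```python
def classify_failure(status_code: int, body_text: str = "") -> str:
    """Map a status code and raw body text to a coarse failure category.

    Hypothetical triage helper; the category names are illustrative only.
    """
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 429:
        # Too Many Requests: rate limit or quota exhausted.
        return "rate_limit_or_quota"
    if status_code == 504:
        # Gateway Timeout: an upstream service did not answer in time.
        return "gateway_timeout"
    if status_code == 500 and "timed out" in body_text:
        # The 500 body seen in this report names a server-side timeout.
        return "server_side_timeout"
    return "other"


if __name__ == "__main__":
    sample = "500: Internal RAG Error - Request timed out after 15.0 seconds"
    print(classify_failure(500, sample))
```

In my case every failing call falls into the timeout category, never the 429 one.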
Update:
- Tested with the collection owner's API key, and the owner encounters the same error as before. It seemed the monthly limit had been reached due to the RAG API issue over the past few days, so I purchased a starter pack to resolve this, yet the issue persists.
How are you calling this in the second response? From the API explorer, still?
Hi @NolanTrem, I appreciate your response. I am currently reproducing the issue in both my Streamlit application and the API explorer. To determine whether the issue lies in my code or with the API, I have been using the API explorer to test the same payload in both places. However, I still get a "Request Timeout" error in both environments.
import requests

# RAG Query (POST /v3/retrieval/rag)
# Note: Python literals are True/False/None, not the JSON true/false/null.
response = requests.post(
    "https://api.sciphi.ai/v3/retrieval/rag",
    headers={
        "Authorization": "Bearer pk_YmCx83kV...................3tCrhyCZjEBvmSfJqdrUioB",
        "X-Api-Key": "",
    },
    json={
        "query": "tell me about OINP",
        "search_mode": "advanced",
        "search_settings": {
            "use_hybrid_search": False,
            "use_semantic_search": True,
            "limit": 50,
            "chunk_settings": {
                "index_measure": "cosine_distance",
                "enabled": True,
            },
            "graph_settings": {
                "enabled": True,
            },
            "filters": {
                "collection_id": "f96b63d4-3f74-496b-8aaa-e2f0bab687ed",
            },
        },
        "rag_generation_config": {
            "model": "anthropic/claude-3-7-sonnet-20250219",
            "temperature": 1,
            "top_p": None,
            "max_tokens_to_sample": 16000,
            "stream": False,
            "extended_thinking": True,
            "thinking_budget": 4096,
        },
        "include_title_if_available": True,
        "include_web_search": False,
    },
)
print(response.json())
Below is a screenshot taken on April 23, 2025, at 7:00 AM GMT (12:21 PM IST):
Payload:
Response:
I have a feeling my change here fixed this. Please let me know if not!
Hi @NolanTrem,
Here’s what I’m seeing now: the payload remains the same—I only updated the prompt for testing.
As you can see, the stream value is still false, yet the response includes a blob with status code 504:
blob:https://r2r-docs.sciphi.ai/71add823-3509-41c1-82e5-c8c56a60847c
Screenshot:
API Explorer with Updated Prompt, Same Payload:
Blob:
My New Prompt:
Mantra has a Bachelors degree in Computer science engineering in India and has a pg diploma from Loyalist college Ontario. He is currently on his PGWP with an overall score of 7 bands no less than 6.5 in any section. He is currently a store manager full time for a Retail store and 4 months of work experience as Machine operation, again on his PGWP. He is also married. His wife has a post secondary (Grade 12) level of education in India and has studied a diploma in biotechnology program from Loyalist college. She is currently working as a Home Support Worker (44101) in Brockville, Ontario. She is a permanent employee but she doesn't work regularly more than 30 hrs a week so not sure if she qualifies for full time. She has almost 10 months of work experience. Now, can you tell me what are the best options for them to get PR in the most optimal and efficient manner possible? Give a clear roadmap for the same.