
Runtime error on Retrieve Predictions (Timeout)

Open jsk1107 opened this issue 1 year ago • 5 comments

Describe the bug: After connecting local storage and running a sync, or after connecting the ML backend and running Retrieve Predictions, a timeout occurs.

To Reproduce: Steps to reproduce the behavior:


  1. Go to Cloud Storage.
  2. Add local storage.
  3. Press the "Sync Storage" button.
  4. In progress -> the problem occurs after about 90 seconds (Runtime Error - Gateway Timeout) -> the backend is still working.

  1. Connect the ML backend.
  2. In the project, press the "Retrieve Predictions" button.
  3. The frontend is blocked while the ML backend is running. The problem occurs after about 90 seconds (Runtime Error - Gateway Timeout).
  4. The ML backend is still working.


Environment (please complete the following information):

  • OS: Windows 10
  • Label Studio Version: 1.13.0

Additional context: I tried modifying the nginx config values (proxy settings), but it didn't work. I think the timeout occurs on the frontend.

Please help me.

jsk1107 avatar Sep 23 '24 10:09 jsk1107

Please check this: https://labelstud.io/guide/troubleshooting#Label-Studio-default-timeout-settings-for-ML-server-requests

makseq avatar Sep 23 '24 13:09 makseq

@makseq Yes. I checked that page.

I changed ML_TIMEOUT_PREDICT from 100 to 1000, but it didn't work. Also, the ML_TIMEOUT_PREDICT value on the ML server that you mentioned seems unrelated to the timeout that occurs when syncing storage.

In my case, the same timeout occurs when exporting annotations.

I set the harakiri value (in uwsgi.ini) to 0, but the timeout still occurs.
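
For the export case specifically, one way around the 90-second gateway limit is the snapshot-based export flow, where the export is created asynchronously and downloaded once it is ready. Below is a minimal sketch assuming the /api/projects/{id}/exports endpoints from the Label Studio API; the URL, token, project ID, and polling interval are placeholders, not values from this thread:

import time
import requests

LABEL_STUDIO_URL = 'http://localhost:8080'  # placeholder
API_KEY = 'your-api-key'                    # placeholder
PROJECT_ID = 1                              # placeholder
HEADERS = {'Authorization': f'Token {API_KEY}'}

# 1) Create an export snapshot instead of calling the synchronous export endpoint.
resp = requests.post(f"{LABEL_STUDIO_URL}/api/projects/{PROJECT_ID}/exports/", headers=HEADERS)
resp.raise_for_status()
export_id = resp.json()["id"]

# 2) Poll until the snapshot is finished (status reported as "completed" or "failed").
while True:
    snapshot = requests.get(
        f"{LABEL_STUDIO_URL}/api/projects/{PROJECT_ID}/exports/{export_id}",
        headers=HEADERS,
    )
    snapshot.raise_for_status()
    status = snapshot.json().get("status")
    if status in ("completed", "failed"):
        break
    time.sleep(5)

# 3) Download the finished snapshot as JSON.
if status == "completed":
    download = requests.get(
        f"{LABEL_STUDIO_URL}/api/projects/{PROJECT_ID}/exports/{export_id}/download",
        headers=HEADERS,
        params={"exportType": "JSON"},
    )
    download.raise_for_status()
    with open("annotations.json", "wb") as f:
        f.write(download.content)

This avoids holding a single HTTP request open for the whole export, so the nginx/uwsgi 90-second limit is never hit.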

jsk1107 avatar Sep 23 '24 13:09 jsk1107

Hi. I have the same issue (90-second timeout) when exporting annotations. Any solution? Thanks.

hyp530 avatar Mar 02 '25 17:03 hyp530

Hi, same problem here. Any update?

cyril-aiherd avatar Apr 22 '25 13:04 cyril-aiherd

I’ve encountered the same issue. A single task doesn’t time out, but when there are many tasks, it does. The API mentioned in the documentation doesn’t seem to directly retrieve predictions from the ml_backend, or maybe I misunderstood it.

Currently, I’ve written a script as a workaround. Since fetching a task triggers the ml_backend, the script calls that API for every task without a prediction. I didn’t write any retry mechanism, so if some fail, you can run it again (a retry sketch follows the script below).

Hope this helps.

import math
import requests
from concurrent.futures import ThreadPoolExecutor, as_completed
from tqdm import tqdm

LABEL_STUDIO_URL = 'http://xxx.xxx.xxx.xxx:xxxx' # Url and Port
API_KEY = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' # api key
PROJECT_ID = xx # project id
WORKERS = 8
HEADERS = {'Authorization': f'Token {API_KEY}'}

def get_project_info():
    # Return (total tasks, total annotations, total predictions) for the project.
    url = f"{LABEL_STUDIO_URL}/api/tasks"
    params = {
        'project': PROJECT_ID,
        'include': 'id',
        'page_size': 1,
    }
    response = requests.get(url, headers=HEADERS, params=params)
    response.raise_for_status()
    data = response.json()
    return data["total"], data["total_annotations"], data["total_predictions"]

def get_all_task_ids(page_size=100):
    # Collect the IDs of all tasks in the project that have no prediction yet.
    url = f"{LABEL_STUDIO_URL}/api/tasks"
    project_id = PROJECT_ID
    total, _, _ = get_project_info()
    total_pages = math.ceil(total / page_size)
    print(f"Total tasks: {total}. Total pages: {total_pages}.")

    all_ids = []
    for page in tqdm(range(1, total_pages + 1)):
        params = {
            'project': project_id,
            'include': 'id,total_predictions',
            'page_size': page_size,
            'page': page
        }
        response = requests.get(url, headers=HEADERS, params=params)
        response.raise_for_status()
        tasks = response.json()["tasks"]
        if not tasks:
            break
        # Keep only tasks that don't have a prediction yet.
        all_ids.extend([task['id'] for task in tasks if task["total_predictions"] == 0])
    return all_ids

def fetch_task_detail(task_id):
    # Fetching a single task makes Label Studio request a prediction from the ML backend.
    url = f"{LABEL_STUDIO_URL}/api/tasks/{task_id}/"
    try:
        response = requests.get(url, headers=HEADERS)
        response.raise_for_status()
        return response.json()
    except requests.RequestException as e:
        print(f"Error fetching task {task_id}: {e}")
        return None

# Collect the tasks that still need predictions, then fetch each one to trigger the ML backend.
task_ids = get_all_task_ids()

total, total_annotations, total_predictions = get_project_info()
print(f"Total predictions before: {total_predictions}")

results = []
with ThreadPoolExecutor(max_workers=WORKERS) as executor:
    futures = {executor.submit(fetch_task_detail, task_id): task_id for task_id in task_ids}
    for future in tqdm(as_completed(futures), total=len(futures), desc="Fetching tasks"):
        result = future.result()
        if result:
            results.append(result)

total, total_annotations, total_predictions = get_project_info()
print(f"Total predictions after: {total_predictions}")

ClarkeAC avatar May 15 '25 09:05 ClarkeAC