Start Training has a long delay with no feedback

Open · SheldonWBM opened this issue 7 months ago · 1 comment

When we click the "Start Training" button, we have to wait roughly 7-10 minutes before training starts, and the delay grows with the number of annotations in the project. During that time we get no feedback: the backend logs nothing to show that it is working (the only sign is the waiting-for-response timer on the POST curl command), and the frontend "Start Training" button does not animate to show that it has been clicked and is waiting, the way the "Export" button does. The delay is understandable if loading the pre-training tasks/data takes a long time, but an indicator in the UI, or a log message in a label-studio-ml example, would be helpful. The terminal log is below:

Adding autoShape... 
2023-12-05 03:34:34,862 - utils.torch_utils - INFO - func: select_device - file: torch_utils.py:85
YOLOR 🚀 2023-10-2 torch 2.1.1+cu121 CUDA:0 (xxxx)

[pid: 29|app: 0|req: 4/4] 172.1.0.5 () {36 vars in 728 bytes} [Tue Dec  5 03:34:32 2023] POST /setup => generated 28 bytes in 2409 msecs (HTTP/1.1 200) 2 headers in 71 bytes (1 switches on core 0)
2023-12-05 03:41:51,680 - root - INFO - func: __init__ - file: model.py:57
The model initialized with weights: config/checkpoints/xx.pt

Then training starts. As the log above shows, clicking the "Start Training" button in the label-studio frontend at roughly 03:34 leads to the backend starting training at roughly 03:41, a gap of about 7 minutes with no log output in between.
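To make the request concrete, here is a minimal sketch of the kind of log message that would help. It uses only the Python standard library and does not rely on any label-studio-ml API. The helper name `log_progress`, the logger name `ml_backend.progress`, and the idea of wrapping the slow task-loading step in the example's model.py are all assumptions for illustration, not existing code.

```python
import logging
import threading
import time
from contextlib import contextmanager

# Hypothetical logger name; any logger the backend already configures would do.
logger = logging.getLogger("ml_backend.progress")


@contextmanager
def log_progress(step, interval=30.0, log=logger):
    """Log when a slow step starts, a heartbeat while it runs, and its duration."""
    start = time.monotonic()
    done = threading.Event()

    def heartbeat():
        # Event.wait returns False on timeout and True once the step has finished.
        while not done.wait(interval):
            log.info("%s still running (%.0f s elapsed)", step, time.monotonic() - start)

    log.info("%s started", step)
    worker = threading.Thread(target=heartbeat, daemon=True)
    worker.start()
    try:
        yield
    finally:
        # Stop the heartbeat thread even if the wrapped step raises.
        done.set()
        worker.join()
        log.info("%s finished in %.1f s", step, time.monotonic() - start)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(name)s - %(message)s")
    # time.sleep stands in for whatever work happens before training begins.
    with log_progress("Loading annotated tasks", interval=0.1):
        time.sleep(0.35)
```

Wrapping whatever runs between the POST request and the start of training in such a block would at least show in the terminal that the backend is busy, and for how long.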

SheldonWBM · Dec 05 '23 21:12