label-studio-ml-backend
Can't add a supposedly healthy ML backend to Label Studio using Docker Compose.
Hello,
I am using docker-compose to run Label Studio and my custom ML backend. The containers seem to run fine, and my backend passes the health check: if I open http://localhost:9090/health in a browser, I get status "UP". However, when I click "Validate and Save" for the model in Label Studio, I get this error:
app_1 | {"asctime": "09/Apr/2023:14:20:41 +0000", "name": "urllib3.connectionpool", "funcName": "urlopen", "lineno": 812, "levelname": "WARNING", "user_id": null, "message": "Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f6577060f40>: Failed to establish a new connection: [Errno 111] Connection refused')': /health", "request_id": null}
app_1 |
app_1 | {"asctime": "09/Apr/2023:14:20:41 +0000", "name": "urllib3.connectionpool", "funcName": "urlopen", "lineno": 812, "levelname": "WARNING", "user_id": null, "message": "Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f65770610f0>: Failed to establish a new connection: [Errno 111] Connection refused')': /health", "request_id": null}
app_1 |
app_1 | {"asctime": "09/Apr/2023:14:20:41 +0000", "name": "core.utils.common", "funcName": "custom_exception_handler", "lineno": 81, "levelname": "ERROR", "user_id": null, "message": "0a91039c-66b8-46b7-871a-d6945e773544 {'non_field_errors': [ErrorDetail(string='Can\\'t connect to ML backend http://localhost:9090, health check failed. Make sure it is up and your firewall is properly configured. <a href=\"https://labelstud.io/guide/ml.html>Learn more</a> about how to set up an ML backend. Additional info:HTTPConnectionPool(host=\\'localhost\\', port=9090): Max retries exceeded with url: /health (Caused by NewConnectionError(\\'<urllib3.connection.HTTPConnection object at 0x7f6577061420>: Failed to establish a new connection: [Errno 111] Connection refused\\'))', code='invalid')]}", "exc_info": "Traceback (most recent call last):
File \"/usr/local/lib/python3.10/dist-packages/rest_framework/views.py\", line 506, in dispatch
response = handler(request, *args, **kwargs)
File \"/usr/local/lib/python3.10/dist-packages/django/utils/decorators.py\", line 43, in _wrapper
return bound_method(*args, **kwargs)
File \"/usr/local/lib/python3.10/dist-packages/rest_framework/generics.py\", line 242, in post
return self.create(request, *args, **kwargs)
File \"/usr/local/lib/python3.10/dist-packages/rest_framework/mixins.py\", line 18, in create
serializer.is_valid(raise_exception=True)
File \"/usr/local/lib/python3.10/dist-packages/rest_framework/serializers.py\", line 235, in is_valid
raise ValidationError(self.errors)
rest_framework.exceptions.ValidationError: {'non_field_errors': [ErrorDetail(string='Can\\'t connect to ML backend http://localhost:9090, health check failed. Make sure it is up and your firewall is properly configured. <a href=\"https://labelstud.io/guide/ml.html>Learn more</a> about how to set up an ML backend. Additional info:HTTPConnectionPool(host=\\'localhost\\', port=9090): Max retries exceeded with url: /health (Caused by NewConnectionError(\\'<urllib3.connection.HTTPConnection object at 0x7f6577061420>: Failed to establish a new connection: [Errno 111] Connection refused\\'))', code='invalid')]}", "request_id": null}
app_1 |
app_1 | {"asctime": "09/Apr/2023:14:20:41 +0000", "name": "django.request", "funcName": "log_response", "lineno": 224, "levelname": "WARNING", "user_id": null, "message": "Bad Request: /api/ml/", "status_code": 400, "request": "<WSGIRequest: POST '/api/ml/'>", "request_id": null}
app_1 |
app_1 | {"asctime": "09/Apr/2023:14:20:41 +0000", "name": "django.request", "funcName": "log_response", "lineno": 224, "levelname": "WARNING", "user_id": null, "message": "Bad Request: /api/ml/", "status_code": 400, "request": "<WSGIRequest: POST '/api/ml/'>", "request_id": null}
app_1 |
nginx_1 | {"timestamp":"1681050041050","http":{"method":"POST","request_id":"ab5b5c1ae881f29c3d131e0a5b2db9f3","status_code":400,"content_type":"application/json","useragent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/111.0","referrer":"http://localhost:8080/projects/1/settings/ml","x_forwarded_for":"","url":"/api/ml","version":"HTTP/1.1","connection":"148","connection_requests":"10"},"network":{"bytes_written":1488,"bytes_read":1064,"client":{"ip":"172.18.0.1","port":50188},"destination":{"ip":"172.18.0.4","port":8085},"nginx":{"request_time":"0.031","upstream_connect_time":"0.000","upstream_response_time":"0.030","upstream_header_time":"0.030"}}}
I don't know exactly what the validation does. Maybe it tries to load my model and fails there, but the exception is not propagated? How can I debug this further?
ml backend server shows no issues:
redis | 1:C 09 Apr 2023 14:18:17.718 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis | 1:C 09 Apr 2023 14:18:17.718 # Redis version=7.0.10, bits=64, commit=00000000, modified=0, pid=1, just started
redis | 1:C 09 Apr 2023 14:18:17.718 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
redis | 1:M 09 Apr 2023 14:18:17.718 * monotonic clock: POSIX clock_gettime
redis | 1:M 09 Apr 2023 14:18:17.719 * Running mode=standalone, port=6379.
redis | 1:M 09 Apr 2023 14:18:17.719 # Server initialized
redis | 1:M 09 Apr 2023 14:18:17.719 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
redis | 1:M 09 Apr 2023 14:18:17.719 * Loading RDB produced by version 7.0.10
redis | 1:M 09 Apr 2023 14:18:17.719 * RDB age 1218 seconds
redis | 1:M 09 Apr 2023 14:18:17.719 * RDB memory usage when created 0.85 Mb
redis | 1:M 09 Apr 2023 14:18:17.719 * Done loading RDB, keys loaded: 0, keys expired: 0.
redis | 1:M 09 Apr 2023 14:18:17.719 * DB loaded from disk: 0.000 seconds
redis | 1:M 09 Apr 2023 14:18:17.719 * Ready to accept connections
I also tested this with one of the examples, "dummy_model", and the behaviour is the same.
I ran curl -X GET http://localhost:9090/health from inside the Label Studio container and got "curl: (7) Failed to connect to localhost port 9090 after 0 ms: Connection refused".
I fixed it by declaring an external network in the docker-compose.yml of dummy_model. That network is the Label Studio network, so I can now reach the container by its direct IP instead of localhost. I don't think this is the correct solution.
What is interesting is that from inside the Label Studio container I can now run curl -X GET http://localhost:9090/health, but this address is still unreachable when entered in the web interface for adding a model.
There is an easier fix: you can specify the name of the container instead of its IP. That way the model can connect to Label Studio. In principle, you need to give the runtime information about the internal network.
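To see why the container name works where localhost does not: inside a container, localhost refers to that container itself, so a request from the Label Studio container to localhost:9090 never leaves it. Here is a minimal Python sketch for checking this from inside the Label Studio container, using only the standard library. The host name `ml-backend` in the usage note is a hypothetical service name, and `health_url` and `is_reachable` are illustrative helpers, not part of Label Studio.

```python
import urllib.request


def health_url(host: str, port: int = 9090) -> str:
    """Build the /health URL for a backend listening on host:port."""
    return f"http://{host}:{port}/health"


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET on url answers with HTTP 200, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # "Connection refused", DNS failures, timeouts and HTTP error
        # statuses are all raised as subclasses of OSError by urllib.
        return False
```

Run inside the Label Studio container, `is_reachable(health_url("localhost"))` would be expected to return False in the setup described above, while `is_reachable(health_url("ml-backend"))` should return True once both containers share a network and `ml-backend` is replaced with the real container or service name. The same name-based URL is then what goes into the model URL field.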
I've celebrated prematurely. Now I'm getting "'52f7763bcf2d/data/upload/4/a23ef83a-2023-08-01T06-40-50.png': No scheme supplied. Perhaps you meant https://52f7763bcf2d/data/upload/4/a23ef83a-2023-08-01T06-40-50.png?"
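The wording of this message matches what the Python `requests` library reports for a URL with no `http://` or `https://` prefix, which suggests that a host value somewhere is being set to the bare container ID. Where that value is configured is not shown in this thread. As a rough illustration of the difference only, here is a hypothetical helper (`ensure_scheme` is not a Label Studio function) that adds a scheme when one is missing:

```python
def ensure_scheme(url: str, default_scheme: str = "http") -> str:
    """Return url unchanged if it already has a scheme, else prepend one."""
    if "://" in url:
        return url
    # Assumption: plain HTTP is acceptable between containers on the same network.
    return f"{default_scheme}://{url.lstrip('/')}"
```

For example, `ensure_scheme("52f7763bcf2d/data/upload/4/a23ef83a-2023-08-01T06-40-50.png")` yields a URL starting with `http://52f7763bcf2d/`. In practice the cleaner fix is probably to put the full URL, scheme included, in whatever setting supplies the host name, rather than patching URLs afterwards.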
Leaving a comment to add that I'm having the same issue.