Local File Serving Not Working in Docker Container Despite Correct Environment Variables
Describe the bug
I'm running Label Studio inside a Docker container using docker-compose. I've set up environment variables to access data from local files (linked to a volume). The files exist when checking within the container, but I cannot access them through URLs from the browser or within the container.
To Reproduce
Steps to reproduce the behavior:
- Create a docker-compose.yml with the following Label Studio service configuration:
  labelstudio:
    image: heartexlabs/label-studio:latest
    ports:
      - "4999:8000"
    depends_on:
      - database
    environment:
      - LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
      - LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/label-studio/data
      - LABEL_STUDIO_BASE_DATA_DIR=/label-studio/data/
      - LABEL_STUDIO_CORS_ORIGIN=*
      - LOG_LEVEL=DEBUG
    volumes:
      - label_studio_mydata:/label-studio/data:rw
      - documents_dataset:/label-studio/data/raw_datasets/documents_dataset:rw
    command: label-studio-uwsgi
- Start the Label Studio container.
- Inside the container, create a test file:
  echo "this is fake image" > /label-studio/data/raw_datasets/documents_dataset/document1/document1_Page_01.jpg
- Attempt to access the file via browser:
  http://localhost:4999/data/local-files/?d=raw_datasets/documents_dataset/document1/document1_Page_01.jpg
- Attempt to access the file from within the container:
  curl -v 'http://localhost:8000/data/local-files/?d=raw_datasets/documents_dataset/document1/document1_Page_01.jpg'
Expected behavior
The file should be accessible via the provided URLs.
Actual behavior
- Browser access fails.
- The curl command from within the container fails to retrieve the file.
Environment information:
Label Studio version: 1.12.1
Additional context
- Environment variables are correctly set within the container:
  LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
  LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/label-studio/data
  LABEL_STUDIO_BASE_DATA_DIR=/label-studio/data/
- The files are present and accessible within the container when checked directly.
- Unable to find the Label Studio configuration file (/label-studio/data/label_studio_config.json) within the container.
- Unable to locate or access Label Studio log files within the container.
Attempted troubleshooting:
- Verified file permissions within the container
- Checked environment variables
- Attempted to access a simple text file using curl within the container (failed)
Any assistance in resolving this issue would be greatly appreciated.
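One thing worth ruling out first: the curl command above sends no credentials, while the view quoted later in this thread only serves a file when request.user.is_authenticated is true. The sketch below rebuilds the same URL and reports the HTTP status code, with and without a token. The helper names build_local_file_url and fetch_status are made up for illustration, and the "Authorization: Token ..." header is the one Label Studio's REST API uses; whether this particular endpoint accepts it is an assumption this thread does not confirm.

```python
import urllib.error
import urllib.request
from typing import Optional
from urllib.parse import quote


def build_local_file_url(base_url: str, relative_path: str) -> str:
    """Build the /data/local-files/ URL for a path relative to the document root."""
    # Keep '/' unescaped so the URL has the same shape as the ones in this report.
    encoded = quote(relative_path, safe="/")
    return f"{base_url.rstrip('/')}/data/local-files/?d={encoded}"


def fetch_status(url: str, token: Optional[str] = None) -> int:
    """Return the HTTP status code for url, optionally sending an API token."""
    request = urllib.request.Request(url)
    if token:
        # Assumption: the endpoint honours the REST API token header.
        request.add_header("Authorization", f"Token {token}")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Error statuses such as 403 or 404 are raised as HTTPError.
        return err.code
```

Calling fetch_status(build_local_file_url("http://localhost:8000", "raw_datasets/documents_dataset/document1/document1_Page_01.jpg")) from inside the container, once without a token and once with one, should show whether authentication changes the result. Note that urlopen follows redirects, so a redirect to a login page may surface as 200 rather than as an error.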
Issue summary and temporary workaround (root cause not yet understood)
Update:
The issue originates from the function localfiles_data in label_studio\core\views.py.
Original Code:
def localfiles_data(request):
    """Serving files for LocalFilesImportStorage"""
    user = request.user
    path = request.GET.get('d')
    if settings.LOCAL_FILES_SERVING_ENABLED is False:
        return HttpResponseForbidden(
            "Serving local files can be dangerous, so it's disabled by default. "
            'You can enable it with LOCAL_FILES_SERVING_ENABLED environment variable, '
            'please check docs: https://labelstud.io/guide/storage.html#Local-storage'
        )
    local_serving_document_root = settings.LOCAL_FILES_DOCUMENT_ROOT
    if path and request.user.is_authenticated:
        path = posixpath.normpath(path).lstrip('/')
        full_path = Path(safe_join(local_serving_document_root, path))
        user_has_permissions = False
        # Try to find Local File Storage connection based prefix:
        # storage.path=/home/user, full_path=/home/user/a/b/c/1.jpg =>
        # full_path.startswith(path) => True
        localfiles_storage = LocalFilesImportStorage.objects.annotate(
            _full_path=Value(os.path.dirname(full_path), output_field=CharField())
        ).filter(_full_path__startswith=F('path'))
        if localfiles_storage.exists():
            user_has_permissions = any(storage.project.has_permission(user) for storage in localfiles_storage)
        if user_has_permissions and os.path.exists(full_path):
            content_type, encoding = mimetypes.guess_type(str(full_path))
            content_type = content_type or 'application/octet-stream'
            return RangedFileResponse(request, open(full_path, mode='rb'), content_type)
        else:
            return HttpResponseNotFound()
    return HttpResponseForbidden()
Problem:
The localfiles_storage.exists() method returns False on the remote host, which causes user_has_permissions to remain False. Consequently, the function returns a 404 response. This behavior differs from the local environment where localfiles_storage.exists() returns True.
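To make the failing condition concrete, here is a minimal pure-Python sketch of the prefix test that the filter `_full_path__startswith=F('path')` performs. It is a simplification, not the view itself: posixpath.join stands in for Django's safe_join, the function names resolve_full_path and storage_matches are hypothetical, and the storage paths passed in are example values rather than anything read from a real database.

```python
import posixpath


def resolve_full_path(document_root: str, d_param: str) -> str:
    """Turn the ?d= query parameter into an absolute path, as the view does."""
    relative = posixpath.normpath(d_param).lstrip("/")
    # Simplification: the real view uses Django's safe_join here.
    return posixpath.join(document_root, relative)


def storage_matches(storage_path: str, full_path: str) -> bool:
    """Plain string-prefix test on the file's directory, like the ORM filter."""
    return posixpath.dirname(full_path).startswith(storage_path)


full_path = resolve_full_path(
    "/label-studio/data",
    "raw_datasets/documents_dataset/document1/document1_Page_01.jpg",
)

# A storage registered with the in-container path is a prefix of the directory.
in_container = storage_matches(
    "/label-studio/data/raw_datasets/documents_dataset", full_path
)

# A storage registered with some other path (hypothetical example) is not.
other_path = storage_matches("/home/user/documents_dataset", full_path)

print(full_path, in_container, other_path)
```

If no connected Local Files storage has a path that is a string prefix of the requested file's directory, the query set is empty, exists() returns False, and the view falls through to the 404 branch.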
Resolution:
As a temporary solution, I set the default value of user_has_permissions to True. This allows the function to check for the file and send it back correctly.
Updated Code:
def localfiles_data(request):
    """Serving files for LocalFilesImportStorage"""
    user = request.user
    path = request.GET.get('d')
    if settings.LOCAL_FILES_SERVING_ENABLED is False:
        return HttpResponseForbidden(
            "Serving local files can be dangerous, so it's disabled by default. "
            'You can enable it with LOCAL_FILES_SERVING_ENABLED environment variable, '
            'please check docs: https://labelstud.io/guide/storage.html#Local-storage'
        )
    local_serving_document_root = settings.LOCAL_FILES_DOCUMENT_ROOT
    if path and request.user.is_authenticated:
        path = posixpath.normpath(path).lstrip('/')
        full_path = Path(safe_join(local_serving_document_root, path))
        user_has_permissions = True  # Temporary solution
        # Try to find Local File Storage connection based prefix:
        localfiles_storage = LocalFilesImportStorage.objects.annotate(
            _full_path=Value(os.path.dirname(full_path), output_field=CharField())
        ).filter(_full_path__startswith=F('path'))
        if localfiles_storage.exists():
            user_has_permissions = any(storage.project.has_permission(user) for storage in localfiles_storage)
        if user_has_permissions and os.path.exists(full_path):
            content_type, encoding = mimetypes.guess_type(str(full_path))
            content_type = content_type or 'application/octet-stream'
            return RangedFileResponse(request, open(full_path, mode='rb'), content_type)
        else:
            return HttpResponseNotFound()
    return HttpResponseForbidden()
Notes:
- This change ensures that user_has_permissions defaults to True, allowing the function to proceed with file serving if the file exists.
- This is a temporary fix. The underlying issue causing localfiles_storage.exists() to return False on the remote host should be further investigated. Potential causes might include differences in database initialization, file paths, or environment configurations between local and remote environments.
Hi @bsc001 - did you connect an import storage of the Local Files type to a project you're working on, as shown in the screenshot below? That connection is what will make localfiles_storage.exists() return True; really it's just looking for a LocalFilesImportStorage that your user has access to, and with the appropriate path field on the local storage object.
Yes, I attached a folder there, but why is it not returning the list of files there?