
Can't upload the dataset. Timeout (No response from the server.) Error

Open mrEuler opened this issue 10 months ago • 2 comments

Search before asking

  • [X] I have searched the HUB issues and found no similar bug report.

HUB Component

Datasets

Bug

I am trying (for the fourth time already) to upload a large dataset (94.5 GB) to the hub, but I get the following behaviour:

  1. In the tab where the upload was started, the dataset shows this status indefinitely: [image]

  2. Opening the datasets page in a new tab shows the dataset with this status: [image]

The dataset was checked with Ultralytics' Python tool. I have a Pro subscription.

Environment

  • MacBook Pro M3 Pro
  • Chrome Version 131.0.6778.265 (Official Build) (64-bit)

Minimal Reproducible Example

Just the regular dataset uploading steps.

Additional

No response

mrEuler avatar Jan 13 '25 18:01 mrEuler

👋 Hello @mrEuler, thank you for raising an issue about the Ultralytics HUB 🚀! We're sorry for the upload difficulties you're experiencing and appreciate you bringing this to our attention.

To help us investigate further and resolve the issue, could you please provide additional information or validate some details? Since this involves a potential 🐛 bug, it would be great if you could supply a Minimum Reproducible Example (MRE) following these steps from our MRE Guide:

  1. List the exact steps you followed to upload the dataset.
  2. Include any additional logs or terminal/console outputs if available.
  3. Share any relevant dataset details or specifics that might help reproduce this issue (e.g., size, number of files, format verification via our tools, etc.).

While waiting for additional details, you may find our HUB documentation helpful for reviewing your steps and ensuring everything is set up correctly.

Additionally, if this is related to large dataset handling or a subscription-specific tier, our engineering team will investigate back-end functionalities to ensure everything works smoothly.

Please note this is an automated response 🤖, and an Ultralytics engineer will review your issue and assist you as soon as possible. Thank you for your patience and for being part of the Ultralytics community! 😊

UltralyticsAssistant avatar Jan 13 '25 18:01 UltralyticsAssistant

@mrEuler thank you for reporting this issue and providing detailed information about your environment and workflow! Let's address the timeout error you're experiencing.

Possible Causes and Steps to Resolve:

  1. Large Dataset Size (94.5 GB):

    • Uploading a dataset of this size can sometimes trigger timeouts depending on network stability or server load.
    • Recommendation: Consider splitting your dataset into smaller chunks (e.g., 10–20 GB each) and uploading them separately. Ensure that the dataset structure and YAML file remain consistent with YOLO format requirements.
  2. Network Stability:

    • If your connection is unstable, large uploads may fail intermittently.
    • Recommendation: Test your connection stability and speed. If possible, use a wired internet connection to avoid potential interruptions during the upload.
  3. Server-Side Timeout:

    • While the HUB is designed to handle large datasets, server-side timeouts can occur for prolonged uploads.
    • Recommendation: Try uploading the dataset during off-peak hours when the server might experience less traffic.
  4. Dataset Validation:

    • Since you've already validated your dataset using the Python tool (check_dataset), that’s great! However, ensure the dataset ZIP file is properly structured and error-free.
    • Double-Check: Confirm that your dataset YAML, directory, and ZIP file all have the same name (e.g., dataset.yaml, dataset/, dataset.zip).
  5. Browser-Specific Issue:

    • While Chrome is supported, browser extensions or outdated versions might interfere with the upload process.
    • Recommendation: Disable unnecessary browser extensions and ensure Chrome is up to date. Alternatively, try a different browser (e.g., Firefox).

Next Steps:

  • Retry Smaller Uploads: Start with smaller portions of your dataset and verify if the issue persists. This can help isolate whether the size is the primary cause.
  • Monitor the Console: Open your browser's developer tools (F12 > Console tab) to check for any error logs during the upload process. If available, please share the error details here for further troubleshooting.
  • Contact Pro Support: As a Pro subscriber, you are entitled to priority support. If the issue persists, please reach out via the Pro Support page for further assistance.

General Tips for Large Dataset Uploads:

  • Ensure the dataset is zipped optimally with no unnecessary files.
  • Avoid performing other bandwidth-intensive tasks during uploads.
  • Use the Ultralytics HUB SDK for programmatic uploads, which might provide better stability for large files:
    from hub_sdk import HUBClient
    
    credentials = {"api_key": "<YOUR-API-KEY>"}
    client = HUBClient(credentials)
    
    dataset = client.dataset("<Dataset ID>")  # Replace with your actual dataset ID
    dataset.upload_dataset(file="path/to/your/dataset.zip")
    print("Dataset has been uploaded.")
    
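If a single long-running upload keeps timing out, wrapping the SDK call in a simple retry loop may help. This is only a sketch: the back-off values are arbitrary, and catching a broad `Exception` is an assumption since the exact exception type `hub_sdk` raises on a timeout is not documented in this thread:

```python
import time

def upload_with_retries(dataset, zip_path, attempts=3, backoff_s=30.0):
    """Retry dataset.upload_dataset a few times, waiting between attempts."""
    for attempt in range(1, attempts + 1):
        try:
            dataset.upload_dataset(file=zip_path)
            return True
        except Exception as exc:  # exact hub_sdk timeout exception type is an assumption
            print(f"Attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                time.sleep(backoff_s * attempt)  # linear back-off before retrying
    return False
```

For example, `upload_with_retries(dataset, "path/to/your/dataset.zip")` would make up to three attempts before giving up.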

If you continue to experience the timeout issue after trying these steps, let us know! We’re here to help ensure your datasets are uploaded successfully 😊.

pderrenger avatar Jan 14 '25 03:01 pderrenger