Ultralytics Cloud Issue

Open webdevmatt07 opened this issue 9 months ago • 4 comments

Search before asking

Question

I'm having issues with the Ultralytics Cloud. When training my model I get disconnected often. Sometimes I get billed when it disconnects, sometimes I don't. I would really like to have it consistently train and I'm wondering if I'm doing something wrong.

Here are a few screenshots of what I'm seeing:

Image

Image

Image

On other occasions I'll get partway through training on one server and it will disconnect many times. I'll then try a new server; sometimes the training completes and sometimes it doesn't.

Thank you for your help.

Additional

No response

webdevmatt07 avatar Feb 28 '25 14:02 webdevmatt07

👋 Hello @webdevmatt07, thank you for reporting this issue with Ultralytics HUB 🚀! We're sorry to hear about the disconnection troubles you're experiencing while training your models on the Ultralytics Cloud. Please take a look at our HUB Docs, which can offer helpful information:

  • Quickstart. Get started training and deploying YOLO models with HUB effortlessly.
  • Models: Training and Exporting. Explore detailed steps for training YOLOv5 and YOLOv8 models on custom datasets and exporting them for deployment.
  • Projects: Creating and Managing. Organize your models into projects for better team collaboration.
  • Inference API. Discover how to use the Inference API to generate predictions in the cloud.

If this is a 🐛 Bug Report, could you please provide the following additional details to help us investigate the issue more effectively?

  1. A detailed description of the steps leading up to the disconnection.
  2. Screenshots of any relevant error messages, logs, or events during the disconnections.
  3. A minimum reproducible example (MRE) if possible. This would allow us to replicate the issue and work on a quicker resolution.

If this is a ❓ Question about configuration or usage, please provide details about your model, dataset, training settings, and any customizations you might have applied.

An Ultralytics engineer will review this and assist you further soon. We try to address all issues as promptly as possible. Thank you for your patience and understanding! 😊

UltralyticsAssistant avatar Feb 28 '25 14:02 UltralyticsAssistant

I have done the "Bring your own" training on my Linux box with success. I wanted to use the Ultralytics Cloud for convenience and speed.

I haven't seen any errors in the UI. Is there a place to see a log that I could share with you?

webdevmatt07 avatar Feb 28 '25 15:02 webdevmatt07

Thank you for sharing these details about your Ultralytics Cloud Training experience. Let's address this systematically:

  1. Disconnections: Our Cloud Training docs note that training sessions may stop if your account balance drops below $5 during epoch-based training. However, you should always be able to resume from the last checkpoint. The screenshots suggest potential connectivity issues - could you confirm if these disconnections occur at consistent intervals or after specific epochs?

  2. Billing: Charges should only apply for completed epochs (epoch-based training) or elapsed time (timed training). If you're seeing inconsistent deductions, please check your Billing tab for detailed cost breakdowns per session.

  3. Logs: While full training logs aren't currently exposed in the UI, our team can investigate server-side logs. Please email support @webdevmatt07.com with:

    • Exact timestamps of failed sessions
    • Training instance regions used
    • Any patterns you noticed (e.g., specific epoch numbers)
  4. Stability Tip: We recommend using the same instance region for resume operations when possible, as cross-region checkpoints can occasionally introduce latency issues.

The team will prioritize investigating your case - your detailed report with timestamps will help us identify if this is an isolated infrastructure issue or something more systemic. We appreciate your patience as we work to improve Cloud Training reliability. 🚀
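
For the question above about whether the disconnections occur at consistent intervals, a minimal Python sketch like the following could help summarize the timestamps before sending them to support. It uses only the standard library and does not call any Ultralytics API. The function names, the 20% tolerance, and the sample timestamps are all hypothetical, chosen for illustration.

```python
from datetime import datetime
from statistics import mean, pstdev


def disconnect_intervals(timestamps):
    """Return the gaps, in minutes, between consecutive disconnect times."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


def looks_periodic(intervals, tolerance=0.2):
    """True if the gaps vary by less than `tolerance` (relative std. deviation)."""
    if len(intervals) < 2:
        return False  # too few gaps to judge a pattern
    avg = mean(intervals)
    return avg > 0 and pstdev(intervals) / avg < tolerance


# Hypothetical timestamps noted by hand, for illustration only
observed = [
    "2025-02-27T10:00:00",
    "2025-02-27T10:31:00",
    "2025-02-27T11:00:00",
    "2025-02-27T11:30:00",
]
gaps = disconnect_intervals(observed)
print(gaps)                  # gaps between disconnects, in minutes
print(looks_periodic(gaps))  # whether the gaps are roughly regular
```

Roughly regular gaps would hint at a timeout or scheduled interruption, while irregular gaps would point more toward general connectivity problems. Either way, this is only a rough heuristic to accompany the raw timestamps.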

pderrenger avatar Feb 28 '25 21:02 pderrenger

Hi @webdevmatt07,

Thank you for reaching out and sharing your experience. I understand how frustrating these disconnections can be.

We are actively working on improving cloud training stability, and your feedback is invaluable. If you could share a model ID for one of your affected training sessions, I can take a closer look and escalate this issue to the development team for further investigation.

Let me know how you’d like to proceed, and I’ll be happy to assist!

yogendrasinghx avatar Mar 18 '25 09:03 yogendrasinghx

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

  • Docs: https://docs.ultralytics.com
  • HUB: https://hub.ultralytics.com
  • Community: https://community.ultralytics.com

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

github-actions[bot] avatar Jun 07 '25 00:06 github-actions[bot]