Problem resuming training in Google Colab
Search before asking
- [X] I have searched the HUB issues and found no similar bug report.
HUB Component
Training
Bug
I am training a model in Google Colab (it is not the first model I have trained this way), and when I try to resume training by running these commands:
%pip install ultralytics # install
from ultralytics import YOLO, checks, hub
checks() # checks
hub.login('my_API_KEY')
model = YOLO('my_MODEL_ID')
results = model.train()
the following error message appears:
requirements: Ultralytics requirement ['hub-sdk>=0.0.6'] not found, attempting AutoUpdate...
Collecting hub-sdk>=0.0.6
Downloading hub_sdk-0.0.8-py3-none-any.whl (40 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.9/40.9 kB 2.4 MB/s eta 0:00:00
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from hub-sdk>=0.0.6) (2.31.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->hub-sdk>=0.0.6) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->hub-sdk>=0.0.6) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->hub-sdk>=0.0.6) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->hub-sdk>=0.0.6) (2024.2.2)
Installing collected packages: hub-sdk
Successfully installed hub-sdk-0.0.8
requirements: AutoUpdate success ✅ 6.0s, installed 1 package: ['hub-sdk>=0.0.6']
requirements: ⚠️ Restart runtime or rerun command for updates to take effect
Ultralytics HUB: New authentication successful ✅
Ultralytics HUB: View model at https://hub.ultralytics.com/models/6SUZnsAo0z0y6gld7lpp 🚀
Downloading https://storage.googleapis.com/ultralytics-hub.appspot.com/users/gR39oPibZKaU7n6mUI0WE1H1CQH2/models/6SUZnsAo0z0y6gld7lpp/epoch-32.pt to 'epoch-32.pt'...
⚠️ Download failure, retrying 1/3 https://storage.googleapis.com/ultralytics-hub.appspot.com/users/gR39oPibZKaU7n6mUI0WE1H1CQH2/models/6SUZnsAo0z0y6gld7lpp/epoch-32.pt?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=firebase-adminsdk-jsjt9%40ultralytics-hub.iam.gserviceaccount.com%2F20240430%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20240430T070930Z&X-Goog-Expires=900&X-Goog-SignedHeaders=host&X-Goog-Signature=5bc8969abb8a1e1ee6b7518609a6a883f6276e7f9ee851ce01edf394f85b58a1e595a775690399a9f569b14cbfe7b3fd299049484b41a34e9cdc8002ed711d399bd0d2b61c01776b258a87ba3bf78b786a522e601f1413e508d8e3d61c6f0d89e76fe6cdc64a0b8e726cf24b0c701c9a6a679cce954bd385cd4714d92ba336c9bb6faea48f3bcb3eecfecdaa7e1fb7b4316bc34d042a31c79f79c4ea764d54e3632132246cbe6e9f37d494f87f9361d0251673517fbd03a6522650f9c3cfedfaf96526ef8f4a64a1da97e8d7904493489c484e339b72390012ad33d5faf7c172e810364057072fd535abb3f4368e96b0a8aa132eb2402b5b93959369719b5e59...
---------------------------------------------------------------------------
UnpicklingError Traceback (most recent call last)
[<ipython-input-2-600de03de6f2>](https://localhost:8080/#) in <cell line: 3>()
1 hub.login('67f19bbd86bcc04db7747d501c4e11246ac092e81a')
2
----> 3 model = YOLO('https://hub.ultralytics.com/models/6SUZnsAo0z0y6gld7lpp')
4 results = model.train()
6 frames
[/usr/local/lib/python3.10/dist-packages/ultralytics/models/yolo/model.py](https://localhost:8080/#) in __init__(self, model, task, verbose)
21 else:
22 # Continue with default YOLO initialization
---> 23 super().__init__(model=model, task=task, verbose=verbose)
24
25 @property
[/usr/local/lib/python3.10/dist-packages/ultralytics/engine/model.py](https://localhost:8080/#) in __init__(self, model, task, verbose)
149 self._new(model, task=task, verbose=verbose)
150 else:
--> 151 self._load(model, task=task)
152
153 def __call__(
[/usr/local/lib/python3.10/dist-packages/ultralytics/engine/model.py](https://localhost:8080/#) in _load(self, weights, task)
238
239 if Path(weights).suffix == ".pt":
--> 240 self.model, self.ckpt = attempt_load_one_weight(weights)
241 self.task = self.model.args["task"]
242 self.overrides = self.model.args = self._reset_ckpt_args(self.model.args)
[/usr/local/lib/python3.10/dist-packages/ultralytics/nn/tasks.py](https://localhost:8080/#) in attempt_load_one_weight(weight, device, inplace, fuse)
804 def attempt_load_one_weight(weight, device=None, inplace=True, fuse=False):
805 """Loads a single model weights."""
--> 806 ckpt, weight = torch_safe_load(weight) # load ckpt
807 args = {**DEFAULT_CFG_DICT, **(ckpt.get("train_args", {}))} # combine model and default args, preferring model args
808 model = (ckpt.get("ema") or ckpt["model"]).to(device).float() # FP32 model
[/usr/local/lib/python3.10/dist-packages/ultralytics/nn/tasks.py](https://localhost:8080/#) in torch_safe_load(weight)
730 }
731 ): # for legacy 8.0 Classify and Pose models
--> 732 ckpt = torch.load(file, map_location="cpu")
733
734 except ModuleNotFoundError as e: # e.name is missing module name
[/usr/local/lib/python3.10/dist-packages/torch/serialization.py](https://localhost:8080/#) in load(f, map_location, pickle_module, weights_only, mmap, **pickle_load_args)
1038 except RuntimeError as e:
1039 raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
-> 1040 return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
1041
1042
[/usr/local/lib/python3.10/dist-packages/torch/serialization.py](https://localhost:8080/#) in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
1256 "functionality.")
1257
-> 1258 magic_number = pickle_module.load(f, **pickle_load_args)
1259 if magic_number != MAGIC_NUMBER:
1260 raise RuntimeError("Invalid magic number; corrupt file?")
UnpicklingError: invalid load key, '<'.
Environment
Google Colab
Minimal Reproducible Example
- Log in to HUB
- Find the model to train
- Click to copy the Colab code
- Follow the steps in the Google Colab notebook
- The error appears
Additional
No response
👋 Hello @sebasmej, thank you for raising an issue about Ultralytics HUB 🚀! Please visit our HUB Docs to learn more:
- Quickstart. Start training and deploying YOLO models with HUB in seconds.
- Datasets: Preparing and Uploading. Learn how to prepare and upload your datasets to HUB in YOLO format.
- Projects: Creating and Managing. Group your models into projects for improved organization.
- Models: Training and Exporting. Train YOLOv5 and YOLOv8 models on your custom datasets and export them to various formats for deployment.
- Integrations. Explore different integration options for your trained models, such as TensorFlow, ONNX, OpenVINO, CoreML, and PaddlePaddle.
- Ultralytics HUB App. Learn about the Ultralytics App for iOS and Android, which allows you to run models directly on your mobile device.
- Inference API. Understand how to use the Inference API for running your trained models in the cloud to generate predictions.
If this is a 🐛 Bug Report, please provide screenshots and steps to reproduce your problem to help us get started working on a fix.
If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response.
We try to respond to all issues as promptly as possible. Thank you for your patience!
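Hello! It looks like the download of your model weights from the server failed (note the "Download failure, retrying 1/3" warning in your log), which left a corrupted file on disk. The final error, "invalid load key, '<'", suggests that the saved file starts with "<", which is typical of an HTML or XML error response rather than a PyTorch checkpoint. This can occasionally happen because of network connectivity problems or server-side issues.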
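As a rough illustration of why a bad download can surface as this particular error, the sketch below feeds markup bytes to Python's standard `pickle` module, which is what the legacy PyTorch loader in the traceback ends up calling. The payload and the helper name `unpickling_message` are made up for the example; what the server actually returned in this case is not known.

```python
import pickle


def unpickling_message(payload: bytes) -> str:
    """Return the UnpicklingError message raised for `payload`, or a note if it loads."""
    try:
        pickle.loads(payload)
    except pickle.UnpicklingError as exc:
        return str(exc)
    return "loaded without error"


# Hypothetical stand-in for an error page saved to disk under a .pt filename.
fake_error_page = b"<html><body>request failed</body></html>"

# The message names the first byte of the payload as the invalid load key.
print(unpickling_message(fake_error_page))
```

The first byte of the file is what the unpickler complains about, so a "<" in the message points to markup on disk where a checkpoint was expected.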
Here's a quick checklist to try and resolve this problem:
- Rerun the Training Cell: Simply rerunning the command can resolve the issue if it was caused by a temporary connectivity problem.
- Check Internet Connection: Ensure your Colab notebook has a stable internet connection. Switching to a different network can sometimes help.
- Clear Colab Environment: Restart your Colab runtime and clear any cached data. Also delete any corrupted weight files that were already downloaded (for example, the epoch-32.pt file from your log) so that they are not reused.
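The last step of the checklist could be sketched as follows. This is a minimal, conservative check and not part of the Ultralytics API: the helper names `looks_like_markup` and `remove_if_corrupted` are hypothetical, and the heuristic (an empty file, or one whose first non-whitespace byte is `<`) is an assumption based on the error message in this issue. It will not catch every kind of corruption.

```python
from pathlib import Path


def looks_like_markup(path) -> bool:
    """Heuristic: True if the file is empty or starts with '<' (HTML/XML-like)."""
    with open(path, "rb") as f:
        head = f.read(64).lstrip()
    return head == b"" or head.startswith(b"<")


def remove_if_corrupted(path) -> bool:
    """Delete `path` if it looks like a failed download; return True if deleted."""
    p = Path(path)
    if p.is_file() and looks_like_markup(p):
        p.unlink()
        return True
    return False


if __name__ == "__main__":
    # 'epoch-32.pt' is the filename shown in the log above; adjust as needed.
    if remove_if_corrupted("epoch-32.pt"):
        print("Removed corrupted download; rerun the training cell.")
    else:
        print("No corrupted download found by this check.")
```

After removing the file, rerunning the cell should trigger a fresh download rather than reusing the bad file.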
If the issue persists after these steps, please open a new issue with the full error output from the rerun so we can investigate further. Some issues are tied to transient server or network conditions, and fresh context helps us tell whether there is a new problem.
Thank you for reaching out! Your contributions help the community and the development of our platform. 🚀
Closing this issue as it is duplicated by #674.