Training works on CPU but not GPU (Windows)
Search before asking
- [X] I have searched the YOLOv8 issues and found no similar bug report.
YOLOv8 Component
Training
Bug
When training on GPU, the per-epoch training output never appears and the notebook cell keeps running indefinitely. On CPU, training works fine.
The output of model.train:
Ultralytics YOLOv8.0.25 Python-3.9.16 torch-1.13.1+cu116 CUDA:0 (NVIDIA GeForce RTX 3060 Ti, 8192MiB)
yolo\engine\trainer: task=detect, mode=train, model=yolov8n.yaml, data=datasets\football-players-detection-4\data.yaml, epochs=2, patience=50, batch=16, imgsz=640, save=True, cache=False, device=0, workers=8, project=None, name=None, exist_ok=False, pretrained=False, optimizer=SGD, verbose=True, seed=0, deterministic=True, single_cls=False, image_weights=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, overlap_mask=True, mask_ratio=4, dropout=False, val=True, save_json=False, save_hybrid=False, conf=0.001, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=ultralytics/assets/, show=False, save_txt=False, save_conf=False, save_crop=False, hide_labels=False, hide_conf=False, vid_stride=1, line_thickness=3, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, boxes=True, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=17, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.001, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, fl_gamma=0.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0, cfg=None, v5loader=False, save_dir=runs\detect\train15
Overriding model.yaml nc=80 with nc=4
             from  n   params  module                                arguments
  0            -1  1      464  ultralytics.nn.modules.Conv           [3, 16, 3, 2]
  1            -1  1     4672  ultralytics.nn.modules.Conv           [16, 32, 3, 2]
  2            -1  1     7360  ultralytics.nn.modules.C2f            [32, 32, 1, True]
  3            -1  1    18560  ultralytics.nn.modules.Conv           [32, 64, 3, 2]
  4            -1  2    49664  ultralytics.nn.modules.C2f            [64, 64, 2, True]
  5            -1  1    73984  ultralytics.nn.modules.Conv           [64, 128, 3, 2]
  6            -1  2   197632  ultralytics.nn.modules.C2f            [128, 128, 2, True]
  7            -1  1   295424  ultralytics.nn.modules.Conv           [128, 256, 3, 2]
  8            -1  1   460288  ultralytics.nn.modules.C2f            [256, 256, 1, True]
  9            -1  1   164608  ultralytics.nn.modules.SPPF           [256, 256, 5]
 10            -1  1        0  torch.nn.modules.upsampling.Upsample  [None, 2, 'nearest']
 11       [-1, 6]  1        0  ultralytics.nn.modules.Concat         [1]
 12            -1  1   148224  ultralytics.nn.modules.C2f            [384, 128, 1]
 13            -1  1        0  torch.nn.modules.upsampling.Upsample  [None, 2, 'nearest']
 14       [-1, 4]  1        0  ultralytics.nn.modules.Concat         [1]
 15            -1  1    37248  ultralytics.nn.modules.C2f            [192, 64, 1]
 16            -1  1    36992  ultralytics.nn.modules.Conv           [64, 64, 3, 2]
 17      [-1, 12]  1        0  ultralytics.nn.modules.Concat         [1]
 18            -1  1   123648  ultralytics.nn.modules.C2f            [192, 128, 1]
 19            -1  1   147712  ultralytics.nn.modules.Conv           [128, 128, 3, 2]
 20       [-1, 9]  1        0  ultralytics.nn.modules.Concat         [1]
 21            -1  1   493056  ultralytics.nn.modules.C2f            [384, 256, 1]
 22  [15, 18, 21]  1   752092  ultralytics.nn.modules.Detect         [4, [64, 128, 256]]
Model summary: 225 layers, 3011628 parameters, 3011612 gradients, 8.2 GFLOPs
Transferred 319/355 items from pretrained weights
optimizer: SGD(lr=0.01) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.001), 63 bias
train: Scanning C:\Data\DS\0-repos\DL-codebase\Computer-vision\object-detection\yolov8\datasets\football-players-detection-4\train\labels.cache... 204 images, 0 backgrounds, 0 corrupt: 100%|██████████| 204/204 [00:00<?, ?it/s]
val: Scanning C:\Data\DS\0-repos\DL-codebase\Computer-vision\object-detection\yolov8\datasets\football-players-detection-4\valid\labels.cache... 38 images, 0 backgrounds, 0 corrupt: 100%|██████████| 38/38 [00:00<?, ?it/s]
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs\detect\train15
Starting training for 2 epochs...
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
It stays stuck like this with no other prints or outputs.
Environment
- YOLO: Ultralytics YOLOv8.0.25 Python-3.9.16 torch-1.13.1+cu116 CUDA:0 (NVIDIA GeForce RTX 3060 Ti, 8192MiB)
- Windows 11
- Python: 3.9.16
Minimal Reproducible Example
from ultralytics import YOLO
from pathlib import Path
from roboflow import Roboflow

rf = Roboflow(api_key="UxApJs5oZUkmngdc3qdV")
project = rf.workspace("roboflow-jvuqo").project("football-players-detection-3zvbc")
dataset = project.version(4).download("yolov8")
path_yaml = ""
model = YOLO("yolov8n.pt")
model.train(data=path_yaml, epochs=2, imgsz=640, device=0)
Additional
No response
Are you willing to submit a PR?
- [ ] Yes I'd like to help by submitting a PR!
Thanks for reporting! Is this reproducible for you on your system or on colab?
It is reproducible on my system. I will try it on colab.
On Colab the bug is not reproducible. I had no issues running the training on GPU there.
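Since the hang shows up locally but not on Colab, comparing the two environments side by side may help narrow things down. Below is a minimal sketch that prints the interpreter and PyTorch build details. The function name `environment_report` and the dictionary keys are made up for illustration; the `torch` calls are standard PyTorch attributes. Nothing here is specific to YOLOv8.

```python
import importlib.util
import platform
import sys


def environment_report():
    """Return a dict describing the interpreter and, if installed, the PyTorch build."""
    report = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        "torch": None,
        "torch_cuda_build": None,
        "cuda_available": False,
        "gpu": None,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not installed in this environment

    import torch

    report["torch"] = torch.__version__
    report["torch_cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["gpu"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running this both locally and in Colab and diffing the output shows whether the two setups differ in Python version, PyTorch version, or CUDA build.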
I ran into this same issue.
I resolved it by starting fresh with a new conda env, cloning the repo, using conda to install pytorch and torchvision, and then installing the rest of the dependencies manually. I did not pip install ultralytics. It uses pip to install pytorch and torchvision, which can cause issues.
so:
- start fresh with new conda env (python=3.10)
- conda install pytorch torchvision pytorch-cuda=11.7 -c pytorch -c nvidia (or whatever cuda version you have)
- pip install opencv-python matplotlib tqdm ....
Then I pulled in the cloned repo tools
import sys
sys.path.append("../ultralytics")
from ultralytics import YOLO
if __name__ == "__main__":
    model = YOLO("yolov8x.pt")  # pass any model type
    model.train(epochs=5, imgsz=640, device=0, data="face.yaml")
Again, I did not use pip install ultralytics. That was the culprit in my case.
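A separate thing that may be worth ruling out (this is an assumption; nobody in this thread confirmed it as the cause): the original log shows "Using 8 dataloader workers", and on Windows Python starts worker processes with the spawn method, which re-imports the main script. That is why the script above wraps training in `if __name__ == "__main__":`. A rough sketch of a debug run with `workers=0`, so batches are loaded in the main process, could look like this. The helper names and the command-line handling are hypothetical; the `train()` arguments are the ones listed in the logged config.

```python
import sys


def build_train_kwargs(data_yaml, workers=0):
    """Arguments for model.train(); every key appears in the config logged in the report."""
    return {
        "data": data_yaml,
        "epochs": 2,
        "imgsz": 640,
        "device": 0,
        "workers": workers,  # 0 = load batches in the main process, no worker processes
    }


def main(data_yaml):
    # Imported inside main() so that importing this file has no side effects.
    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")
    model.train(**build_train_kwargs(data_yaml))


# Training only starts when the file is run as a script with a data.yaml path.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Saved under a name of your choice and run as a plain script with the path to data.yaml as the only argument, this takes the notebook out of the picture as well. If the epoch lines start printing with `workers=0`, the dataloader workers are a likely suspect; if it still hangs, the environment fixes described in this thread are the next thing to try.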
@mkutu Thank you for your input. I will try this too and post here if it works for me.
@mkutu Excellent! This saved me a lot of trouble! Thank you!
FYI - my steps to get YOLOv8 working on my Windows 10 machine with CUDA 11.2
- started with a fresh venv (python -m venv somepath) and activated this (somepath\scripts\activate.bat)
- manually installed pytorch corresponding to the CUDA 11.2 which I have installed:
==> went to (https://pytorch.org/get-started/previous-versions/) and selected the corresponding version
==> in my case I ran:
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
...because 11.1 is close enough :)
- downloaded this repo
- pip install -r requirements.txt
- pip install sentry_sdk (which was missing for some reason)
- in python:
Python 3.7.8 (tags/v3.7.8:4b47a5b6ba, Jun 28 2020, 08:53:46) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from ultralytics import YOLO
>>> model = YOLO("yolov8s.pt")
>>> results = model.predict(source="D:/Downloads/TestPredict", save=True)
Ultralytics YOLOv8.0.26 Python-3.7.8 torch-1.8.0+cu111 CUDA:0 (NVIDIA GeForce RTX 2060 SUPER, 8192MiB)
YOLOv8s summary (fused): 168 layers, 11156544 parameters, 0 gradients, 28.6 GFLOPs
image 1/942 D:\Downloads\TestPredict\FnpOBBzX0AMMskT.jpg: 416x640 2 persons, 1 tie, 24.9ms
image 2/942 D:\Downloads\TestPredict\636938981690803908.jpg: 512x640 4 persons, 25.0ms
image 3/942 D:\Downloads\TestPredict\636944694197651301.jpg: 640x448 3 persons, 2 ties, 1 dining table, 22.0ms
image 4/942 D:\Downloads\TestPredict\636944694765375786.jpg: 448x640 5 persons, 22.9ms
image 5/942 D:\Downloads\TestPredict\636945003595140006.jpg: 640x480 2 persons, 23.9ms
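The banner lines in this thread report builds such as torch-1.13.1+cu116 and torch-1.8.0+cu111. In PyTorch wheel names the `+cuXYZ` suffix indicates the CUDA version the wheel was built against, so it can be compared with the CUDA version you meant to install. A small sketch for reading that suffix (the function name `cuda_build_from_version` is made up for illustration):

```python
import re


def cuda_build_from_version(version):
    """Return the CUDA version encoded in a PyTorch version string, or None.

    Strings without a "+cuXYZ" suffix (for example "+cpu" builds) give None.
    """
    match = re.search(r"\+cu(\d{2,3})$", version)
    if match is None:
        return None
    digits = match.group(1)
    # The last digit is the minor version, everything before it is the major version.
    return f"{digits[:-1]}.{digits[-1]}"


if __name__ == "__main__":
    for banner_version in ("1.13.1+cu116", "1.8.0+cu111"):
        print(banner_version, "->", cuda_build_from_version(banner_version))
```

With PyTorch installed, passing `torch.__version__` to the helper gives the same information for the active environment. A result of None would suggest a CPU-only build or a version string without a CUDA tag.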
@Darkmyter FWIW, I've found the most reliable way to train on Windows is to use the Ultralytics Docker image. That way the container is (almost) guaranteed to have all the correct dependencies. I train on Win11 with a 3060/16GB using Docker Desktop/WSL2 and I've never had problems related to the environment.
just use linux
@sav0l thanks for the suggestion! 🐧 Linux indeed offers a great environment for deep learning and YOLOv8 development. For those who can switch or dual-boot, it's a solid choice. However, we aim to support a wide range of users, including those on Windows. If anyone encounters issues on Windows, we're here to help troubleshoot!