ultralytics Training works on CPU but not GPU (Windows)

Search before asking

[X] I have searched the YOLOv8 issues and found no similar bug report.

YOLOv8 Component

Training

Bug

When training on GPU, the per-epoch progress output never shows up and the notebook cell keeps running; on CPU, training works fine.

The output of model.train:

Ultralytics YOLOv8.0.25  Python-3.9.16 torch-1.13.1+cu116 CUDA:0 (NVIDIA GeForce RTX 3060 Ti, 8192MiB)
yolo\engine\trainer: task=detect, mode=train, model=yolov8n.yaml, data=datasets\football-players-detection-4\data.yaml, epochs=2, patience=50, batch=16, imgsz=640, save=True, cache=False, device=0, workers=8, project=None, name=None, exist_ok=False, pretrained=False, optimizer=SGD, verbose=True, seed=0, deterministic=True, single_cls=False, image_weights=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, overlap_mask=True, mask_ratio=4, dropout=False, val=True, save_json=False, save_hybrid=False, conf=0.001, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=ultralytics/assets/, show=False, save_txt=False, save_conf=False, save_crop=False, hide_labels=False, hide_conf=False, vid_stride=1, line_thickness=3, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, boxes=True, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=17, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.001, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, fl_gamma=0.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0, cfg=None, v5loader=False, save_dir=runs\detect\train15
Overriding model.yaml nc=80 with nc=4

                   from  n    params  module                                       arguments                     
  0                  -1  1       464  ultralytics.nn.modules.Conv                  [3, 16, 3, 2]                 
  1                  -1  1      4672  ultralytics.nn.modules.Conv                  [16, 32, 3, 2]                
  2                  -1  1      7360  ultralytics.nn.modules.C2f                   [32, 32, 1, True]             
  3                  -1  1     18560  ultralytics.nn.modules.Conv                  [32, 64, 3, 2]                
  4                  -1  2     49664  ultralytics.nn.modules.C2f                   [64, 64, 2, True]             
  5                  -1  1     73984  ultralytics.nn.modules.Conv                  [64, 128, 3, 2]               
  6                  -1  2    197632  ultralytics.nn.modules.C2f                   [128, 128, 2, True]           
  7                  -1  1    295424  ultralytics.nn.modules.Conv                  [128, 256, 3, 2]              
  8                  -1  1    460288  ultralytics.nn.modules.C2f                   [256, 256, 1, True]           
  9                  -1  1    164608  ultralytics.nn.modules.SPPF                  [256, 256, 5]                 
 10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 11             [-1, 6]  1         0  ultralytics.nn.modules.Concat                [1]                           
 12                  -1  1    148224  ultralytics.nn.modules.C2f                   [384, 128, 1]                 
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 14             [-1, 4]  1         0  ultralytics.nn.modules.Concat                [1]                           
 15                  -1  1     37248  ultralytics.nn.modules.C2f                   [192, 64, 1]                  
 16                  -1  1     36992  ultralytics.nn.modules.Conv                  [64, 64, 3, 2]                
 17            [-1, 12]  1         0  ultralytics.nn.modules.Concat                [1]                           
 18                  -1  1    123648  ultralytics.nn.modules.C2f                   [192, 128, 1]                 
 19                  -1  1    147712  ultralytics.nn.modules.Conv                  [128, 128, 3, 2]              
 20             [-1, 9]  1         0  ultralytics.nn.modules.Concat                [1]                           
 21                  -1  1    493056  ultralytics.nn.modules.C2f                   [384, 256, 1]                 
 22        [15, 18, 21]  1    752092  ultralytics.nn.modules.Detect                [4, [64, 128, 256]]           
Model summary: 225 layers, 3011628 parameters, 3011612 gradients, 8.2 GFLOPs

Transferred 319/355 items from pretrained weights
optimizer: SGD(lr=0.01) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.001), 63 bias
train: Scanning C:\Data\DS\0-repos\DL-codebase\Computer-vision\object-detection\yolov8\datasets\football-players-detection-4\train\labels.cache... 204 images, 0 backgrounds, 0 corrupt: 100%|██████████| 204/204 [00:00<?, ?it/s]
val: Scanning C:\Data\DS\0-repos\DL-codebase\Computer-vision\object-detection\yolov8\datasets\football-players-detection-4\valid\labels.cache... 38 images, 0 backgrounds, 0 corrupt: 100%|██████████| 38/38 [00:00<?, ?it/s]
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs\detect\train15
Starting training for 2 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size

It stays stuck like this with no other prints or outputs.

Environment

YOLO: Ultralytics YOLOv8.0.25 Python-3.9.16 torch-1.13.1+cu116 CUDA:0 (NVIDIA GeForce RTX 3060 Ti, 8192MiB)
Windows 11
Python: 3.9.16

Minimal Reproducible Example

from ultralytics import YOLO
from pathlib import Path
from roboflow import Roboflow

rf = Roboflow(api_key="UxApJs5oZUkmngdc3qdV")
project = rf.workspace("roboflow-jvuqo").project("football-players-detection-3zvbc")
dataset = project.version(4).download("yolov8")

path_yaml = ""

model = YOLO("yolov8n.pt")

model.train(data=path_yaml, epochs=2, imgsz=640, device=0)

Additional

No response

Are you willing to submit a PR?

[ ] Yes I'd like to help by submitting a PR!

Jan 31 '23 22:01 Darkmyter

Thanks for reporting! Is this reproducible for you on your system or on colab?

Feb 01 '23 05:02 AyushExel

It is reproducible on my system. I will try it on colab.

Feb 01 '23 06:02 Darkmyter

On Colab the bug is not reproducible; I had no issues running the training on GPU.

Feb 01 '23 07:02 Darkmyter

I ran into this same issue.

I resolved it by starting fresh with a new conda env, cloning the repo, installing pytorch and torchvision with conda, and then installing the rest of the dependencies manually. I did not pip install ultralytics, because that installs pytorch and torchvision through pip, which can cause issues.

so:

start fresh with new conda env (python=3.10)
conda install pytorch torchvision pytorch-cuda=11.7 -c pytorch -c nvidia (or whatever cuda version you have)
pip install opencv-python matplotlib tqdm ....
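
Before moving on, it may be worth confirming that the freshly installed PyTorch actually sees the GPU. The following is a minimal sketch, not something posted in this thread: the helper name torch_cuda_report is made up for illustration, and it only reads torch.__version__, torch.version.cuda, torch.cuda.is_available() and torch.cuda.get_device_name(). It returns an empty report instead of failing when PyTorch is not installed.

```python
import importlib.util


def torch_cuda_report():
    """Return a small dict describing the installed PyTorch build, if any.

    Hypothetical helper for illustration; not part of ultralytics or PyTorch.
    """
    report = {
        "torch_installed": False,
        "torch_version": None,
        "built_with_cuda": None,
        "cuda_available": False,
        "device_name": None,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not installed in this environment

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

If built_with_cuda comes back as None, the environment most likely has a CPU-only build. If it is set but cuda_available is False, the driver or CUDA runtime may be the thing to look at.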

Then I imported the package from the cloned repo:

import sys
sys.path.append("../ultralytics")

from ultralytics import YOLO

if __name__ == "__main__":
    model = YOLO("yolov8x.pt")  # pass any model type
    model.train(epochs=5, imgsz=640, device=0, data="face.yaml")

Again, I did not use pip install ultralytics. That was the culprit in my case.
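
One more thing that might be worth ruling out is the dataloader worker processes, since the log above stops right after "Using 8 dataloader workers". This is only an assumption and nobody in this thread confirmed it as the cause. The sketch below builds training arguments that set workers=0 on Windows so data loading stays in the main process. The helper name train_overrides is hypothetical, and workers is the same setting that appears in the trainer arguments printed in the original report.

```python
import platform


def train_overrides(device=0, epochs=2, imgsz=640, system=None):
    """Build keyword arguments for model.train(), hypothetical helper.

    On Windows it adds workers=0 so that data is loaded in the main process.
    This is a diagnostic assumption, not a confirmed fix.
    """
    system = platform.system() if system is None else system
    overrides = {"device": device, "epochs": epochs, "imgsz": imgsz}
    if system == "Windows":
        overrides["workers"] = 0  # no dataloader subprocesses
    return overrides


if __name__ == "__main__":
    # Intended use, left commented out because it needs ultralytics and a dataset:
    # from ultralytics import YOLO
    # model = YOLO("yolov8n.pt")
    # model.train(data="face.yaml", **train_overrides())
    print(train_overrides())
```

If training starts with workers=0 but hangs with the default, that would point at worker process start-up rather than at CUDA itself.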

Feb 02 '23 00:02 mkutu

@mkutu Thank you for your input. I will try this too and post here if it works for me.

Feb 02 '23 09:02 Darkmyter

@mkutu Excellent! This saved me a lot of trouble! Thank you!

FYI - my steps to get YoloV8 working on my Windows 10 machine with CUDA 11.2

started with a fresh venv (python -m venv somepath) and activated it (somepath\scripts\activate.bat)
manually installed the pytorch build closest to the CUDA 11.2 I have installed: went to https://pytorch.org/get-started/previous-versions/, selected the corresponding version, and in my case ran: pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html (because 11.1 is close enough :)
downloaded this repo
pip install -r requirements.txt
pip install sentry_sdk (which was missing for some reason)
in python:

Python 3.7.8 (tags/v3.7.8:4b47a5b6ba, Jun 28 2020, 08:53:46) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from ultralytics import YOLO
>>> model = YOLO("yolov8s.pt")
>>> results = model.predict(source="D:/Downloads/TestPredict", save=True)
Ultralytics YOLOv8.0.26  Python-3.7.8 torch-1.8.0+cu111 CUDA:0 (NVIDIA GeForce RTX 2060 SUPER, 8192MiB)
YOLOv8s summary (fused): 168 layers, 11156544 parameters, 0 gradients, 28.6 GFLOPs
image 1/942 D:\Downloads\TestPredict\FnpOBBzX0AMMskT.jpg: 416x640 2 persons, 1 tie, 24.9ms
image 2/942 D:\Downloads\TestPredict\636938981690803908.jpg: 512x640 4 persons, 25.0ms
image 3/942 D:\Downloads\TestPredict\636944694197651301.jpg: 640x448 3 persons, 2 ties, 1 dining table, 22.0ms
image 4/942 D:\Downloads\TestPredict\636944694765375786.jpg: 448x640 5 persons, 22.9ms
image 5/942 D:\Downloads\TestPredict\636945003595140006.jpg: 640x480 2 persons, 23.9ms
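
The banner lines in this thread (torch-1.13.1+cu116, torch-1.8.0+cu111) carry the CUDA version the wheel was built for in the part after the plus sign. As a rough sketch, the two functions below split such a version string and turn the tag into a dotted CUDA version. Both function names are made up for illustration, and the reading of the tag (last digit is the minor version) is an assumption based on how these tags are commonly written.

```python
import re


def split_torch_version(version):
    """Split a string such as '1.13.1+cu116' into (release, build_tag).

    build_tag is None when the string has no '+' part.
    """
    release, _, tag = version.partition("+")
    return release, tag or None


def cuda_from_tag(tag):
    """Turn a build tag like 'cu116' into '11.6'.

    Returns None for 'cpu', a missing tag, or anything that does not look
    like 'cu' followed by digits. Assumes the last digit is the minor version.
    """
    match = re.fullmatch(r"cu(\d+)(\d)", tag or "")
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


if __name__ == "__main__":
    for version in ("1.13.1+cu116", "1.8.0+cu111"):
        release, tag = split_torch_version(version)
        print(release, tag, cuda_from_tag(tag))
```

Comparing this value with the CUDA version installed on the machine gives a quick hint as to whether the wheel and the system are at least close to each other.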

Feb 04 '23 13:02 LambertWM

@Darkmyter FWIW, I've found the most reliable way to train on Windows is to use the Ultralytics docker image. That way the container is (almost) guaranteed to have all the correct dependencies. I train on Win11 with a 3060/16GB using Docker Desktop/WSL2 and I've never had problems related to the environment.

Feb 16 '23 19:02 sstainba

just use linux

Mar 13 '24 10:03 sav0l

@sav0l thanks for the suggestion! 🐧 Linux indeed offers a great environment for deep learning and YOLOv8 development. For those who can switch or dual-boot, it's a solid choice. However, we aim to support a wide range of users, including those on Windows. If anyone encounters issues on Windows, we're here to help troubleshoot!

Mar 13 '24 23:03 glenn-jocher