super-gradients
RuntimeError: stack expects each tensor to be equal size when using YoloDarknetFormatDetectionDataset
🐛 Describe the bug
My dataset is loaded using the YoloDarknetFormatDetectionDataset class:
# YoloDarknetFormatDetectionDataset
from super_gradients.training.datasets import YoloDarknetFormatDetectionDataset

train_ds = YoloDarknetFormatDetectionDataset(
    data_dir="/workspaces/rocm-ml/yolo/darkent_yolo_ds",
    images_dir="train/images",
    labels_dir="train/labels",
    classes=["lp"],
)
val_ds = YoloDarknetFormatDetectionDataset(
    data_dir="/workspaces/rocm-ml/yolo/darkent_yolo_ds",
    images_dir="val/images",
    labels_dir="val/labels",
    classes=["lp"],
)
My dataset has images with different dimensions (width, height). When executing the trainer:
train_dataloader = dataloaders.get(dataset=train_ds, dataloader_params={
    "shuffle": True,
    "batch_size": 16,
    "drop_last": False,
    "pin_memory": True,
    "collate_fn": DetectionCollateFN(),
    "worker_init_fn": worker_init_reset_seed,
    "min_samples": 512,
})
val_dataloader = dataloaders.get(dataset=val_ds, dataloader_params={
    "shuffle": False,
    "batch_size": 32,
    "num_workers": 2,
    "drop_last": False,
    "pin_memory": True,
    "collate_fn": DetectionCollateFN(),
    "worker_init_fn": worker_init_reset_seed,
})

trainer.train(model=net, training_params=train_params, train_loader=train_dataloader, valid_loader=val_dataloader)
I receive the following error:
RuntimeError: stack expects each tensor to be equal size, but got [1536, 2048, 3] at entry 0 and [1080, 1920, 3] at entry 1
What parameters should I use to transform or resize my dataset to the same dimensions?
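For context, the error comes from `torch.stack` inside the collate function: tensors with different shapes cannot be stacked into one batch tensor. A minimal plain-PyTorch sketch (the 640x640 target size is an arbitrary assumption, not from the library) showing that resizing every image to a common size is what makes the stack possible:

```python
import torch
import torch.nn.functional as F

# Two images with different (H, W, C) shapes, as in the error message.
imgs = [torch.zeros(1536, 2048, 3), torch.zeros(1080, 1920, 3)]

# torch.stack(imgs) would raise the same RuntimeError, because the shapes differ.
# Resizing each image to a common size first makes stacking work.
def resize_hwc(img, size=(640, 640)):
    chw = img.permute(2, 0, 1).unsqueeze(0)  # (1, C, H, W), as interpolate expects
    out = F.interpolate(chw, size=size, mode="bilinear", align_corners=False)
    return out.squeeze(0).permute(1, 2, 0)   # back to (H, W, C)

batch = torch.stack([resize_hwc(im) for im in imgs])
print(batch.shape)  # torch.Size([2, 640, 640, 3])
```

In SuperGradients this resizing is the job of the dataset's transforms, which is why they must be configured.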
Versions
Collecting environment information...
PyTorch version: 2.0.1+rocm5.4.2
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 5.4.22803-474e8620
OS: Ubuntu 22.04.2 LTS (x86_64)
GCC version: (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
Clang version: Could not collect
CMake version: version 3.25.0
Libc version: glibc-2.35
Python version: 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] (64-bit runtime)
Python platform: Linux-6.3.5-060305-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: AMD Radeon Graphics
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 5.4.22803
MIOpen runtime version: 2.19.0
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 7 5700X 8-Core Processor
CPU family: 25
Model: 33
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 2
Frequency boost: enabled
CPU max MHz: 4661.7178
CPU min MHz: 2200.0000
BogoMIPS: 6800.17
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization: AMD-V
L1d cache: 256 KiB (8 instances)
L1i cache: 256 KiB (8 instances)
L2 cache: 4 MiB (8 instances)
L3 cache: 32 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-15
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Vulnerability Spectre v2: Vulnerable, IBPB: disabled, STIBP: disabled, PBRSB-eIBRS: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Versions of relevant libraries:
[pip3] numpy==1.23.0
[pip3] pytorch-triton-rocm==2.0.1
[pip3] torch==2.0.1+rocm5.4.2
[pip3] torchaudio==2.0.2+rocm5.4.2
[pip3] torchmetrics==0.8.0
[pip3] torchvision==0.15.2+rocm5.4.2
[conda] Could not collect
You need to add transforms to preprocess the images, as required by DetectionDataset. Alternatively, you can use the coco_detection_yolo_format_train and coco_detection_yolo_format_val dataloaders:
from super_gradients.training.dataloaders.dataloaders import coco_detection_yolo_format_train, coco_detection_yolo_format_val
dataset_params = {
    'data_dir': 'root_dir_to_your_darknet_dataset',  # path to your dataset
    'train_images_dir': 'images/train',              # path to images dir
    'train_labels_dir': 'labels/train',              # path to labels dir
    'val_images_dir': 'images/val',
    'val_labels_dir': 'labels/val',
    'test_images_dir': 'images/test',
    'test_labels_dir': 'labels/test',
    'classes': class_names,
}

train_loader = coco_detection_yolo_format_train(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['train_images_dir'],
        'labels_dir': dataset_params['train_labels_dir'],
        'classes': dataset_params['classes'],
        'transforms': train_dataset_augmentation,  # your list of training transforms
    },
    dataloader_params={
        'batch_size': 36,
        'num_workers': 8,
    },
)
valid_loader = coco_detection_yolo_format_val(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['val_images_dir'],
        'labels_dir': dataset_params['val_labels_dir'],
        'classes': dataset_params['classes'],
        'transforms': val_dataset_augmentation,  # your list of validation transforms
    },
    dataloader_params={
        'batch_size': 36,
        'num_workers': 8,
        'pin_memory': True,
    },
)
# train model
trainer.train(model=net, training_params=train_params, train_loader=train_loader, valid_loader=valid_loader)
This will create DataLoaders backed by YoloDarknetFormatDetectionDataset, with default transforms applied to your darknet dataset.
As @haritsahm pointed out, you need to use transforms to ensure that all images have the same size. Otherwise it is impossible to combine individual images into a batch.
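If you prefer to keep constructing YoloDarknetFormatDetectionDataset directly instead of using the factory dataloaders, one way is to pass a rescale transform via the `transforms` argument. A sketch, assuming SuperGradients' DetectionPaddedRescale transform and a 640x640 input size (adjust the size to your model):

```python
from super_gradients.training.datasets import YoloDarknetFormatDetectionDataset
from super_gradients.training.transforms.transforms import DetectionPaddedRescale

# Pad-and-rescale every image to a fixed 640x640 so the collate function can
# stack the batch. The 640x640 size here is an assumption; match your model.
train_ds = YoloDarknetFormatDetectionDataset(
    data_dir="/workspaces/rocm-ml/yolo/darkent_yolo_ds",
    images_dir="train/images",
    labels_dir="train/labels",
    classes=["lp"],
    transforms=[DetectionPaddedRescale(input_dim=(640, 640))],
)
```

This is a configuration fragment against the asker's own paths; with it in place, all tensors reaching DetectionCollateFN share the same shape.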
@hlacikd, could you share your full code please? I have the same structure and the same error, fine-tuning YOLO-NAS on a YOLO darknet dataset.
I'm using a YOLO/darknet dataset format; what is the value for pretrained_weights=?