
Discrepancy between the results in the 'yolo_nas_custom_dataset_fine_tuning_with_qat' Colab notebook and the results obtained when training on a local system

Open Sumeshbaba opened this issue 1 year ago • 3 comments

💡 Your Question

First off, thanks for this project. It works great for general object detection problems!

My question concerns the results shown in the getting-started Google Colab notebook titled 'Quantization Aware Training YOLONAS on Custom Dataset'.

I downloaded this notebook as an .ipynb file and ran it on my local system without changing any parameters.

After training, the results for the regular yolo_nas_s model are almost identical to those in the Colab notebook.

After QAT, however, the results differ significantly from those in the Colab notebook. Could you let me know what I am doing wrong, or where the difference comes from? Thank you.

I've attached screenshots from both runs.

[Screenshots: MytrainingYOLO_NAS_S_comparison, Colab_YOLO_NAS_S_comparison, MytrainingYOLO_NAS_S_qat, Colab_YOLO_NAS_S_qat, Colab_YOLO_NAS_S, MytrainingYOLO_NAS_S]

Versions

PyTorch version: 2.0.1+cu118
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Home
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.8.2 (default, May 6 2020, 09:02:42) [MSC v.1916 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22621-SP0
Is CUDA available: True
CUDA runtime version: 12.3.103
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3060
Nvidia driver version: 546.12
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\cudnn_ops_train64_8.dll
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=2500
DeviceID=CPU0
Family=205
L2CacheSize=7680
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2500
Name=12th Gen Intel(R) Core(TM) i5-12400
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] numpy==1.23.0
[pip3] pytorch-quantization==2.1.2
[pip3] torch==2.0.1+cu118
[pip3] torchaudio==2.0.2+cu118
[pip3] torchmetrics==0.8.0
[pip3] torchvision==0.15.2+cu118
[conda] libblas 3.9.0 20_win64_mkl conda-forge
[conda] libcblas 3.9.0 20_win64_mkl conda-forge
[conda] liblapack 3.9.0 20_win64_mkl conda-forge
[conda] mkl 2023.2.0 h6a75c08_50497 conda-forge
[conda] numpy 1.23.0 pypi_0 pypi
[conda] pytorch-quantization 2.1.2 pypi_0 pypi
[conda] torch 2.0.1+cu118 pypi_0 pypi
[conda] torchaudio 2.0.2+cu118 pypi_0 pypi
[conda] torchmetrics 0.8.0 pypi_0 pypi
[conda] torchvision 0.15.2+cu118 pypi_0 pypi
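Since the notebook was run unchanged, one thing that may be worth ruling out is a package-version mismatch between the local environment and the Colab runtime. Below is a minimal sketch, using only the Python standard library, that could be run in both places to compare the results. The helper names `installed_versions` and `diff_versions` are hypothetical, and the package list is an assumption based on the libraries reported above, plus `super-gradients` itself.

```python
import json
from importlib import metadata

# Assumed list of packages to compare: taken from the report above,
# plus super-gradients itself. Extend as needed.
PACKAGES = [
    "torch",
    "torchvision",
    "torchaudio",
    "torchmetrics",
    "numpy",
    "pytorch-quantization",
    "super-gradients",
]


def installed_versions(packages=PACKAGES):
    """Return {package: version or None} for the current environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            versions[name] = None
    return versions


def diff_versions(local, remote):
    """Return {package: (local_version, remote_version)} for every mismatch."""
    names = sorted(set(local) | set(remote))
    return {
        name: (local.get(name), remote.get(name))
        for name in names
        if local.get(name) != remote.get(name)
    }


if __name__ == "__main__":
    # Run this in both environments and compare the printed JSON,
    # or load both outputs and pass them to diff_versions().
    print(json.dumps(installed_versions(), indent=2))
```

Running the script in each environment prints a JSON dictionary of versions. Loading both dictionaries and passing them to `diff_versions` lists only the packages whose versions differ.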

Sumeshbaba, Dec 26 '23 07:12