super-gradients
ValueError: 'accuracy' is not in list
💡 Your Question
When training a YOLO-NAS model, I get the error above, apparently towards the end of training.
This is the call where the error occurs (line 156 in my code):
trainer.train(
    model=model,
    training_params=train_params,
    train_loader=train_data,
    valid_loader=val_data
)
Error Stack:
[2023-06-12 06:04:14] INFO - base_sg_logger.py - [CLEANUP] - Successfully stopped system monitoring process
[2023-06-12 06:04:14] ERROR - sg_trainer_utils.py - Uncaught exception
Traceback (most recent call last):
  File "/app/run_module.py", line 186, in <module>
    main()
  File "/app/run_module.py", line 159, in main
    trainer.train(
  File "/opt/conda/lib/python3.10/site-packages/super_gradients/training/sg_trainer/sg_trainer.py", line 1240, in train
    train_metrics_tuple = self._train_epoch(epoch=epoch, silent_mode=silent_mode)
  File "/opt/conda/lib/python3.10/site-packages/super_gradients/training/sg_trainer/sg_trainer.py", line 441, in _train_epoch
    loss, loss_log_items = self._get_losses(outputs, targets)
  File "/opt/conda/lib/python3.10/site-packages/super_gradients/training/sg_trainer/sg_trainer.py", line 485, in _get_losses
    self._init_monitored_items()
  File "/opt/conda/lib/python3.10/site-packages/super_gradients/training/sg_trainer/sg_trainer.py", line 499, in _init_monitored_items
    self.metric_idx_in_results_tuple = fuzzy_idx_in_list(self.metric_to_watch, self.loss_logging_items_names + get_metrics_titles(self.valid_metrics))
  File "/opt/conda/lib/python3.10/site-packages/super_gradients/training/utils/utils.py", line 226, in fuzzy_idx_in_list
    return [fuzzy_str(x) for x in lst].index(fuzzy_str(name))
ValueError: 'accuracy' is not in list
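For context, the last frame of the traceback can be reproduced in isolation. This is a minimal sketch, assuming `fuzzy_str` simply lowercases a name and drops non-alphanumeric characters (the real helper in `super_gradients` may normalise differently), and using an illustrative list of logged names rather than the one from the failing run:

```python
import re
from typing import List


def fuzzy_str(s: str) -> str:
    # Assumption: normalise by lowercasing and removing non-alphanumerics.
    return re.sub(r"[^a-z0-9]", "", s.lower())


def fuzzy_idx_in_list(name: str, lst: List[str]) -> int:
    # Same shape as the line shown in the traceback.
    return [fuzzy_str(x) for x in lst].index(fuzzy_str(name))


# Illustrative names only: loss components followed by detection metrics.
logged_names = [
    "PPYoloELoss/loss",
    "PPYoloELoss/loss_cls",
    "PPYoloELoss/loss_iou",
    "PPYoloELoss/loss_dfl",
    "Precision@0.50",
    "Recall@0.50",
    "mAP@0.50",
    "F1@0.50",
]

# A name that is present resolves to its position in the list.
print(fuzzy_idx_in_list("mAP@0.50", logged_names))

# A name that is absent raises the same kind of error as in the report.
try:
    fuzzy_idx_in_list("Accuracy", logged_names)
except ValueError as exc:
    print(f"ValueError: {exc}")
```

If the lookup behaves like this, the `'accuracy'` in the message would mean the trainer was watching a metric named `Accuracy` rather than a detection metric. That is only a guess; printing `train_params["metric_to_watch"]` just before `trainer.train(...)` inside the job would confirm or rule it out.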
Here are the training parameters:
train_params = {
    "silent_mode": False,
    "average_best_models": True,
    "warmup_mode": "linear_epoch_step",
    "warmup_initial_lr": 1e-6,
    "lr_warmup_epochs": 3,
    "initial_lr": 5e-4,
    "lr_mode": "cosine",
    "cosine_final_lr_ratio": 0.1,
    "optimizer": "Adam",
    "optimizer_params": {"weight_decay": 0.0001},
    "zero_weight_decay_on_bias_and_bn": True,
    "ema": True,
    "ema_params": {"decay": 0.9, "decay_type": "threshold"},
    "max_epochs": args["epoch"],
    "mixed_precision": True,
    "loss": PPYoloELoss(
        use_static_assigner=False,
        num_classes=len(yaml_params["names"]),
        reg_max=16
    ),
    "valid_metrics_list": [
        DetectionMetrics_050(
            score_thres=0.1,
            top_k_predictions=300,
            num_cls=len(yaml_params["names"]),
            normalize_targets=True,
            post_prediction_callback=PPYoloEPostPredictionCallback(
                score_threshold=0.01,
                nms_top_k=1000,
                max_predictions=300,
                nms_threshold=0.7
            )
        )
    ],
    "metric_to_watch": "mAP@0.50"
}
Versions
SuperGradient version = 3.1.1
torch = 1.13.1
Python = 3.10
I'm running this on Vertex AI (as a custom model), so I cannot run the Python script.
Update: I have the same error with the following versions
SuperGradient version = 3.1.2
torch = 1.13.1
Python = 3.7
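When the job runs in a managed environment where a separate script cannot be run, the versions can still be written to the job log from inside the training module. A minimal sketch using only the standard library (the function name `collect_versions` is made up for this example, and the package names are the PyPI distribution names):

```python
import platform

try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    # Python 3.7: requires the importlib_metadata backport to be installed.
    import importlib_metadata as metadata


def collect_versions(packages=("super-gradients", "torch")) -> dict:
    """Return the Python version and the installed version of each package."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# Printing at the top of the entry point makes the versions show up in the job log.
print(collect_versions())
```

This reports what is actually installed in the container, which may differ from what was requested when the image was built.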
I was not able to reproduce the error with a simple test program when running against SG 3.1.3 or 3.1.2. Here is the code snippet I was using:
from super_gradients import Trainer, setup_device
from super_gradients.common.object_names import Models
from super_gradients.training import models
from super_gradients.training.losses import PPYoloELoss
from super_gradients.training.metrics import DetectionMetrics_050
from super_gradients.training.dataloaders import coco2017_train_yolo_nas, coco2017_val_yolo_nas
from super_gradients.training.models.detection_models.pp_yolo_e import PPYoloEPostPredictionCallback


def main():
    root_dir = "C:/DevelopG/Develop/GitHub/Deci/super-gradients-projects/tinycoco"
    train_data = coco2017_train_yolo_nas(dataset_params={"data_dir": root_dir}, dataloader_params={"num_workers": 0})
    val_data = coco2017_val_yolo_nas(dataset_params={"data_dir": root_dir}, dataloader_params={"num_workers": 0})
    print(len(train_data), len(val_data))
    model = models.get(Models.YOLO_NAS_S, pretrained_weights="coco").cuda()
    train_params = {
        "silent_mode": False,
        "average_best_models": True,
        "warmup_mode": "linear_epoch_step",
        "warmup_initial_lr": 1e-6,
        "lr_warmup_epochs": 3,
        "initial_lr": 5e-4,
        "lr_mode": "cosine",
        "cosine_final_lr_ratio": 0.1,
        "optimizer": "Adam",
        "optimizer_params": {"weight_decay": 0.0001},
        "zero_weight_decay_on_bias_and_bn": True,
        "ema": True,
        "ema_params": {"decay": 0.9, "decay_type": "threshold"},
        "max_epochs": 10,
        "mixed_precision": True,
        "loss": PPYoloELoss(
            use_static_assigner=False,
            num_classes=80,
            reg_max=16
        ),
        "valid_metrics_list": [
            DetectionMetrics_050(
                score_thres=0.1,
                top_k_predictions=300,
                num_cls=80,
                normalize_targets=True,
                post_prediction_callback=PPYoloEPostPredictionCallback(
                    score_threshold=0.01,
                    nms_top_k=1000,
                    max_predictions=300,
                    nms_threshold=0.7
                )
            )
        ],
        "metric_to_watch": "mAP@0.50"
    }
    setup_device()
    trainer = Trainer("issue-1162", ckpt_root_dir="checkpoints")
    trainer.train(
        model=model,
        training_params=train_params,
        train_loader=train_data,
        valid_loader=val_data
    )


if __name__ == "__main__":
    main()
And here is the full output log:
C:\Users\ekhve\.conda\envs\sg-testing\python.exe C:\DevelopG\Develop\GitHub\Deci\super-gradients-projects\issue-1162\main.py
The console stream is logged into C:\Users\ekhve\sg_logs\console.log
[2023-08-10 14:51:40] INFO - crash_tips_setup.py - Crash tips is enabled. You can set your environment variable to CRASH_HANDLER=FALSE to disable it
[2023-08-10 14:51:41] WARNING - redirects.py - NOTE: Redirects are currently not supported in Windows or MacOs.
[2023-08-10 14:51:46] WARNING - env_sanity_check.py - Failed to verify operating system: Deci officially supports only Linux kernels. Some features may not work as expected.
WARNING: Logging before flag parsing goes to stderr.
W0810 14:51:46.874871 3972 env_sanity_check.py:30] Failed to verify operating system: Deci officially supports only Linux kernels. Some features may not work as expected.
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
Caching annotations: 100%|██████████| 32/32 [00:00<00:00, 1692.13it/s]
Caching annotations: 100%|██████████| 6/6 [00:00<00:00, 760.11it/s]
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
1 1
The console stream is now moved to checkpoints\issue-1162/console_Aug10_14_51_48.txt
[2023-08-10 14:51:52] INFO - sg_trainer.py - Using EMA with params {'decay': 0.9, 'decay_type': 'threshold'}
[2023-08-10 14:51:54] INFO - sg_trainer_utils.py - TRAINING PARAMETERS:
- Mode: OFF
- Number of GPUs: 1 (1 available on the machine)
- Dataset size: 30 (len(train_set))
- Batch size per GPU: 25 (batch_size)
- Batch Accumulate: 1 (batch_accumulate)
- Total batch size: 25 (num_gpus * batch_size)
- Effective Batch size: 25 (num_gpus * batch_size * batch_accumulate)
- Iterations per epoch: 1 (len(train_loader))
- Gradient updates per epoch: 1 (len(train_loader) / batch_accumulate)
[2023-08-10 14:51:54] INFO - sg_trainer.py - Started training for 10 epochs (0/9)
Train epoch 0: 100%|██████████| 1/1 [00:05<00:00, 5.89s/it, PPYoloELoss/loss=2.25, PPYoloELoss/loss_cls=1.2, PPYoloELoss/loss_dfl=1.02, PPYoloELoss/loss_iou=0.217, gpu_mem=8.83]
Validation epoch 0: 100%|██████████| 1/1 [00:00<00:00, 1.34it/s]
===========================================================
SUMMARY OF EPOCH 0
├── Training
│ ├── Ppyoloeloss/loss = 2.2521
│ ├── Ppyoloeloss/loss_cls = 1.1971
│ ├── Ppyoloeloss/loss_dfl = 1.0226
│ └── Ppyoloeloss/loss_iou = 0.2175
└── Validation
├── F1@0.50 = 0.3248
├── Map@0.50 = 0.6509
├── Ppyoloeloss/loss = 1.4454
├── Ppyoloeloss/loss_cls = 0.7288
├── Ppyoloeloss/loss_dfl = 0.7688
├── Ppyoloeloss/loss_iou = 0.1329
├── Precision@0.50 = 0.2307
└── Recall@0.50 = 0.8663
===========================================================
[2023-08-10 14:52:01] INFO - base_sg_logger.py - Checkpoint saved in checkpoints\issue-1162\ckpt_best.pth
[2023-08-10 14:52:01] INFO - sg_trainer.py - Best checkpoint overriden: validation mAP@0.50: 0.650887131690979
Train epoch 1: 100%|██████████| 1/1 [00:01<00:00, 1.73s/it, PPYoloELoss/loss=2.17, PPYoloELoss/loss_cls=1.15, PPYoloELoss/loss_dfl=1.02, PPYoloELoss/loss_iou=0.204, gpu_mem=8.68]
Validation epoch 1: 100%|██████████| 1/1 [00:00<00:00, 1.72it/s]
===========================================================
SUMMARY OF EPOCH 1
├── Training
│ ├── Ppyoloeloss/loss = 2.1742
│ │ ├── Best until now = 2.2521 (↘ -0.0779)
│ │ └── Epoch N-1 = 2.2521 (↘ -0.0779)
│ ├── Ppyoloeloss/loss_cls = 1.1544
│ │ ├── Best until now = 1.1971 (↘ -0.0428)
│ │ └── Epoch N-1 = 1.1971 (↘ -0.0428)
│ ├── Ppyoloeloss/loss_dfl = 1.0179
│ │ ├── Best until now = 1.0226 (↘ -0.0046)
│ │ └── Epoch N-1 = 1.0226 (↘ -0.0046)
│ └── Ppyoloeloss/loss_iou = 0.2044
│ ├── Best until now = 0.2175 (↘ -0.0131)
│ └── Epoch N-1 = 0.2175 (↘ -0.0131)
└── Validation
├── F1@0.50 = 0.2527
│ ├── Best until now = 0.3248 (↘ -0.0721)
│ └── Epoch N-1 = 0.3248 (↘ -0.0721)
├── Map@0.50 = 0.6063
│ ├── Best until now = 0.6509 (↘ -0.0445)
│ └── Epoch N-1 = 0.6509 (↘ -0.0445)
├── Ppyoloeloss/loss = 1.512
│ ├── Best until now = 1.4454 (↗ 0.0666)
│ └── Epoch N-1 = 1.4454 (↗ 0.0666)
├── Ppyoloeloss/loss_cls = 0.7584
│ ├── Best until now = 0.7288 (↗ 0.0296)
│ └── Epoch N-1 = 0.7288 (↗ 0.0296)
├── Ppyoloeloss/loss_dfl = 0.7988
│ ├── Best until now = 0.7688 (↗ 0.03)
│ └── Epoch N-1 = 0.7688 (↗ 0.03)
├── Ppyoloeloss/loss_iou = 0.1417
│ ├── Best until now = 0.1329 (↗ 0.0088)
│ └── Epoch N-1 = 0.1329 (↗ 0.0088)
├── Precision@0.50 = 0.1782
│ ├── Best until now = 0.2307 (↘ -0.0525)
│ └── Epoch N-1 = 0.2307 (↘ -0.0525)
└── Recall@0.50 = 0.7783
├── Best until now = 0.8663 (↘ -0.0881)
└── Epoch N-1 = 0.8663 (↘ -0.0881)
===========================================================
Train epoch 2: 100%|██████████| 1/1 [00:01<00:00, 1.73s/it, PPYoloELoss/loss=1.95, PPYoloELoss/loss_cls=0.998, PPYoloELoss/loss_dfl=0.921, PPYoloELoss/loss_iou=0.196, gpu_mem=8.72]
Validation epoch 2: 100%|██████████| 1/1 [00:00<00:00, 2.02it/s]
===========================================================
SUMMARY OF EPOCH 2
├── Training
│ ├── Ppyoloeloss/loss = 1.9485
│ │ ├── Best until now = 2.1742 (↘ -0.2258)
│ │ └── Epoch N-1 = 2.1742 (↘ -0.2258)
│ ├── Ppyoloeloss/loss_cls = 0.9977
│ │ ├── Best until now = 1.1544 (↘ -0.1567)
│ │ └── Epoch N-1 = 1.1544 (↘ -0.1567)
│ ├── Ppyoloeloss/loss_dfl = 0.9206
│ │ ├── Best until now = 1.0179 (↘ -0.0973)
│ │ └── Epoch N-1 = 1.0179 (↘ -0.0973)
│ └── Ppyoloeloss/loss_iou = 0.1962
│ ├── Best until now = 0.2044 (↘ -0.0082)
│ └── Epoch N-1 = 0.2044 (↘ -0.0082)
└── Validation
├── F1@0.50 = 0.2031
│ ├── Best until now = 0.3248 (↘ -0.1217)
│ └── Epoch N-1 = 0.2527 (↘ -0.0496)
├── Map@0.50 = 0.3472
│ ├── Best until now = 0.6509 (↘ -0.3036)
│ └── Epoch N-1 = 0.6063 (↘ -0.2591)
├── Ppyoloeloss/loss = 1.9031
│ ├── Best until now = 1.4454 (↗ 0.4576)
│ └── Epoch N-1 = 1.512 (↗ 0.391)
├── Ppyoloeloss/loss_cls = 1.047
│ ├── Best until now = 0.7288 (↗ 0.3183)
│ └── Epoch N-1 = 0.7584 (↗ 0.2887)
├── Ppyoloeloss/loss_dfl = 0.8942
│ ├── Best until now = 0.7688 (↗ 0.1254)
│ └── Epoch N-1 = 0.7988 (↗ 0.0954)
├── Ppyoloeloss/loss_iou = 0.1636
│ ├── Best until now = 0.1329 (↗ 0.0307)
│ └── Epoch N-1 = 0.1417 (↗ 0.0219)
├── Precision@0.50 = 0.1469
│ ├── Best until now = 0.2307 (↘ -0.0838)
│ └── Epoch N-1 = 0.1782 (↘ -0.0313)
└── Recall@0.50 = 0.5374
├── Best until now = 0.8663 (↘ -0.3289)
└── Epoch N-1 = 0.7783 (↘ -0.2408)
===========================================================
Train epoch 3: 100%|██████████| 1/1 [00:01<00:00, 1.75s/it, PPYoloELoss/loss=2.17, PPYoloELoss/loss_cls=1.08, PPYoloELoss/loss_dfl=1.02, PPYoloELoss/loss_iou=0.232, gpu_mem=8.76]
Validation epoch 3: 100%|██████████| 1/1 [00:00<00:00, 2.29it/s]
===========================================================
SUMMARY OF EPOCH 3
├── Training
│ ├── Ppyoloeloss/loss = 2.1715
│ │ ├── Best until now = 1.9485 (↗ 0.223)
│ │ └── Epoch N-1 = 1.9485 (↗ 0.223)
│ ├── Ppyoloeloss/loss_cls = 1.0823
│ │ ├── Best until now = 0.9977 (↗ 0.0846)
│ │ └── Epoch N-1 = 0.9977 (↗ 0.0846)
│ ├── Ppyoloeloss/loss_dfl = 1.0169
│ │ ├── Best until now = 0.9206 (↗ 0.0963)
│ │ └── Epoch N-1 = 0.9206 (↗ 0.0963)
│ └── Ppyoloeloss/loss_iou = 0.2323
│ ├── Best until now = 0.1962 (↗ 0.0361)
│ └── Epoch N-1 = 0.1962 (↗ 0.0361)
└── Validation
├── F1@0.50 = 0.0893
│ ├── Best until now = 0.3248 (↘ -0.2355)
│ └── Epoch N-1 = 0.2031 (↘ -0.1138)
├── Map@0.50 = 0.1076
│ ├── Best until now = 0.6509 (↘ -0.5433)
│ └── Epoch N-1 = 0.3472 (↘ -0.2396)
├── Ppyoloeloss/loss = 2.6456
│ ├── Best until now = 1.4454 (↗ 1.2002)
│ └── Epoch N-1 = 1.9031 (↗ 0.7425)
├── Ppyoloeloss/loss_cls = 1.5227
│ ├── Best until now = 0.7288 (↗ 0.794)
│ └── Epoch N-1 = 1.047 (↗ 0.4757)
├── Ppyoloeloss/loss_dfl = 1.1418
│ ├── Best until now = 0.7688 (↗ 0.373)
│ └── Epoch N-1 = 0.8942 (↗ 0.2475)
├── Ppyoloeloss/loss_iou = 0.2208
│ ├── Best until now = 0.1329 (↗ 0.0879)
│ └── Epoch N-1 = 0.1636 (↗ 0.0572)
├── Precision@0.50 = 0.1193
│ ├── Best until now = 0.2307 (↘ -0.1114)
│ └── Epoch N-1 = 0.1469 (↘ -0.0276)
└── Recall@0.50 = 0.1665
├── Best until now = 0.8663 (↘ -0.6998)
└── Epoch N-1 = 0.5374 (↘ -0.3709)
===========================================================
Train epoch 4: 100%|██████████| 1/1 [00:01<00:00, 1.74s/it, PPYoloELoss/loss=2.27, PPYoloELoss/loss_cls=1.21, PPYoloELoss/loss_dfl=1.02, PPYoloELoss/loss_iou=0.221, gpu_mem=8.69]
Validation epoch 4: 100%|██████████| 1/1 [00:00<00:00, 1.87it/s]
===========================================================
SUMMARY OF EPOCH 4
├── Training
│ ├── Ppyoloeloss/loss = 2.2669
│ │ ├── Best until now = 1.9485 (↗ 0.3184)
│ │ └── Epoch N-1 = 2.1715 (↗ 0.0954)
│ ├── Ppyoloeloss/loss_cls = 1.2055
│ │ ├── Best until now = 0.9977 (↗ 0.2078)
│ │ └── Epoch N-1 = 1.0823 (↗ 0.1232)
│ ├── Ppyoloeloss/loss_dfl = 1.018
│ │ ├── Best until now = 0.9206 (↗ 0.0973)
│ │ └── Epoch N-1 = 1.0169 (↗ 0.0011)
│ └── Ppyoloeloss/loss_iou = 0.221
│ ├── Best until now = 0.1962 (↗ 0.0248)
│ └── Epoch N-1 = 0.2323 (↘ -0.0113)
└── Validation
├── F1@0.50 = 0.0463
│ ├── Best until now = 0.3248 (↘ -0.2785)
│ └── Epoch N-1 = 0.0893 (↘ -0.0429)
├── Map@0.50 = 0.0511
│ ├── Best until now = 0.6509 (↘ -0.5998)
│ └── Epoch N-1 = 0.1076 (↘ -0.0565)
├── Ppyoloeloss/loss = 3.1681
│ ├── Best until now = 1.4454 (↗ 1.7227)
│ └── Epoch N-1 = 2.6456 (↗ 0.5225)
├── Ppyoloeloss/loss_cls = 1.8033
│ ├── Best until now = 0.7288 (↗ 1.0745)
│ └── Epoch N-1 = 1.5227 (↗ 0.2806)
├── Ppyoloeloss/loss_dfl = 1.3731
│ ├── Best until now = 0.7688 (↗ 0.6043)
│ └── Epoch N-1 = 1.1418 (↗ 0.2313)
├── Ppyoloeloss/loss_iou = 0.2713
│ ├── Best until now = 0.1329 (↗ 0.1384)
│ └── Epoch N-1 = 0.2208 (↗ 0.0505)
├── Precision@0.50 = 0.0392
│ ├── Best until now = 0.2307 (↘ -0.1915)
│ └── Epoch N-1 = 0.1193 (↘ -0.0801)
└── Recall@0.50 = 0.102
├── Best until now = 0.8663 (↘ -0.7643)
└── Epoch N-1 = 0.1665 (↘ -0.0645)
===========================================================
Train epoch 5: 100%|██████████| 1/1 [00:01<00:00, 1.75s/it, PPYoloELoss/loss=2.19, PPYoloELoss/loss_cls=1.09, PPYoloELoss/loss_dfl=1.08, PPYoloELoss/loss_iou=0.223, gpu_mem=8.73]
Validation epoch 5: 100%|██████████| 1/1 [00:00<00:00, 2.73it/s]
===========================================================
SUMMARY OF EPOCH 5
├── Training
│ ├── Ppyoloeloss/loss = 2.1937
│ │ ├── Best until now = 1.9485 (↗ 0.2452)
│ │ └── Epoch N-1 = 2.2669 (↘ -0.0732)
│ ├── Ppyoloeloss/loss_cls = 1.0936
│ │ ├── Best until now = 0.9977 (↗ 0.096)
│ │ └── Epoch N-1 = 1.2055 (↘ -0.1119)
│ ├── Ppyoloeloss/loss_dfl = 1.0847
│ │ ├── Best until now = 0.9206 (↗ 0.1641)
│ │ └── Epoch N-1 = 1.018 (↗ 0.0667)
│ └── Ppyoloeloss/loss_iou = 0.2231
│ ├── Best until now = 0.1962 (↗ 0.0269)
│ └── Epoch N-1 = 0.221 (↗ 0.0021)
└── Validation
├── F1@0.50 = 0.0153
│ ├── Best until now = 0.3248 (↘ -0.3095)
│ └── Epoch N-1 = 0.0463 (↘ -0.0311)
├── Map@0.50 = 0.0094
│ ├── Best until now = 0.6509 (↘ -0.6415)
│ └── Epoch N-1 = 0.0511 (↘ -0.0417)
├── Ppyoloeloss/loss = 3.423
│ ├── Best until now = 1.4454 (↗ 1.9776)
│ └── Epoch N-1 = 3.1681 (↗ 0.2549)
├── Ppyoloeloss/loss_cls = 1.9421
│ ├── Best until now = 0.7288 (↗ 1.2134)
│ └── Epoch N-1 = 1.8033 (↗ 0.1388)
├── Ppyoloeloss/loss_dfl = 1.4724
│ ├── Best until now = 0.7688 (↗ 0.7036)
│ └── Epoch N-1 = 1.3731 (↗ 0.0993)
├── Ppyoloeloss/loss_iou = 0.2979
│ ├── Best until now = 0.1329 (↗ 0.165)
│ └── Epoch N-1 = 0.2713 (↗ 0.0266)
├── Precision@0.50 = 0.0726
│ ├── Best until now = 0.2307 (↘ -0.1581)
│ └── Epoch N-1 = 0.0392 (↗ 0.0334)
└── Recall@0.50 = 0.0271
├── Best until now = 0.8663 (↘ -0.8393)
└── Epoch N-1 = 0.102 (↘ -0.075)
===========================================================
Train epoch 6: 100%|██████████| 1/1 [00:01<00:00, 1.78s/it, PPYoloELoss/loss=2.34, PPYoloELoss/loss_cls=1.19, PPYoloELoss/loss_dfl=1.14, PPYoloELoss/loss_iou=0.229, gpu_mem=8.88]
Validation epoch 6: 100%|██████████| 1/1 [00:00<00:00, 2.76it/s]
===========================================================
SUMMARY OF EPOCH 6
├── Training
│ ├── Ppyoloeloss/loss = 2.3372
│ │ ├── Best until now = 1.9485 (↗ 0.3887)
│ │ └── Epoch N-1 = 2.1937 (↗ 0.1435)
│ ├── Ppyoloeloss/loss_cls = 1.1938
│ │ ├── Best until now = 0.9977 (↗ 0.1961)
│ │ └── Epoch N-1 = 1.0936 (↗ 0.1002)
│ ├── Ppyoloeloss/loss_dfl = 1.142
│ │ ├── Best until now = 0.9206 (↗ 0.2214)
│ │ └── Epoch N-1 = 1.0847 (↗ 0.0573)
│ └── Ppyoloeloss/loss_iou = 0.229
│ ├── Best until now = 0.1962 (↗ 0.0328)
│ └── Epoch N-1 = 0.2231 (↗ 0.0059)
└── Validation
├── F1@0.50 = 0.0077
│ ├── Best until now = 0.3248 (↘ -0.3171)
│ └── Epoch N-1 = 0.0153 (↘ -0.0076)
├── Map@0.50 = 0.0119
│ ├── Best until now = 0.6509 (↘ -0.639)
│ └── Epoch N-1 = 0.0094 (↗ 0.0025)
├── Ppyoloeloss/loss = 3.5236
│ ├── Best until now = 1.4454 (↗ 2.0782)
│ └── Epoch N-1 = 3.423 (↗ 0.1006)
├── Ppyoloeloss/loss_cls = 1.9936
│ ├── Best until now = 0.7288 (↗ 1.2648)
│ └── Epoch N-1 = 1.9421 (↗ 0.0515)
├── Ppyoloeloss/loss_dfl = 1.4912
│ ├── Best until now = 0.7688 (↗ 0.7224)
│ └── Epoch N-1 = 1.4724 (↗ 0.0188)
├── Ppyoloeloss/loss_iou = 0.3138
│ ├── Best until now = 0.1329 (↗ 0.1809)
│ └── Epoch N-1 = 0.2979 (↗ 0.0159)
├── Precision@0.50 = 0.0158
│ ├── Best until now = 0.2307 (↘ -0.2149)
│ └── Epoch N-1 = 0.0726 (↘ -0.0568)
└── Recall@0.50 = 0.0236
├── Best until now = 0.8663 (↘ -0.8428)
└── Epoch N-1 = 0.0271 (↘ -0.0035)
===========================================================
Train epoch 7: 100%|██████████| 1/1 [00:01<00:00, 1.66s/it, PPYoloELoss/loss=2.26, PPYoloELoss/loss_cls=1.12, PPYoloELoss/loss_dfl=1.08, PPYoloELoss/loss_iou=0.237, gpu_mem=8.97]
Validation epoch 7: 100%|██████████| 1/1 [00:00<00:00, 3.11it/s]
===========================================================
SUMMARY OF EPOCH 7
├── Training
│ ├── Ppyoloeloss/loss = 2.2562
│ │ ├── Best until now = 1.9485 (↗ 0.3077)
│ │ └── Epoch N-1 = 2.3372 (↘ -0.081)
│ ├── Ppyoloeloss/loss_cls = 1.1237
│ │ ├── Best until now = 0.9977 (↗ 0.126)
│ │ └── Epoch N-1 = 1.1938 (↘ -0.0701)
│ ├── Ppyoloeloss/loss_dfl = 1.0786
│ │ ├── Best until now = 0.9206 (↗ 0.158)
│ │ └── Epoch N-1 = 1.142 (↘ -0.0634)
│ └── Ppyoloeloss/loss_iou = 0.2373
│ ├── Best until now = 0.1962 (↗ 0.0411)
│ └── Epoch N-1 = 0.229 (↗ 0.0083)
└── Validation
├── F1@0.50 = 0.0062
│ ├── Best until now = 0.3248 (↘ -0.3186)
│ └── Epoch N-1 = 0.0077 (↘ -0.0015)
├── Map@0.50 = 0.0053
│ ├── Best until now = 0.6509 (↘ -0.6456)
│ └── Epoch N-1 = 0.0119 (↘ -0.0066)
├── Ppyoloeloss/loss = 3.5213
│ ├── Best until now = 1.4454 (↗ 2.0759)
│ └── Epoch N-1 = 3.5236 (↘ -0.0023)
├── Ppyoloeloss/loss_cls = 1.985
│ ├── Best until now = 0.7288 (↗ 1.2563)
│ └── Epoch N-1 = 1.9936 (↘ -0.0086)
├── Ppyoloeloss/loss_dfl = 1.4429
│ ├── Best until now = 0.7688 (↗ 0.6741)
│ └── Epoch N-1 = 1.4912 (↘ -0.0483)
├── Ppyoloeloss/loss_iou = 0.326
│ ├── Best until now = 0.1329 (↗ 0.193)
│ └── Epoch N-1 = 0.3138 (↗ 0.0122)
├── Precision@0.50 = 0.0144
│ ├── Best until now = 0.2307 (↘ -0.2162)
│ └── Epoch N-1 = 0.0158 (↘ -0.0014)
└── Recall@0.50 = 0.0171
├── Best until now = 0.8663 (↘ -0.8492)
└── Epoch N-1 = 0.0236 (↘ -0.0064)
===========================================================
Train epoch 8: 100%|██████████| 1/1 [00:01<00:00, 1.70s/it, PPYoloELoss/loss=2.02, PPYoloELoss/loss_cls=1.01, PPYoloELoss/loss_dfl=1.01, PPYoloELoss/loss_iou=0.201, gpu_mem=8.76]
Validation epoch 8: 100%|██████████| 1/1 [00:00<00:00, 2.88it/s]
===========================================================
SUMMARY OF EPOCH 8
├── Training
│ ├── Ppyoloeloss/loss = 2.0161
│ │ ├── Best until now = 1.9485 (↗ 0.0677)
│ │ └── Epoch N-1 = 2.2562 (↘ -0.24)
│ ├── Ppyoloeloss/loss_cls = 1.0092
│ │ ├── Best until now = 0.9977 (↗ 0.0115)
│ │ └── Epoch N-1 = 1.1237 (↘ -0.1145)
│ ├── Ppyoloeloss/loss_dfl = 1.0092
│ │ ├── Best until now = 0.9206 (↗ 0.0886)
│ │ └── Epoch N-1 = 1.0786 (↘ -0.0694)
│ └── Ppyoloeloss/loss_iou = 0.2009
│ ├── Best until now = 0.1962 (↗ 0.0047)
│ └── Epoch N-1 = 0.2373 (↘ -0.0363)
└── Validation
├── F1@0.50 = 0.0107
│ ├── Best until now = 0.3248 (↘ -0.3141)
│ └── Epoch N-1 = 0.0062 (↗ 0.0045)
├── Map@0.50 = 0.006
│ ├── Best until now = 0.6509 (↘ -0.6449)
│ └── Epoch N-1 = 0.0053 (↗ 0.0007)
├── Ppyoloeloss/loss = 3.6864
│ ├── Best until now = 1.4454 (↗ 2.241)
│ └── Epoch N-1 = 3.5213 (↗ 0.1651)
├── Ppyoloeloss/loss_cls = 2.1107
│ ├── Best until now = 0.7288 (↗ 1.382)
│ └── Epoch N-1 = 1.985 (↗ 0.1257)
├── Ppyoloeloss/loss_dfl = 1.4626
│ ├── Best until now = 0.7688 (↗ 0.6938)
│ └── Epoch N-1 = 1.4429 (↗ 0.0198)
├── Ppyoloeloss/loss_iou = 0.3377
│ ├── Best until now = 0.1329 (↗ 0.2048)
│ └── Epoch N-1 = 0.326 (↗ 0.0118)
├── Precision@0.50 = 0.0087
│ ├── Best until now = 0.2307 (↘ -0.222)
│ └── Epoch N-1 = 0.0144 (↘ -0.0057)
└── Recall@0.50 = 0.0171
├── Best until now = 0.8663 (↘ -0.8492)
└── Epoch N-1 = 0.0171 (= 0.0)
===========================================================
Train epoch 9: 100%|██████████| 1/1 [00:01<00:00, 1.77s/it, PPYoloELoss/loss=2.06, PPYoloELoss/loss_cls=1.01, PPYoloELoss/loss_dfl=1.05, PPYoloELoss/loss_iou=0.213, gpu_mem=8.69]
Validation epoch 9: 100%|██████████| 1/1 [00:00<00:00, 2.76it/s]
===========================================================
SUMMARY OF EPOCH 9
├── Training
│ ├── Ppyoloeloss/loss = 2.0629
│ │ ├── Best until now = 1.9485 (↗ 0.1144)
│ │ └── Epoch N-1 = 2.0161 (↗ 0.0468)
│ ├── Ppyoloeloss/loss_cls = 1.0065
│ │ ├── Best until now = 0.9977 (↗ 0.0088)
│ │ └── Epoch N-1 = 1.0092 (↘ -0.0027)
│ ├── Ppyoloeloss/loss_dfl = 1.05
│ │ ├── Best until now = 0.9206 (↗ 0.1294)
│ │ └── Epoch N-1 = 1.0092 (↗ 0.0408)
│ └── Ppyoloeloss/loss_iou = 0.2126
│ ├── Best until now = 0.1962 (↗ 0.0164)
│ └── Epoch N-1 = 0.2009 (↗ 0.0116)
└── Validation
├── F1@0.50 = 0.0106
│ ├── Best until now = 0.3248 (↘ -0.3142)
│ └── Epoch N-1 = 0.0107 (↘ -0.0)
├── Map@0.50 = 0.006
│ ├── Best until now = 0.6509 (↘ -0.6449)
│ └── Epoch N-1 = 0.006 (↘ -0.0)
├── Ppyoloeloss/loss = 3.8045
│ ├── Best until now = 1.4454 (↗ 2.3591)
│ └── Epoch N-1 = 3.6864 (↗ 0.1181)
├── Ppyoloeloss/loss_cls = 2.2185
│ ├── Best until now = 0.7288 (↗ 1.4898)
│ └── Epoch N-1 = 2.1107 (↗ 0.1078)
├── Ppyoloeloss/loss_dfl = 1.5018
│ ├── Best until now = 0.7688 (↗ 0.733)
│ └── Epoch N-1 = 1.4626 (↗ 0.0392)
├── Ppyoloeloss/loss_iou = 0.334
│ ├── Best until now = 0.1329 (↗ 0.2011)
│ └── Epoch N-1 = 0.3377 (↘ -0.0037)
├── Precision@0.50 = 0.0253
│ ├── Best until now = 0.2307 (↘ -0.2053)
│ └── Epoch N-1 = 0.0087 (↗ 0.0166)
└── Recall@0.50 = 0.0109
├── Best until now = 0.8663 (↘ -0.8554)
└── Epoch N-1 = 0.0171 (↘ -0.0062)
===========================================================
[2023-08-10 14:52:58] INFO - sg_trainer.py - RUNNING ADDITIONAL TEST ON THE AVERAGED MODEL...
Validation epoch 10: 100%|██████████| 1/1 [00:00<00:00, 2.70it/s]
===========================================================
SUMMARY OF EPOCH 10
├── Training
│ ├── Ppyoloeloss/loss = 2.0629
│ │ ├── Best until now = 1.9485 (↗ 0.1144)
│ │ └── Epoch N-1 = 2.0161 (↗ 0.0468)
│ ├── Ppyoloeloss/loss_cls = 1.0065
│ │ ├── Best until now = 0.9977 (↗ 0.0088)
│ │ └── Epoch N-1 = 1.0092 (↘ -0.0027)
│ ├── Ppyoloeloss/loss_dfl = 1.05
│ │ ├── Best until now = 0.9206 (↗ 0.1294)
│ │ └── Epoch N-1 = 1.0092 (↗ 0.0408)
│ └── Ppyoloeloss/loss_iou = 0.2126
│ ├── Best until now = 0.1962 (↗ 0.0164)
│ └── Epoch N-1 = 0.2009 (↗ 0.0116)
└── Validation
├── F1@0.50 = 0.0568
│ ├── Best until now = 0.3248 (↘ -0.268)
│ └── Epoch N-1 = 0.0106 (↗ 0.0462)
├── Map@0.50 = 0.0732
│ ├── Best until now = 0.6509 (↘ -0.5776)
│ └── Epoch N-1 = 0.006 (↗ 0.0673)
├── Ppyoloeloss/loss = 3.0419
│ ├── Best until now = 1.4454 (↗ 1.5965)
│ └── Epoch N-1 = 3.8045 (↘ -0.7626)
├── Ppyoloeloss/loss_cls = 1.8235
│ ├── Best until now = 0.7288 (↗ 1.0947)
│ └── Epoch N-1 = 2.2185 (↘ -0.395)
├── Ppyoloeloss/loss_dfl = 1.2336
│ ├── Best until now = 0.7688 (↗ 0.4648)
│ └── Epoch N-1 = 1.5018 (↘ -0.2683)
├── Ppyoloeloss/loss_iou = 0.2407
│ ├── Best until now = 0.1329 (↗ 0.1078)
│ └── Epoch N-1 = 0.334 (↘ -0.0934)
├── Precision@0.50 = 0.1192
│ ├── Best until now = 0.2307 (↘ -0.1115)
│ └── Epoch N-1 = 0.0253 (↗ 0.0938)
└── Recall@0.50 = 0.1023
├── Best until now = 0.8663 (↘ -0.764)
└── Epoch N-1 = 0.0109 (↗ 0.0915)
===========================================================
[2023-08-10 14:52:59] INFO - base_sg_logger.py - [CLEANUP] - Successfully stopped system monitoring process
Process finished with exit code 0