Saving yolov8 checkpoints based on staged recipe phase

Open corey-nm opened this issue 2 years ago • 0 comments

NOTE: this PR is targeting torchvision-phases branch, not main!

Similar to #1499, this PR updates the checkpoint-saving logic so that each checkpoint's file name has the model's current phase appended. For example, here is the weights directory from the test run used for this PR:

best_dense.pt
best_pruned.pt
best_pruned_quantized.pt
last_dense.pt
last_pruned.pt
last_pruned_quantized.pt
last.pt
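
The naming pattern in this listing can be sketched as follows. This is only an illustration, not the code from this PR: `phase_checkpoint_name` is a hypothetical helper, and the only thing taken from the listing is the `<stem>_<phase>.pt` pattern.

```python
from pathlib import Path
from typing import Optional


def phase_checkpoint_name(filename: str, phase: Optional[str]) -> str:
    """Append the phase to a checkpoint file name (hypothetical helper).

    With no phase, the name is returned unchanged, which matches the
    plain last.pt entry in the listing.
    """
    p = Path(filename)
    if not phase:
        return p.name
    return f"{p.stem}_{phase}{p.suffix}"


# Rebuild the phase-specific names from the listing.
names = [
    phase_checkpoint_name(base, phase)
    for base in ("best.pt", "last.pt")
    for phase in ("dense", "pruned", "pruned_quantized")
]
```

Under these assumptions, `names` holds the six phase-specific file names shown above, and `phase_checkpoint_name("last.pt", None)` gives back `last.pt`.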

Two changes to the yolov8 checkpoints:

The fitness saved in the file is now specific to the phase.
The phase is also saved in the checkpoint.
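
A rough sketch of what these two changes could look like, assuming the best fitness is tracked separately per phase. `PhaseFitnessTracker` and `build_checkpoint` are hypothetical names, the dictionary keys are assumptions, and the fitness values are made up for illustration; none of this is taken from the PR's diff.

```python
from typing import Dict, Tuple


class PhaseFitnessTracker:
    """Tracks the best fitness separately for each phase (illustrative only)."""

    def __init__(self) -> None:
        self.best: Dict[str, float] = {}

    def update(self, phase: str, fitness: float) -> bool:
        """Record a fitness value; return True if it is a new best for this phase."""
        previous = self.best.get(phase)
        if previous is None or fitness > previous:
            self.best[phase] = fitness
            return True
        return False


def build_checkpoint(
    epoch: int, phase: str, fitness: float, tracker: PhaseFitnessTracker
) -> Tuple[dict, bool]:
    """Build a checkpoint dict carrying the phase and that phase's best fitness."""
    is_best = tracker.update(phase, fitness)
    ckpt = {
        "epoch": epoch,
        "phase": phase,  # assumed key name for the saved phase
        "best_fitness": tracker.best[phase],  # best within this phase only
    }
    return ckpt, is_best


tracker = PhaseFitnessTracker()
dense_ckpt, dense_is_best = build_checkpoint(4, "dense", 0.60, tracker)
# A lower fitness can still be the best of a new phase.
pruned_ckpt, pruned_is_best = build_checkpoint(5, "pruned", 0.55, tracker)
```

The point of keeping a separate best per phase is that a pruned or quantized model is not compared against the dense model's fitness, so each phase can produce its own `best_<phase>.pt`.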

Test plan

Command:

sparseml.ultralytics.train --model yolov8s.pt --recipe <recipe below>

Recipe:

version: 1.1.0

training_modifiers:
  - !EpochRangeModifier
    start_epoch: 0
    end_epoch: 20

pruning_modifiers:
  - !GMPruningModifier
    init_sparsity: 0.05
    final_sparsity: 0.5
    start_epoch: 5.0
    end_epoch: 10.0
    update_frequency: 1.0
    params: [model.0.conv.weight]

quantization_modifiers:
  - !QuantizationModifier
    start_epoch: 15.0
    freeze_bn_stats_epoch: 15.25
    disable_quantization_observer_epoch: 15.5
    ignore: ["Upsample", "Concat", "SiLU"]

Here are the phases that are in the checkpoints at the end of training:

best_dense.pt epoch=4 phase=dense
last_dense.pt epoch=4 phase=dense

best_pruned.pt epoch=10 phase=pruned
last_pruned.pt epoch=14 phase=pruned

best_pruned_quantized.pt epoch=15 phase=pruned_quantized
last_pruned_quantized.pt epoch=-1 phase=pruned_quantized

last.pt epoch=-1 phase=pruned_quantized
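
The epochs in this listing line up with the `start_epoch` values in the recipe: pruning starts at epoch 5 and quantization at epoch 15. A minimal sketch of that mapping is below. How SparseML actually determines the phase is not shown in this description, so `phase_at_epoch` is a hypothetical function and the threshold logic is an assumption; it also does not cover the `-1` epoch entries.

```python
# Values copied from the recipe in the test plan.
PRUNING_START_EPOCH = 5.0        # GMPruningModifier start_epoch
QUANTIZATION_START_EPOCH = 15.0  # QuantizationModifier start_epoch


def phase_at_epoch(epoch: float) -> str:
    """Map a non-negative epoch to a phase name (assumed logic, for illustration)."""
    if epoch >= QUANTIZATION_START_EPOCH:
        return "pruned_quantized"
    if epoch >= PRUNING_START_EPOCH:
        return "pruned"
    return "dense"


# Epochs taken from the checkpoint listing above.
observed = {4: "dense", 10: "pruned", 14: "pruned", 15: "pruned_quantized"}
consistent = all(phase_at_epoch(e) == p for e, p in observed.items())
```

With these thresholds, the last dense checkpoint falls at epoch 4 and the last pruned checkpoint at epoch 14, which matches the listing.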

Mar 31 '23 17:03 corey-nm