super-gradients
Add more labels to a custom trained model
💡 Your Question
I have a custom trained model. It was trained on 13,000 images and labels and took 12 hours to train.
I want to add more training data (images and labels)
Is there a way to incrementally add a small amount of additional training data and re-run training without it taking 12 hours to complete?
You can always load the weights of the model trained in the previous step and continue training from that state.
model = models.get(...., checkpoint_path=<ABSOLUTE_PATH_TO_CHECKPOINT_FROM_PREVIOUS_TRAINING>)
Or via cmd-line if you are using YAML recipes: python -m super_gradients.train_from_recipe --config-name=YOUR_RECIPES checkpoint_params.checkpoint_path=<ABSOLUTE_PATH_TO_CHECKPOINT_FROM_PREVIOUS_TRAINING>
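Since the override above expects an absolute checkpoint path, here is a minimal sketch of building that command from a possibly relative path. The helper name build_resume_command is hypothetical and not part of super-gradients; only the module name and the checkpoint_params.checkpoint_path override come from the reply above.

```python
from pathlib import Path


def build_resume_command(config_name, checkpoint_path):
    """Build the argv list for continuing training from a checkpoint via a YAML recipe.

    Hypothetical helper: it only assembles the command quoted in the reply above
    and converts the checkpoint path to an absolute one.
    """
    abs_ckpt = Path(checkpoint_path).expanduser().resolve()
    return [
        "python",
        "-m",
        "super_gradients.train_from_recipe",
        f"--config-name={config_name}",
        f"checkpoint_params.checkpoint_path={abs_ckpt}",
    ]


# Placeholder recipe name and checkpoint location, for illustration only.
cmd = build_resume_command("YOUR_RECIPES", "custommodel/ckpt_best.pth")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) or printed and pasted into a shell.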
Hi @BloodAxe
Thanks for the suggestion. Is this what you mean?
model = models.get("yolo_nas_l", num_classes=2, checkpoint_path=r"custommodel/ckpt_best.pth").cuda()
trainer.train(model=model, training_params=train_params, train_loader=train_data, valid_loader=val_data)
The re-trained custom model is not giving me the results I expect.
The original custom model predicts correctly, i.e. it identifies an object with 0.9 confidence.
However, when I run the same prediction on the re-trained custom model, I don't get any prediction.
I wonder if I'm missing something in my training_params:
train_params = {
    # ENABLING SILENT MODE
    'silent_mode': False,
    "average_best_models": True,
    "warmup_mode": "linear_epoch_step",
    "warmup_initial_lr": 1e-6,
    "lr_warmup_epochs": 3,
    "initial_lr": 5e-4,
    "lr_mode": "cosine",
    "cosine_final_lr_ratio": 0.1,
    "optimizer": "Adam",
    "optimizer_params": {"weight_decay": 0.0001},
    "zero_weight_decay_on_bias_and_bn": True,
    "ema": True,
    "ema_params": {"decay": 0.9, "decay_type": "threshold"},
    "max_epochs": EPOCHS,
    "mixed_precision": True,
    "loss": PPYoloELoss(
        # NOTE: num_classes needs to be defined here
    "valid_metrics_list": [
        # NOTE: num_classes needs to be defined here
    "metric_to_watch": 'mAP@0.50'
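When continuing from an already trained checkpoint, a common heuristic is to run fewer epochs with a smaller learning rate than in the original run. The sketch below derives such a dictionary from the keys shown above. The helper make_finetune_params, the scaling factor, the epoch counts, and the base max_epochs value are all assumptions for illustration; nothing in this thread confirms that these settings resolve the missing predictions.

```python
def make_finetune_params(base_params, lr_scale=0.1, max_epochs=10, warmup_epochs=1):
    """Return a copy of base_params adjusted for a short fine-tuning run.

    Hypothetical helper: the default values are illustrative, not recommendations.
    """
    params = dict(base_params)  # shallow copy, so the original dict is left untouched
    params["initial_lr"] = base_params["initial_lr"] * lr_scale
    params["max_epochs"] = max_epochs
    params["lr_warmup_epochs"] = warmup_epochs
    return params


# Subset of the keys from the post above; max_epochs=100 is an assumed value for EPOCHS.
base = {
    "initial_lr": 5e-4,
    "lr_warmup_epochs": 3,
    "max_epochs": 100,
    "lr_mode": "cosine",
}
finetune = make_finetune_params(base)
print(finetune)
```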
The provided snippet is not enough to help. Please show the rest of the code, including the data loader preparation (before and after you add more data), and TensorBoard plots for both the regular training and the training with additional data.
I'm using an AWS SageMaker training job to train the model.
Here is the code I use to create the training job:
from sagemaker.estimator import Estimator
from sagemaker.pytorch import PyTorch
from sagemaker.session import TrainingInput
train_input = TrainingInput(dataset_s3_uri)
estimator = PyTorch(
    entry_point="",
    role=role,
    source_dir="./yolo-nas-model-scripts",
    instance_count=1,
    instance_type='ml.g4dn.12xlarge',
    framework_version="1.13.1",
    py_version="py39",
    sagemaker_session=sagemaker_session,
    input_mode='File',  # FastFile causes an issue with writing the label cache
    output_path=dataset_s3_uri+'/output',
), job_name=job_name)
Attached, renamed to train.txt.
Is there a way to incrementally add a small amount of additional training data and re-run training without it taking 12 hours to complete?
My original question quoted above ^^
I have seen in the following discussion:
Currently, YOLOv8 does not have a feature for incremental learning
Is the same true of YOLO-NAS?
So what you are looking for is continual learning: a technique that allows training a model on a few new data samples without forgetting its existing knowledge.
Unfortunately, we don't support this at the moment.
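In the absence of built-in support, one generic workaround is rehearsal: fine-tuning on all of the new samples mixed with a random subset of the old ones, so the model keeps seeing earlier data. The sketch below only builds such a mixed file list. It is not a super-gradients feature, and the helper name, the 20% replay fraction, and the count of 500 new images are assumptions for illustration.

```python
import random


def build_rehearsal_set(old_samples, new_samples, old_fraction=0.2, seed=0):
    """Mix all new samples with a random subset of old ones, then shuffle.

    Hypothetical helper: old_fraction controls how much of the old data is replayed.
    """
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    old_samples = list(old_samples)
    n_old = min(len(old_samples), round(len(old_samples) * old_fraction))
    replay = rng.sample(old_samples, n_old)
    mixed = replay + list(new_samples)
    rng.shuffle(mixed)
    return mixed


# 13,000 matches the original dataset size; 500 new images is an assumed figure.
old = [f"old_{i}.jpg" for i in range(13000)]
new = [f"new_{i}.jpg" for i in range(500)]
mixed = build_rehearsal_set(old, new)
print(len(mixed))
```

A smaller old_fraction shortens each epoch but gives the model less exposure to the earlier data, so the right value would have to be found by validating on both old and new samples.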