Retraining Question (Mix of Pre-trained + new datasets)

Open PuneetKohli opened this issue 6 years ago • 3 comments

Hey guys,

I've got the retraining script to work on one of my custom datasets.

My requirements are as follows:

  • Use the pedestrian, traffic light, and vehicle classes from the COCO-pretrained model
  • Train on a new dataset X
  • Train on another new dataset Y

Now, I have a few questions:

  1. How can I keep the pedestrians, etc. from the pre-trained model when using 'retrain_yolo.py'?
  2. If I train YOLO on dataset X and then want to train it on dataset Y, do I need to create a new 'npz' containing both X and Y? Or will training on X, followed by Y, keep both in the model?

Let me know if my question is not clear. Thank you!

PuneetKohli avatar Apr 09 '18 04:04 PuneetKohli

  1. retrain_yolo.py does not have the ability to exclude specific categories from training. You will need to modify it or create a custom dataset with those classes excluded. A good solution might be to create a custom data loading function that only loads data into memory a few pieces at a time, and only selects the categories that you want to use.

  2. retrain_yolo.py assumes the input model is the one with the default pretrained weights from Darknet. To save your progress and continue training on a new dataset, you will have to modify the script.

    1. edit: maybe you can get away with just changing the name of the saved model, and making sure it saves a full .h5 so you can recreate the model easily...
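
For point 1, here is a minimal sketch of the "only select the categories you want" idea. It assumes each image's boxes are stored as rows of [class_id, x_min, y_min, x_max, y_max]; that layout, the function name filter_and_remap, and the class ids in the example are assumptions for illustration, so check them against your own npz and classes file before using it.

```python
import numpy as np


def filter_and_remap(boxes_per_image, keep_class_ids):
    """Keep only boxes of the wanted classes and renumber them 0..K-1.

    boxes_per_image: list of arrays shaped (n_i, 5); each row is assumed to be
    [class_id, x_min, y_min, x_max, y_max].
    keep_class_ids: class ids to keep, in the order they should be renumbered.
    """
    # Map each original class id to its new, contiguous id.
    remap = {old: new for new, old in enumerate(keep_class_ids)}
    filtered = []
    for boxes in boxes_per_image:
        boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 5)
        # Select the rows whose class id is one of the wanted ids.
        mask = np.isin(boxes[:, 0], keep_class_ids)
        kept = boxes[mask].copy()
        # Rewrite the class column with the new ids.
        kept[:, 0] = [remap[int(c)] for c in kept[:, 0]]
        filtered.append(kept)
    return filtered


# Hypothetical example: two images, keeping only class ids 2 and 9.
example_boxes = [
    np.array([[0, 1, 2, 3, 4], [2, 5, 6, 7, 8], [9, 1, 1, 2, 2]]),
    np.array([[5, 0, 0, 1, 1]]),
]
example_filtered = filter_and_remap(example_boxes, keep_class_ids=[2, 9])
```

Images that end up with no boxes are returned as empty (0, 5) arrays, so you can decide separately whether to drop them or keep them as negatives. The classes file passed to the script would also need to list only the kept classes, in the same order as keep_class_ids.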

Sorry, I have not had much time to work on expanding functionality, and the repo owner (@allanzelener) seems to be inactive for now.

alecGraves avatar Apr 09 '18 18:04 alecGraves

Hi @shadySource, thanks for your reply. I'm still not clear based on your comments.

  1. So I would have to keep the original 80 classes from COCO, but I can additionally add new classes?
  2. If I understand correctly, I would need to create a new dataset with all of my classes in it for re-training?

PuneetKohli avatar Apr 09 '18 20:04 PuneetKohli

@PuneetKohli I am also trying to create a new dataset to train on, but I don't fully understand what format retrain_yolo.py expects the training data in. How do I feed my images and my annotation files into retrain_yolo.py? Should the images and annotation XML files be in two different folders, and should corresponding image and XML files have the same name? Also, retrain_yolo.py expects a data_path, classes_path and anchors_path. How do these correspond to my images and my annotation XMLs?

Nirvan101 avatar Jun 14 '18 08:06 Nirvan101