YAD2K
Retraining Question (Mix of Pre-trained + new datasets)
Hey guys,
I've got the retraining script to work on one of my custom datasets.
My requirements are as follows:
- Use the pedestrians, traffic lights, and vehicles from the pre-trained COCO dataset
- Train on a new dataset X
- Train on another new dataset Y
Now, I have a few questions:
- How can I keep the pedestrians, etc. from the pre-trained dataset when using 'retrain_yolo.py'?
- If I train YOLO on dataset X and then want to train it on dataset Y, do I need to create a new 'npz' containing both X and Y? Or will training on X, followed by Y, keep both of them in the model?
Let me know if my question is still not clear. Thank you!
- `retrain_yolo.py` does not have the ability to exclude specific categories from training. You will need to modify it or create a custom dataset with those classes excluded. A good solution might be to create a custom data loading function that only loads data into memory a few pieces at a time, and only selects the categories that you want to use.
- `retrain_yolo.py` assumes the input model is the one with default pretrained weights from darknet. To save your progress and continue training on a new dataset, you will have to modify the script.
  - edit: maybe you can get away with just changing the name of the saved model, and making sure it saves a full .h5 so you can recreate the model easily...
Sorry, I have not had too much time to work on expanding functionality, and the repo owner seems to be inactive for now @allanzelener
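To make the "only select the categories that you want" suggestion concrete, here is a minimal sketch of the filtering step. It is not code from the repo: `filter_and_remap` is a hypothetical helper, and it assumes each box is stored as a row of `[class_id, x_min, y_min, x_max, y_max]`, which you should check against your own dataset. It works on plain Python lists so it does not depend on how the data is loaded.

```python
def filter_and_remap(boxes_per_image, class_names, keep_names):
    """Keep only boxes whose class is in keep_names and renumber class ids.

    boxes_per_image: one list per image, each holding rows of
        [class_id, x_min, y_min, x_max, y_max]  (assumed layout).
    class_names: original class list; index == original class id.
    keep_names: collection of class names to keep.
    Returns (filtered_boxes_per_image, new_class_names).
    """
    # New class list keeps the original ordering of the kept classes.
    new_names = [name for name in class_names if name in keep_names]
    # Map each kept original id to its new, contiguous id.
    old_to_new = {class_names.index(name): i for i, name in enumerate(new_names)}

    filtered = []
    for rows in boxes_per_image:
        kept = [
            [old_to_new[int(row[0])]] + list(row[1:])
            for row in rows
            if int(row[0]) in old_to_new
        ]
        # Images that lose all their boxes are kept as empty lists so the
        # result stays aligned with the image list; drop them later if needed.
        filtered.append(kept)
    return filtered, new_names


# Example with a made-up four-class list.
class_names = ["person", "bicycle", "car", "traffic light"]
boxes = [
    [[0, 10, 20, 50, 80], [1, 5, 5, 30, 30]],
    [[3, 0, 0, 10, 10], [2, 1, 2, 3, 4]],
]
filtered, names = filter_and_remap(
    boxes, class_names, {"person", "car", "traffic light"}
)
```

The new class list would then be written out as the classes file used for retraining, so that class ids in the boxes and class names stay in sync.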
Hi @shadySource, thanks for your reply. I'm still not entirely clear based on your comments.
- So I would have to keep the original 80 classes from COCO, but I can additionally add new classes?
- If I understand correctly, I would need to create a new dataset with all of my classes in it for re-training?
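One possible reading of "a new dataset with all of my classes in it" is to merge the label sets of X and Y into a single class list and remap the class ids before saving a combined dataset. The sketch below only illustrates that remapping. `merge_datasets` is a hypothetical helper, not part of YAD2K, and it again assumes boxes are rows of `[class_id, x_min, y_min, x_max, y_max]`.

```python
def merge_datasets(boxes_x, classes_x, boxes_y, classes_y):
    """Combine two labelled datasets under one shared class list.

    boxes_*: one list per image of [class_id, x_min, y_min, x_max, y_max]
        rows (assumed layout); class_id indexes into the matching classes_*.
    Returns (merged_boxes, merged_classes). The images themselves must be
    concatenated in the same order (all of X, then all of Y).
    """
    # Classes of X first, then any classes that only appear in Y.
    merged_classes = list(classes_x) + [c for c in classes_y if c not in classes_x]
    index = {name: i for i, name in enumerate(merged_classes)}

    def remap(boxes, classes):
        # Translate old id -> class name -> id in the merged list.
        return [
            [[index[classes[int(row[0])]]] + list(row[1:]) for row in rows]
            for rows in boxes
        ]

    return remap(boxes_x, classes_x) + remap(boxes_y, classes_y), merged_classes


# Example with made-up class names; "buoy" is shared by both datasets.
merged_boxes, merged_classes = merge_datasets(
    [[[1, 0, 0, 5, 5]]], ["boat", "buoy"],
    [[[0, 1, 1, 2, 2], [1, 3, 3, 4, 4]]], ["buoy", "fish"],
)
```

One caveat with this approach: if images in Y contain objects from X's classes that were never annotated, training will treat them as background, so the merged labels are only as complete as the original annotations.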
@PuneetKohli I am also trying to create a new dataset for it to train on, but I don't fully understand how `retrain_yolo.py` expects the training data. How do I feed my images and my annotation files into `retrain_yolo.py`? Should the images and annotation XML files be in two different folders, and should corresponding images and XML files have the same name? Also, `retrain_yolo.py` expects a `data_path`, `classes_path`, and `anchors_path`. How do these correspond to my images and my annotation XMLs?