# caltech-pedestrian-dataset-to-yolo-format-converter
Converts the format of the Caltech Pedestrian Dataset to the format that YOLO uses.
This repo is adapted from:
- https://github.com/mitmul/caltech-pedestrian-dataset-converter
- https://pjreddie.com/media/files/voc_label.py
## dependencies
- opencv
- numpy
- scipy
## how to
- Convert the `.seq` video files to `.png` frames by running `$ python generate-images.py`. They will end up in the `images` folder.
- Squared images work better, which is why you can convert the 640x480 frames to 640x640 frames by running `$ python squarify-images.py`.
- Convert the `.vbb` annotation files to `.txt` files by running `$ python generate-annotation.py`. It will create the `labels` folder that contains the `.txt` files named like the frames, as well as the `train.txt` and `test.txt` files that contain the paths to the images.
- Adjust the `.data` YOLO file.
- Adjust the `.cfg` YOLO file: take e.g. `yolo-voc.2.0.cfg` and set `height = 640`, `width = 640`, `classes = 2`, and in the final layer `filters = 35` (= (classes + 5) * 5).
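The squarifying step pads each 640x480 frame out to 640x640. The actual `squarify-images.py` isn't shown here, but a minimal sketch of one plausible approach (assuming centered black bars added above and below the frame) looks like this:

```python
import numpy as np

def squarify(frame):
    """Pad a landscape frame with black bars to a square (centered vertically).

    Hypothetical helper for illustration; the repo's script may pad
    differently (e.g. bars only at the bottom, or letterboxing via OpenCV).
    """
    h, w = frame.shape[:2]
    pad = w - h          # 160 extra rows for a 640x480 input
    top = pad // 2
    bottom = pad - top
    # np.pad with mode="constant" adds zero-valued (black) rows
    return np.pad(frame, ((top, bottom), (0, 0), (0, 0)), mode="constant")

frame = np.zeros((480, 640, 3), dtype=np.uint8)
print(squarify(frame).shape)  # (640, 640, 3)
```

Note that if you pad the images, the bounding-box y coordinates in the labels must be shifted by the same top padding.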
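The generated `.txt` label files follow YOLO's convention: one line per object, with a class id followed by a box normalized by the image size. A minimal sketch of that coordinate conversion (the real `generate-annotation.py` may differ in details such as class mapping and rounding), assuming Caltech-style absolute `(x, y, w, h)` boxes with a top-left origin:

```python
def to_yolo(box, img_w=640, img_h=640):
    """Convert an absolute (x, y, w, h) top-left box to YOLO's
    normalized (x_center, y_center, width, height) tuple."""
    x, y, w, h = box
    return ((x + w / 2) / img_w,   # x center, fraction of image width
            (y + h / 2) / img_h,   # y center, fraction of image height
            w / img_w,             # box width, fraction of image width
            h / img_h)             # box height, fraction of image height

# A 64x128 pedestrian box whose top-left corner sits at (320, 256):
print(to_yolo((320, 256, 64, 128)))  # (0.55, 0.5, 0.1, 0.2)
```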
## folder structure
```
|- caltech
|-- annotations
|-- test06
|--- V000.seq
|--- ...
|-- ...
|-- train00
|-- ...
|- caltech-for-yolo (this repo, cd)
|-- generate-images.py
|-- generate-annotation.py
|-- images
|-- labels
|-- test.txt
|-- train.txt
```