pytorch-ssd

image size

Open kingvision opened this issue 6 years ago • 6 comments


If I train the model with image size = 512, how should I modify the model?

kingvision avatar Jun 23 '19 03:06 kingvision

Issue #22 explains that this is not very straightforward. After some research, I found that you can do it by modifying only the configuration file, at least for VGG. My vgg_ssd_config.py file looks like this:

import numpy as np
from vision.utils.box_utils import SSDSpec, SSDBoxSizes, generate_ssd_priors

image_size = 512
image_mean = np.array([123, 117, 104])  # RGB layout
image_std = 1.0

iou_threshold = 0.45
center_variance = 0.1
size_variance = 0.2

specs = [
    SSDSpec(64, 8, SSDBoxSizes(51, 102), [2]),
    SSDSpec(32, 16, SSDBoxSizes(102, 188), [2, 3]),
    SSDSpec(16, 32, SSDBoxSizes(188, 274), [2, 3]),
    SSDSpec(8, 64, SSDBoxSizes(274, 360), [2, 3]),
    SSDSpec(6, 124, SSDBoxSizes(360, 446), [2]),
    SSDSpec(4, 256, SSDBoxSizes(446, 532), [2])
]

priors = generate_ssd_priors(specs, image_size)

You can change the SSDBoxSizes values if you like; as far as I understand, they just define the minimum and maximum box sizes for each feature map. This lets you train an SSD with VGG16 as the backbone. I haven't tested it with other backbones, but it should be OK. The real problem is that the model becomes much heavier, because the classification and regression headers are still the same. I think you should modify them if you want to train an SSD512 without ending up with such a big model.
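A rough way to see how much heavier the 512 configuration gets is to count the priors each config produces. The sketch below assumes that generate_ssd_priors emits two square boxes per location plus two per listed aspect ratio, and that the stock 300x300 VGG config uses feature maps of 38, 19, 10, 5, 3 and 1. Neither is stated in this thread, so check both against your checkout. count_priors is a hypothetical helper, not part of the repo.

```python
def count_priors(layers):
    """Count priors for a list of (feature_map_size, aspect_ratios) pairs.

    Assumption: every feature map location gets 2 square priors
    plus 2 priors for each extra aspect ratio.
    """
    total = 0
    for feature_map_size, aspect_ratios in layers:
        boxes_per_location = 2 + 2 * len(aspect_ratios)
        total += feature_map_size ** 2 * boxes_per_location
    return total


# Assumed stock 300x300 VGG layout (not shown in this thread).
ssd300 = [(38, [2]), (19, [2, 3]), (10, [2, 3]), (5, [2, 3]), (3, [2]), (1, [2])]
# Feature map sizes and aspect ratios from the 512x512 specs above.
ssd512 = [(64, [2]), (32, [2, 3]), (16, [2, 3]), (8, [2, 3]), (6, [2]), (4, [2])]

n300 = count_priors(ssd300)
n512 = count_priors(ssd512)
print(n300, n512, round(n512 / n300, 2))
```

Under those assumptions the 512 config yields roughly 2.8 times as many priors as the 300 one, so the headers produce correspondingly more outputs per image.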

Hope this helps.

Nicolas1203 avatar Aug 05 '19 13:08 Nicolas1203

Can we set image_size to different values for height and width separately?

HongChow avatar Sep 04 '19 04:09 HongChow

Do I only need to make the config changes above in order to train SSD512? I thought there were 7 feature maps; do I need to add layers or something? I'd appreciate it if you could tell me.

KY319 avatar Sep 02 '21 14:09 KY319

Same question here. Do I need to modify the layers in specs when changing the input shape? If so, what is the "formula" for designing an SSD model based on the input shape? I'd like to implement a more general solution that adapts to perform well for an arbitrary input shape. Thanks!

SarBH avatar Apr 25 '22 16:04 SarBH
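On the "formula" question, here is a minimal sketch of how the spec numbers could be derived from the input size; it is not a confirmed answer from the maintainers. It rests on three assumptions: shrinkage is the stride of one feature map cell in input pixels (image_size / feature_map_size), prior centers are computed as (i + 0.5) / (image_size / shrinkage), and box sizes follow the linear 10%, 20%, ..., 105% schedule used for the original SSD300. All three helper names are hypothetical.

```python
def suggested_shrinkage(image_size, feature_map_sizes):
    # Assumption: shrinkage is the stride of a feature map cell in input pixels.
    return [image_size / fm for fm in feature_map_sizes]


def max_center(image_size, feature_map_size, shrinkage):
    # Largest normalized prior center on one axis, before any clamping.
    # Assumption: center = (i + 0.5) / (image_size / shrinkage).
    scale = image_size / shrinkage
    return (feature_map_size - 1 + 0.5) / scale


def box_size_edges(image_size, num_layers, min_ratio=20, max_ratio=90):
    # Assumed linear schedule: a half-size first layer, then evenly spaced
    # percentages of the image size. Returns num_layers + 1 edges.
    step = (max_ratio - min_ratio) // (num_layers - 2)
    ratios = list(range(min_ratio, max_ratio + 1, step))
    ratios = [min_ratio // 2] + ratios + [ratios[-1] + step]
    return [image_size * r / 100 for r in ratios]


feature_maps = [64, 32, 16, 8, 6, 4]          # from the 512 specs in this thread
posted_shrinkage = [8, 16, 32, 64, 124, 256]  # from the 512 specs in this thread

for fm, posted, suggested in zip(
    feature_maps, posted_shrinkage, suggested_shrinkage(512, feature_maps)
):
    print(fm, posted, round(suggested, 2), round(max_center(512, fm, posted), 3))

# Consecutive edges give (min, max) box sizes for each feature map.
edges = box_size_edges(512, len(feature_maps))
print([(round(lo), round(hi)) for lo, hi in zip(edges[:-1], edges[1:])])
```

If the center formula assumed here matches the repo, shrinkage values of 124 and 256 would place some centers of the last two feature maps beyond 1.0 before clamping, whereas 512/6 and 512/4 keep them inside the image. The derived box sizes come out close to the 51, 102, 188, ... values posted above. For a different backbone or input size, the feature map sizes themselves would have to be read off the network, for example by passing a dummy input through it.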

[Image: raw_ssd_priors_movie0507171226]

The default priors look like this.

leaf918 avatar May 07 '22 09:05 leaf918