pytorch-ssd
image size
If I train the model with image size = 512, how should I modify the model?
Issue #22 explains that this is not straightforward to do. After some research I found that you can do it by modifying only the configuration file, at least for the VGG backbone. My vgg_ssd_config.py file looks like:
import numpy as np
from vision.utils.box_utils import SSDSpec, SSDBoxSizes, generate_ssd_priors
image_size = 512
image_mean = np.array([123, 117, 104]) # RGB layout
image_std = 1.0
iou_threshold = 0.45
center_variance = 0.1
size_variance = 0.2
specs = [
SSDSpec(64, 8, SSDBoxSizes(51, 102), [2]),
SSDSpec(32, 16, SSDBoxSizes(102, 188), [2, 3]),
SSDSpec(16, 32, SSDBoxSizes(188, 274), [2, 3]),
SSDSpec(8, 64, SSDBoxSizes(274, 360), [2, 3]),
SSDSpec(6, 86, SSDBoxSizes(360, 446), [2]),  # shrinkage should be ceil(512 / 6) = 86, not 124
SSDSpec(4, 128, SSDBoxSizes(446, 532), [2])  # shrinkage should be 512 / 4 = 128, not 256
]
priors = generate_ssd_priors(specs, image_size)
You can change the SSDBoxSizes values if you feel like it; as far as I understand, they just define the minimum and maximum box size for each feature map. This lets you train an SSD with VGG16 as the backbone. I don't know whether it works with other backbones, since I haven't tested them yet, but it should be fine. The real problem here is that the model becomes much heavier, because the classification and regression heads stay the same. You should modify them if you want to train an SSD512 without ending up with that big a model.
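The box sizes above can be derived from the linear scale rule in the SSD paper, s_k = s_min + (s_max - s_min)(k - 1)/(m - 1), with one extra smaller scale for the first map. A minimal sketch (the function name and default scales are my own choices; the output only approximately reproduces the hand-rounded numbers in the config above):

```python
def box_sizes(image_size, num_maps, s_min=0.2, s_max=1.05, s_first=0.1):
    """Min/max prior box sizes per feature map via the SSD linear scale rule."""
    # scales: one extra small scale (s_first) for the first map, then
    # s_min .. s_max spread linearly over the remaining maps
    scales = [s_first] + [s_min + (s_max - s_min) * k / (num_maps - 1)
                          for k in range(num_maps)]
    sizes = [int(round(s * image_size)) for s in scales]
    # feature map k gets min = sizes[k], max = sizes[k + 1]
    return list(zip(sizes[:-1], sizes[1:]))

print(box_sizes(512, 6))  # roughly the SSDBoxSizes pairs in the config above
```

The exact values in the config look hand-rounded (a fixed step of 86 between minimum sizes), so this won't match digit for digit, but the first pair comes out as (51, 102) either way.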
Hope this helps.
Can we set image_size to different values for height and width separately?
import numpy as np
from vision.utils.box_utils import SSDSpec, SSDBoxSizes, generate_ssd_priors
image_size = 512
image_mean = np.array([123, 117, 104])  # RGB layout
image_std = 1.0
iou_threshold = 0.45
center_variance = 0.1
size_variance = 0.2
specs = [
SSDSpec(64, 8, SSDBoxSizes(51, 102), [2]),
SSDSpec(32, 16, SSDBoxSizes(102, 188), [2, 3]),
SSDSpec(16, 32, SSDBoxSizes(188, 274), [2, 3]),
SSDSpec(8, 64, SSDBoxSizes(274, 360), [2, 3]),
SSDSpec(6, 86, SSDBoxSizes(360, 446), [2]),
SSDSpec(4, 128, SSDBoxSizes(446, 532), [2])
]
priors = generate_ssd_priors(specs, image_size)
Do I only need to make the above changes in order to train SSD512? I thought there were 7 feature maps; do I need to add layers or something? I'd appreciate it if you could tell me.
Same question here: do I need to modify the layers in specs when changing the input shape?
If so, what is the "formula" for designing an SSD model based on the input shape? I'd like to implement a more general solution that adapts to perform well for an arbitrary input shape.
Thanks!
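There isn't a single formula that covers everything, because the feature map sizes depend on the backbone: you have to read them off the actual network (e.g. with a dummy forward pass). Once you know them, though, the rest of the spec follows mechanically: shrinkage is the effective stride (image_size / feature_map_size), and the box sizes follow the linear scale rule. A sketch under those assumptions (build_specs is a hypothetical helper of mine, and the namedtuples stand in for the repo's vision.utils.box_utils types):

```python
import math
from collections import namedtuple

# stand-ins mirroring the repo's vision.utils.box_utils types
SSDSpec = namedtuple('SSDSpec', ['feature_map_size', 'shrinkage', 'box_sizes', 'aspect_ratios'])
SSDBoxSizes = namedtuple('SSDBoxSizes', ['min', 'max'])

def build_specs(image_size, feature_map_sizes, aspect_ratios,
                s_min=0.2, s_max=1.05, s_first=0.1):
    """Build SSD prior specs for a square input of side image_size.

    feature_map_sizes must be measured from the actual backbone; they are
    not derivable from image_size alone.
    """
    m = len(feature_map_sizes)
    # SSD paper linear scale rule, plus one extra smaller scale for the first map
    scales = [s_first] + [s_min + (s_max - s_min) * k / (m - 1) for k in range(m)]
    sizes = [int(round(s * image_size)) for s in scales]
    specs = []
    for k, fm in enumerate(feature_map_sizes):
        shrinkage = math.ceil(image_size / fm)  # effective stride of this map
        specs.append(SSDSpec(fm, shrinkage,
                             SSDBoxSizes(sizes[k], sizes[k + 1]),
                             aspect_ratios[k]))
    return specs
```

For a 512 input with the feature maps quoted above, `build_specs(512, [64, 32, 16, 8, 6, 4], [[2], [2, 3], [2, 3], [2, 3], [2], [2]])` yields shrinkages 8, 16, 32, 64, 86, 128 and box sizes close to the hand-rounded ones in that config.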

The default priors look like this.
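A quick way to sanity-check any spec list is to count the priors it generates. Assuming the standard SSD layout per location (one small square, one large square, and two boxes per extra aspect ratio, which is how the original SSD300 arrives at its well-known 8732 priors), a sketch with stand-in types (num_priors is a hypothetical helper, not part of the repo):

```python
from collections import namedtuple

# stand-ins mirroring the repo's vision.utils.box_utils types
SSDSpec = namedtuple('SSDSpec', ['feature_map_size', 'shrinkage', 'box_sizes', 'aspect_ratios'])
SSDBoxSizes = namedtuple('SSDBoxSizes', ['min', 'max'])

def num_priors(specs):
    # per location: 1 small square + 1 large square
    # + 2 boxes (ratio r and 1/r) for every extra aspect ratio
    return sum(s.feature_map_size ** 2 * (2 + 2 * len(s.aspect_ratios))
               for s in specs)

ssd300_specs = [
    SSDSpec(38, 8, SSDBoxSizes(30, 60), [2]),
    SSDSpec(19, 16, SSDBoxSizes(60, 111), [2, 3]),
    SSDSpec(10, 32, SSDBoxSizes(111, 162), [2, 3]),
    SSDSpec(5, 64, SSDBoxSizes(162, 213), [2, 3]),
    SSDSpec(3, 100, SSDBoxSizes(213, 264), [2]),
    SSDSpec(1, 300, SSDBoxSizes(264, 315), [2]),
]
print(num_priors(ssd300_specs))  # 8732, matching the SSD paper
```

The 512 config quoted earlier in this thread gives 24656 priors under the same counting, which is part of why the model gets so much heavier at that resolution.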