Hi, great library! I've managed to run it on an NVIDIA Xavier NX at ~15 FPS with 500-size images. Same "problem" as the others, with around 3 GB of RAM consumed.
I was wondering if it's possible to use/add ResNet18/34, which should give me better FPS and a smaller memory footprint.
I have:
- downloaded the .pth files from PyTorch
- added the backbones
- added configs (normal and edge)
- tried to train
- failed
Here are my steps.

I downloaded the .pth files for them from PyTorch (I noticed the ResNet50 checkpoint has the same name as the PyTorch model, so I tried the same for these!). Then I added the configs (basically copy-pasted the ResNet50 ones and changed the path and args accordingly):
```python
resnet18_backbone = resnet101_backbone.copy({
    'name': 'ResNet18',
    'path': 'resnet18-5c106cde.pth',
    'type': ResNetBackbone,
    'args': ([2, 2, 2, 2],),
    'transform': resnet_transform,
})

yolact_resnet18_config = yolact_base_config.copy({
    'name': 'yolact_resnet18',
    'backbone': resnet18_backbone.copy({
        'selected_layers': list(range(1, 4)),
        'pred_scales': yolact_base_config.backbone.pred_scales,
        'pred_aspect_ratios': yolact_base_config.backbone.pred_aspect_ratios,
        'use_pixel_scales': True,
        'preapply_sqrt': False,
        'use_square_anchors': True,  # This is for backward compatibility with a bug
    }),
})

yolact_edge_resnet18_config = yolact_edge_config.copy({
    'name': 'yolact_edge_resnet18',
    'backbone': yolact_resnet18_config.backbone,
})
```
I then tried to train, but got a lot of errors about layers with mismatched input/output sizes. (Sorry, I deleted the messages and I'm training other stuff right now, but I'll update later with the corresponding messages.)

Edit: here is the error, for ResNet34:
```
Traceback (most recent call last):
  File "train.py", line 707, in <module>
    train(0, args=args)
  File "train.py", line 256, in train
    yolact_net.init_weights(backbone_path=args.save_folder + cfg.backbone.path)
  File "/content/yolact_edge/yolact_edge/yolact.py", line 1269, in init_weights
    self.backbone.init_backbone(backbone_path)
  File "/content/yolact_edge/yolact_edge/backbone.py", line 145, in init_backbone
    self.load_state_dict(state_dict, strict=False)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ResNetBackbone:
size mismatch for layers.0.0.conv1.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 1, 1]).
size mismatch for layers.0.1.conv1.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 256, 1, 1]).
size mismatch for layers.0.2.conv1.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 256, 1, 1]).
size mismatch for layers.1.0.conv1.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 256, 1, 1]).
size mismatch for layers.1.0.downsample.0.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 256, 1, 1]).
size mismatch for layers.1.0.downsample.1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layers.1.0.downsample.1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layers.1.0.downsample.1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layers.1.0.downsample.1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layers.1.1.conv1.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 512, 1, 1]).
size mismatch for layers.1.2.conv1.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 512, 1, 1]).
size mismatch for layers.1.3.conv1.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 512, 1, 1]).
size mismatch for layers.2.0.conv1.weight: copying a param with shape torch.Size([256, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 512, 1, 1]).
size mismatch for layers.2.0.downsample.0.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 512, 1, 1]).
size mismatch for layers.2.0.downsample.1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for layers.2.0.downsample.1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for layers.2.0.downsample.1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for layers.2.0.downsample.1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for layers.2.1.conv1.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
size mismatch for layers.2.2.conv1.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
size mismatch for layers.2.3.conv1.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
size mismatch for layers.2.4.conv1.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
size mismatch for layers.2.5.conv1.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
size mismatch for layers.3.0.conv1.weight: copying a param with shape torch.Size([512, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 1024, 1, 1]).
size mismatch for layers.3.0.downsample.0.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 1024, 1, 1]).
size mismatch for layers.3.0.downsample.1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for layers.3.0.downsample.1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for layers.3.0.downsample.1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for layers.3.0.downsample.1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for layers.3.1.conv1.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 2048, 1, 1]).
size mismatch for layers.3.2.conv1.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 2048, 1, 1]).
```
Hi, thanks for the info.

I have added a BasicBlock, similar to the one in PyTorch, to backbone.py and updated the config file. It looks roughly like the sketch below.
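This is a minimal sketch, assuming the constructor arguments that `ResNetBackbone._make_layer` passes to its block (the real file may differ): it mirrors torchvision's BasicBlock, i.e. two 3×3 convolutions with `expansion = 1`, instead of Bottleneck's 1×1/3×3/1×1 with `expansion = 4`, which is exactly the pattern behind the size mismatches above.

```python
import torch.nn as nn

# Minimal BasicBlock mirroring torchvision's, so the pretrained resnet18/34
# state_dicts load with matching parameter names (conv1/bn1/conv2/bn2).
class BasicBlock(nn.Module):
    expansion = 1  # Bottleneck uses 4, hence the 1x1-vs-3x3 shape errors

    def __init__(self, inplanes, planes, stride=1, downsample=None,
                 norm_layer=nn.BatchNorm2d, dilation=1):
        super().__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=3, stride=stride,
                               padding=dilation, dilation=dilation, bias=False)
        self.bn1 = norm_layer(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3,
                               padding=dilation, dilation=dilation, bias=False)
        self.bn2 = norm_layer(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        if self.downsample is not None:
            identity = self.downsample(x)
        return self.relu(out + identity)
```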
When running ResNet50/101 at 256/550 image sizes, everything looks good. But with ResNet34/18 I get zero detections at testing/inference time, even though during training the evaluation reaches up to 80% mAP. I'm probably doing something wrong.
In the config I created the backbones for resnet18/34:

```python
resnet34_backbone = resnet101_backbone.copy({
    'name': 'ResNet34',
    'path': 'resnet34-333f7ec4.pth',
    'type': ResNetBackbone,
    # layer counts, atrous layers, block type (my modified ResNetBackbone
    # takes the block as a third positional argument)
    'args': ([3, 4, 6, 3], [], BasicBlock),
    'transform': resnet_transform,
})
```
Then I created the specific configs. This is my "default" for 101:
```python
yolact_edge_config = yolact_base_config.copy({
    'name': 'yolact_edge',
    ####################
    'torch2trt_max_calibration_images': 0,
    # 'torch2trt_backbone': False,
    'torch2trt_backbone_int8': True,
    'torch2trt_protonet_int8': True,
    'torch2trt_fpn': True,
    'torch2trt_prediction_module': True,
    'use_fast_nms': False,
    'dataset': my_custom_dataset,
    'num_classes': 1 + 1,  # my class + background

    # Image size
    'max_size': 256,
    'min_size': 200,

    # Discard detections with width and height smaller than this (in absolute width and height)
    'discard_box_width': 4 / 256,
    'discard_box_height': 4 / 256,

    # Training params
    'lr_schedule': 'step',
    'lr_steps': (4000, 6000, 8000, 9000),
    'max_iter': 10000,
    ###################
})
```
Then, for example, for ResNet34:
```python
yolact_edge_resnet34_550_config = yolact_edge_config.copy({
    'name': 'yolact_edge_resnet34_550',
    'backbone': yolact_resnet34_config.backbone,

    # Image size
    'max_size': 550,
    'min_size': 200,

    # Discard detections with width and height smaller than this (in absolute width and height)
    'discard_box_width': 4 / 550,
    'discard_box_height': 4 / 550,
})
```
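For debugging, a quick standalone check along these lines verifies that the modified backbone actually loads the torchvision weights and produces the expected feature pyramid (a sketch: the import path and the weights location are assumptions):

```python
import torch
from yolact_edge.backbone import ResNetBackbone, BasicBlock  # assumed import path

# Build the backbone exactly as the config's 'args' would, and load the
# torchvision resnet34 checkpoint through the repo's own loader.
backbone = ResNetBackbone([3, 4, 6, 3], [], BasicBlock)
backbone.init_backbone('weights/resnet34-333f7ec4.pth')  # assumed weights folder
backbone.eval()

with torch.no_grad():
    outs = backbone(torch.zeros(1, 3, 550, 550))
for i, out in enumerate(outs):
    print(i, out.shape)  # channel counts should be 64/128/256/512 with expansion = 1
```

If the shapes look right, the backbone itself is probably fine and the zero-detection problem is more likely downstream (e.g. in the TensorRT/INT8 path).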
Attached are the config file, a test file, and the updated backbone.py with the BasicBlock:
example.zip

Can you give me a hint about where to look?
You might want to confirm that the model runs well with TensorRT disabled first.
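(If your checkout supports it, passing the `--disable_tensorrt` flag to eval.py should force the plain PyTorch path, bypassing the `torch2trt_*` conversions from the configs above.)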
I have trained ResNet18 and got results, but when I try to eval the model with the weights, it reports this error:

```
Backbone: ResNet18 is not currently supported with TenSorRT.
```

Can you tell me how to modify the code?
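A possible workaround, untested and only a sketch: since the ResNet18 backbone is not on the TensorRT-supported list, leave the backbone in plain PyTorch while keeping TensorRT for the other modules, by turning off the backbone conversion flags already shown in the configs above:

```python
# Hypothetical config: skip TensorRT conversion for the unsupported ResNet18
# backbone while leaving the remaining torch2trt_* settings untouched.
yolact_edge_resnet18_notrt_config = yolact_edge_resnet18_config.copy({
    'name': 'yolact_edge_resnet18_notrt',
    'torch2trt_backbone': False,       # do not convert the backbone...
    'torch2trt_backbone_int8': False,  # ...not even with INT8 calibration
})
```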