
Fix multi-batch-size bug and support ONNX export of the YOLO-World-S model

Open wufei-png opened this issue 9 months ago • 0 comments

fix: txt_feats should be repeated batch_size times. When exporting an ONNX model with batch size > 1, the img_feats shapes are:

img_feats torch.Size([batch_size, 128, 80, 80])
img_feats torch.Size([batch_size, 256, 40, 40])
img_feats torch.Size([batch_size, 512, 20, 20])

However, txt_feats is always torch.Size([1, 80, 512]). It should be torch.Size([batch_size, 80, 512]) to avoid an export error like this:

File "/home/wufei2/anaconda3/envs/train10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/wufei2/anaconda3/envs/train10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1488, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/wufei2/go/src/github.com/AILab-CVC/YOLO-World/deploy/../yolo_world/models/layers/yolo_bricks.py", line 289, in forward
    return x * self.scale + text_features
RuntimeError: The size of tensor a (40) must match the size of tensor b (80) at non-singleton dimension 1
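
As an illustration of the fix described above, here is a minimal sketch of repeating the text features along the batch dimension before they are combined with the image features. The helper name `match_batch` and the way the batch size is read from `img_feats` are assumptions made for this example; this is not the exact code in this PR.

```python
import torch


def match_batch(txt_feats: torch.Tensor, img_feats) -> torch.Tensor:
    """Repeat txt_feats along dim 0 so it matches the image batch size.

    Hypothetical helper: txt_feats is [1, num_classes, dim] and each
    entry of img_feats is [batch_size, channels, height, width].
    """
    batch_size = img_feats[0].shape[0]
    if txt_feats.shape[0] == batch_size:
        return txt_feats  # already matches, nothing to do
    if txt_feats.shape[0] != 1:
        raise ValueError(
            f"cannot repeat txt_feats with batch {txt_feats.shape[0]} "
            f"to batch {batch_size}")
    # [1, num_classes, dim] -> [batch_size, num_classes, dim]
    return txt_feats.repeat(batch_size, 1, 1)


# Shapes taken from the description above, with batch_size = 2.
img_feats = [
    torch.zeros(2, 128, 80, 80),
    torch.zeros(2, 256, 40, 40),
    torch.zeros(2, 512, 20, 20),
]
txt_feats = torch.randn(1, 80, 512)
fixed = match_batch(txt_feats, img_feats)
print(fixed.shape)
```

`repeat` copies the data, so the exported graph carries an explicit batch dimension for the text branch. `expand` would avoid the copy, but whether it traces equally well during export is not something this sketch verifies.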

chore: support ONNX export. I exported the YOLO-World-S model to ONNX format and hit the same error as in this issue. Following the solution in this issue, I added max and average pooling layers that are compatible with ONNX export.

ONNX export currently has two compatibility problems: einsum and the pooling layer. I don't want to add another boolean variable for each one (other compatibility problems may come up in the future), so I replaced the boolean use_einsum with export_onnx.
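
To make the idea of a single flag concrete, here is a rough sketch of how one `export_onnx` switch could select an export-friendly path for both problems. The function names, tensor shapes, and einsum pattern are illustrative assumptions rather than the code in this PR, and the fixed-kernel pooling path assumes the input height and width are divisible by the target output size.

```python
import torch
import torch.nn.functional as F


def attn_scores(embed, guide, export_onnx=False):
    """embed: [B, M, C, H, W], guide: [B, N, M, C] -> [B, M, H, W, N]."""
    if not export_onnx:
        return torch.einsum('bmchw,bnmc->bmhwn', embed, guide)
    # Same contraction over C, written with permute + matmul.
    b, m, c, h, w = embed.shape
    n = guide.shape[1]
    e = embed.permute(0, 1, 3, 4, 2).reshape(b, m, h * w, c)  # [B, M, HW, C]
    g = guide.permute(0, 2, 3, 1)                             # [B, M, C, N]
    return torch.matmul(e, g).reshape(b, m, h, w, n)


def pool(x, out_size, export_onnx=False):
    """Max-pool x ([B, C, H, W]) down to out_size x out_size."""
    if not export_onnx:
        return F.adaptive_max_pool2d(x, out_size)
    h, w = x.shape[-2:]
    # Assumption: h and w are divisible by out_size, so a fixed kernel
    # with stride == kernel covers the same windows as adaptive pooling.
    kernel = (h // out_size, w // out_size)
    return F.max_pool2d(x, kernel_size=kernel, stride=kernel)


embed = torch.randn(2, 4, 8, 5, 5)
guide = torch.randn(2, 3, 4, 8)
x = torch.randn(1, 4, 12, 12)

# Both paths should agree whichever way the flag is set.
print(torch.allclose(attn_scores(embed, guide),
                     attn_scores(embed, guide, export_onnx=True), atol=1e-4))
print(torch.equal(pool(x, 3), pool(x, 3, export_onnx=True)))
```

Keeping one flag means any later export workaround can hang off the same switch without changing the constructor signature again.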

The code has been tested in my local environment.

-------------comment at 5.18: @wondervictor I noticed that this PR had a small code-formatting conflict after the recent update to the main branch. It has now been resolved, and the PR is ready to be merged.

wufei-png, May 04 '24 06:05