How to implement the YOLOv6 letterbox function in DALI?

Open romanmaznikov1 opened this issue 2 years ago • 7 comments

Hi!

I am using the DALI backend for NVIDIA Triton Inference Server to preprocess input images. I want to implement the letterbox function in my Python file, serialize_model.py.

The letterbox function in the YOLOv6 pipeline looks like this:

def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)

    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    print(new_unpad)
    im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    
    return im, ratio, (dw, dh)
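For reference, the arithmetic in letterbox can be checked without OpenCV. The following is a minimal sketch, not part of the YOLOv6 code: the helper name letterbox_geometry is hypothetical, and the scaleFill branch is left out. It returns only the resized size and the four border widths that the function above would pass to cv2.resize and cv2.copyMakeBorder.

```python
def letterbox_geometry(shape, new_shape=(640, 640), auto=True, scaleup=True, stride=32):
    """Hypothetical helper: mirror the size and border arithmetic of letterbox().

    shape is (height, width). Returns ((new_w, new_h), (top, bottom, left, right)).
    The scaleFill branch of the original function is intentionally omitted.
    """
    h, w = shape
    r = min(new_shape[0] / h, new_shape[1] / w)  # scale ratio (new / old)
    if not scaleup:
        r = min(r, 1.0)  # only scale down
    new_w, new_h = int(round(w * r)), int(round(h * r))
    dw, dh = new_shape[1] - new_w, new_shape[0] - new_h  # total padding
    if auto:
        dw, dh = dw % stride, dh % stride  # pad only up to the next stride multiple
    dw, dh = dw / 2, dh / 2  # split padding between the two sides
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    return (new_w, new_h), (top, bottom, left, right)


# A 720x1280 (height x width) frame scaled into a 640x640 target:
print(letterbox_geometry((720, 1280), auto=True))   # ((640, 360), (12, 12, 0, 0))
print(letterbox_geometry((720, 1280), auto=False))  # ((640, 360), (140, 140, 0, 0))
```

With auto=True the output is 384x640 rather than 640x640, because the padding is reduced modulo the stride. A fixed-size DALI pipeline corresponds to the auto=False case.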

At the moment my preprocessing looks like this:

import nvidia.dali.types as types
import nvidia.dali as dali
import os

NEW_SHAPE = (640, 640)
COLOR = (114, 114, 114)
STRIDE = 32

@dali.pipeline_def(batch_size=8, num_threads=8, device_id=0)
def pipe():
    images = dali.fn.external_source(device="cpu", name="DALI_INPUT_0")
    images = images.gpu()

    #letterbox
    shape = dali.fn.shapes(images, dtype=types.INT64)
    height = shape[0]
    width  = shape[1]

    r = dali.math.min(NEW_SHAPE[0] / height, NEW_SHAPE[1] / width)

    new_unpad = dali.fn.cast(width * r, dtype=types.INT64),  dali.fn.cast(height * r, dtype=types.INT64)

    dw, dh = width - new_unpad[0], height - new_unpad[1] 

    dw = (dw - STRIDE * dali.fn.cast(dw // STRIDE - 0.5, dtype=types.INT64)) / 2
    dh = (dh - STRIDE * dali.fn.cast(dh // STRIDE - 0.5, dtype=types.INT64)) / 2

    images = dali.fn.resize(images, size = new_unpad, device = 'gpu')

    top = dali.fn.cast(dali.fn.cast(dh - 0.1, dtype=types.FLOAT), dtype=types.INT64)
    bottom = dali.fn.cast(dali.fn.cast(dh + 0.1, dtype=types.FLOAT), dtype=types.INT64)
    left = dali.fn.cast(dali.fn.cast(dw - 0.1, dtype=types.FLOAT), dtype=types.INT64)
    right = dali.fn.cast(dali.fn.cast(dw + 0.1, dtype=types.FLOAT), dtype=types.INT64)

    .....                        
    return images

version = '1'
os.makedirs(version, exist_ok=True)
pipe().serialize(filename=f'{version}/model.dali')

Question: How can I implement the equivalent of cv2.copyMakeBorder in DALI, and does my code look production-ready?

romanmaznikov1 avatar Oct 13 '22 10:10 romanmaznikov1

Instead of using top, bottom, left, and right after resizing, I am trying this approach:

height_half_pad = types.Constant(
        value=(127, 127, 127),
        shape=[dh, width, 3],
        dtype=types.FLOAT,
        device="gpu"  # type: ignore
    )
width_half_pad = types.Constant(
        value=(127, 127, 127),
        shape=[height+dh, dw, 3],  # type: ignore
        dtype=types.FLOAT, # type:ignore
        device="gpu",
    )

images = dali.fn.cat(height_half_pad, images, dali.fn.copy(height_half_pad), axis=1)
images = dali.fn.cat(width_half_pad, images, dali.fn.copy(width_half_pad), axis=2)

but I get: TypeError: int() argument must be a string, a bytes-like object or a number, not 'DataNode'. Can I somehow convert dh and the other values to dtype=types.INT64?

romanmaznikov1 avatar Oct 14 '22 12:10 romanmaznikov1

Hi @romanmaznikov1, you can look at the dali.fn.paste operator in the documentation. It performs an operation similar to cv2.copyMakeBorder.

banasraf avatar Oct 17 '22 08:10 banasraf

@romanmaznikov1 Paste has very limited functionality. It's better to use crop or slice operators with a padding option (out_of_bounds_policy="pad").

mzient avatar Oct 17 '22 12:10 mzient

images = fn.resize(images, resize_x=640, resize_y=352, mode='not_larger')
images = fn.crop(images, crop=(352, 640), out_of_bounds_policy='pad', fill_values=127)
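To see what sizes this resize-then-crop combination produces, here is a plain-Python sketch of the arithmetic. It is an illustration only: the helper names are hypothetical, and it assumes the resized extent is rounded to the nearest pixel and that the padding is split evenly around the image. DALI's exact rounding and its handling of an odd padding pixel may differ.

```python
def not_larger_size(h, w, max_h, max_w):
    """Hypothetical helper: aspect-preserving size that fits inside (max_h, max_w)."""
    scale = min(max_h / h, max_w / w)
    # Assumption: round to the nearest pixel, never below 1.
    return max(1, round(h * scale)), max(1, round(w * scale))


def centered_padding(size, target):
    """Hypothetical helper: split the missing pixels into (before, after)."""
    total = target - size
    before = total // 2
    return before, total - before


# A 720x1280 (height x width) frame with the limits from the snippet above:
new_h, new_w = not_larger_size(720, 1280, max_h=352, max_w=640)
print(new_h, new_w)                   # 352 626
print(centered_padding(new_h, 352))   # (0, 0)
print(centered_padding(new_w, 640))   # (7, 7)
```

Here the height is the limiting side, so the image is resized to 352x626 and the crop to (352, 640) adds 14 columns of fill in total.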

umie0128 avatar Apr 27 '23 07:04 umie0128

Quoting @banasraf: "Hi @romanmaznikov1, you can look at the dali.fn.paste operator in the documentation. It performs an operation similar to cv2.copyMakeBorder."

Can you please provide sample code?

snehashis1997 avatar Jul 30 '23 13:07 snehashis1997

@snehashis1997 - please check this answer. We don't support all border modes but pad should work this way.

JanuszL avatar Aug 02 '23 09:08 JanuszL

Thank you

snehashis1997 avatar Aug 02 '23 13:08 snehashis1997

Two-sided letterbox: images = fn.crop(images, crop=(640, 640), out_of_bounds_policy="pad", fill_values=114)

One-sided letterbox: images = fn.pad(images, shape=(640, 640, 3), fill_value=114)
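The difference between the two variants can be illustrated in NumPy. This is a reference sketch of the intended result, not DALI code: the function names are hypothetical, and it assumes the input has already been resized to fit inside the target, that the crop variant centers the image, and that the pad variant appends the fill at the bottom and right.

```python
import numpy as np


def pad_two_sided(img, out_h, out_w, fill=114):
    """Hypothetical helper: center the image and fill the borders on both sides."""
    h, w = img.shape[:2]
    top, left = (out_h - h) // 2, (out_w - w) // 2
    pads = ((top, out_h - h - top), (left, out_w - w - left), (0, 0))
    return np.pad(img, pads, mode="constant", constant_values=fill)


def pad_one_sided(img, out_h, out_w, fill=114):
    """Hypothetical helper: keep the image at the top-left and fill the end of each axis."""
    h, w = img.shape[:2]
    pads = ((0, out_h - h), (0, out_w - w), (0, 0))
    return np.pad(img, pads, mode="constant", constant_values=fill)


# A black 360x640 image (already resized) padded to 640x640 both ways.
img = np.zeros((360, 640, 3), dtype=np.uint8)
centered = pad_two_sided(img, 640, 640)
anchored = pad_one_sided(img, 640, 640)
print(centered.shape, anchored.shape)
# Two-sided: rows 0..139 are fill and the image starts at row 140.
# One-sided: the image starts at row 0 and the fill starts at row 360.
```

With the one-sided variant the image origin does not move, so mapping detections back to the original image needs only the scale ratio and no padding offset.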

michal-kierzynka avatar Mar 18 '24 14:03 michal-kierzynka