dali_backend
How to use scalar inputs
I'm trying to use a scalar input to resize a video, but can't figure out how to set the ndim parameter of external_source or the shape of the input in the client.
config.pbtxt

```protobuf
backend: "dali"
max_batch_size: 0
model_transaction_policy {
  decoupled: true
}
```
1/dali.py

```python
import nvidia.dali as dali
from nvidia.dali.plugin.triton import autoserialize  # must include

@dali.plugin.triton.autoserialize
@dali.pipeline_def(batch_size=32, num_threads=4, device_id=0,
                   output_dtype=dali.types.FLOAT, output_ndim=3)
def pipeline():
    vid = dali.fn.experimental.inputs.video(name="INPUT", sequence_length=1, device='mixed')
    height = dali.fn.external_source(name="HEIGHT", ndim=1, dtype=dali.types.INT16, repeat_last=True)
    width = dali.fn.external_source(name="WIDTH", ndim=1, dtype=dali.types.INT16, repeat_last=True)
    vid = dali.fn.resize(vid, resize_x=width, resize_y=height, mode="not_larger")  # resize
    vid = dali.fn.crop(vid, crop_w=width, crop_h=height, out_of_bounds_policy="pad")  # pad
    vid = dali.fn.squeeze(vid, axes=0)  # remove sequence dim
    vid = dali.fn.transpose(vid, perm=[2, 0, 1])  # HWC to CHW
    vid = dali.fn.cast(vid, dtype=dali.types.FLOAT, name="OUTPUT")  # UINT8 to FP32
    return vid
```
client.py (based on video_decode_remap)

```python
...
width = np.ones((1,), dtype=np.int16) * 640
height = np.ones((1,), dtype=np.int16) * 360
inputs = [
    tritonclient.grpc.InferInput("INPUT", video_raw.shape, "UINT8"),
    tritonclient.grpc.InferInput("WIDTH", width.shape, "INT16"),
    tritonclient.grpc.InferInput("HEIGHT", height.shape, "INT16"),
]
inputs[0].set_data_from_numpy(video_raw)
inputs[1].set_data_from_numpy(width)
inputs[2].set_data_from_numpy(height)
...
```
If I run that, I get `unexpected shape for input 'HEIGHT' for model 'resize_224'. Expected [-1,-1], got [1]`. How do you properly set and get the scalar values in both client.py and dali.py?
Hey @wq9
I think this should work when you add the batch dimension to the height and width inputs. So, assuming the batch size is 32 in your pipeline, the client code would look like:
```python
width = np.ones((32, 1), dtype=np.int16) * 640
height = np.ones((32, 1), dtype=np.int16) * 360
```
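To make the fix concrete, here is a minimal sketch of the batched argument tensors (names and dtypes taken from the snippets above; `BATCH_SIZE` is an assumed constant matching the pipeline's `batch_size=32`):

```python
import numpy as np

BATCH_SIZE = 32  # must match batch_size in the @dali.pipeline_def

# DALI argument inputs need one value per sample in the batch,
# hence the leading batch dimension on these "scalar" tensors.
width = np.full((BATCH_SIZE, 1), 640, dtype=np.int16)
height = np.full((BATCH_SIZE, 1), 360, dtype=np.int16)

print(width.shape, height.shape)  # (32, 1) (32, 1)
```

`InferInput("WIDTH", width.shape, "INT16")` then reports shape `[32, 1]`, which matches the expected `[-1, -1]`.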
@banasraf Adding the batch dimension worked. Thanks!
However, when the input is a whole video (`video_raw = np.expand_dims(np.fromfile(FLAGS.video, dtype=np.uint8), axis=0)`), the last batch is smaller than 32, so I get the error:
```
[/opt/dali/dali/pipeline/operator/operator.cc:43] Assert on "curr_batch_size == static_cast<decltype(curr_batch_size)>(arg.second.tvec->num_samples())" failed:
ArgumentInput has to have the same batch size as an input.
```
Is there a way to pad the batch dimension?
@wq9
Unfortunately, this operator does not allow padding of the last batch, and I don't see a workaround that would make your case work out of the box. The only options I see are hardcoding the width and height in the pipeline or, if you know the number of frames in the sample, predicting when to send partial width and height tensors.
I'll add a task to our backlog to extend the video input operator with the option to pad the last batch.
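The second workaround, predicting the partial last batch, can be sketched client-side. This is a rough sketch under stated assumptions: it assumes the frame count of the video is known up front (`total_frames` here is a made-up value, not something dali_backend provides):

```python
import numpy as np

BATCH_SIZE = 32
total_frames = 100  # assumption: frame count of the video is known

# Batch sizes the video input will produce: full batches, then a partial one.
batch_sizes = [BATCH_SIZE] * (total_frames // BATCH_SIZE)
if total_frames % BATCH_SIZE:
    batch_sizes.append(total_frames % BATCH_SIZE)

# Send WIDTH/HEIGHT tensors whose batch dimension matches each video batch.
for bs in batch_sizes:
    width = np.full((bs, 1), 640, dtype=np.int16)
    height = np.full((bs, 1), 360, dtype=np.int16)
    # ... build InferInput("WIDTH", width.shape, "INT16") etc. per request

print(batch_sizes)  # [32, 32, 32, 4]
```

With 100 frames, the final request carries `(4, 1)` width/height tensors, matching the 4-sample partial batch the video operator emits.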