Nguyen K.
Hi @ltm920716, yes, TensorRT (TRT) speeds up inference because it optimizes the model for the specific GPU it is built (and then run) on. Do you mean batch inference for the torch model...
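For reference, a minimal sketch of what "built for a specific GPU" means in practice, assuming the CRAFT network is already loaded as `net` and using the 1x3x1280x736 input shape from the logs below (file names here are hypothetical): export to ONNX, then build the engine on the target GPU, since TensorRT engines are tuned per GPU and are not portable.

```python
# Hedged sketch: export the loaded CRAFT model (`net`, assumed loaded) to ONNX.
import torch

dummy = torch.randn(1, 3, 1280, 736).cuda()
torch.onnx.export(
    net.cuda().eval(), dummy, "craft.onnx",
    opset_version=11,
    input_names=["input"],
    output_names=["y", "feature"],  # CRAFT's forward returns two tensors
)

# Then build the engine on the SAME GPU you will serve on:
#   trtexec --onnx=craft.onnx --saveEngine=craft.plan --fp16
```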
@ltm920716, are you testing the .pth locally, or the .pt (TorchScript) on the Triton server (placed in the Model Repository)?
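In case it helps, here is a rough sketch of producing that TorchScript `.pt` from the `.pth` checkpoint; the import path and the `module.` prefix handling follow the CRAFT repo, but this is an assumption and may need adjusting for your setup:

```python
# Hedged sketch, not a drop-in script: convert craft_mlt_25k.pth to TorchScript.
from collections import OrderedDict

import torch
from craft import CRAFT  # model definition from the CRAFT repo

net = CRAFT()
state = torch.load("craft_mlt_25k.pth", map_location="cpu")
# Checkpoints saved from DataParallel carry a "module." prefix; strip it.
state = OrderedDict((k.replace("module.", "", 1), v) for k, v in state.items())
net.load_state_dict(state)
net.eval()

# Trace with the same input shape used in testing (1x3x1280x736).
traced = torch.jit.trace(net, torch.randn(1, 3, 1280, 736))
traced.save("model.pt")
```

The saved file would then go under `<model_repository>/craft/1/model.pt` next to a `config.pbtxt` with `platform: "pytorch_libtorch"` (directory and model names here are placeholders).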
> > @ltm920716, are you testing the .pth locally, or the .pt (TorchScript) on the Triton server (placed in the Model Repository)?
>
> @k9ele7en I tested the original torch model craft_mlt_25k.pth on torch==1.7.0...
> here are the results from the original test.py in the CRAFT repo:
>
> torch.Size([1, 3, 1280, 736])
> time up?: 0.05096149444580078
> time up?: 0.04998612403869629
> time...
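One caveat about numbers like these: CUDA kernels launch asynchronously, so wall-clock timing around a forward call can be misleading without a synchronize and a warm-up. A minimal sketch of a fairer measurement, assuming the CRAFT model is loaded as `net` on the GPU:

```python
# Hedged sketch: time the forward pass with warm-up and explicit CUDA syncs.
import time

import torch

x = torch.randn(1, 3, 1280, 736).cuda()
with torch.no_grad():
    for _ in range(5):           # warm-up (cuDNN autotuning, allocator caches)
        net(x)
    torch.cuda.synchronize()     # wait for all queued kernels before timing
    t0 = time.time()
    for _ in range(20):
        net(x)
    torch.cuda.synchronize()
    print("avg forward:", (time.time() - t0) / 20, "s")
```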
@ltm920716 no problem, it would be great if you shared the results (both good and bad) of your experiment so that we can discuss them and people can find useful information and avoid...