
TensorRT runs much slower for non-continuous inference

Open Apeiria opened this issue 2 years ago • 9 comments

Description

When a sleep() call is inserted between inference calls, the measured inference time becomes much longer:

    for (int i = 0; i < 10000; i++)
    {
        auto t1 = high_resolution_clock::now();
        context->executeV2(bindings);  // synchronous execution
        // Async alternative: enqueue, then synchronize before stopping the clock.
        // context->enqueueV2(bindings, stream, nullptr);
        // cudaStreamSynchronize(stream);
        auto t2 = high_resolution_clock::now();
        cout << "iter " << i << " time = "
             << duration_cast<microseconds>(t2 - t1).count() / 1000.0 << " ms" << endl;
        _sleep(1000);  // Windows-specific: sleep for 1000 ms between iterations
    }

(screenshot: per-iteration inference times with the sleep in place)

If the _sleep(1000); is removed, the inference time stays very stable at roughly 2 ms, no matter how long the loop runs.

Environment

TensorRT Version: 8.4.1.5
NVIDIA GPU: RTX 2070 Super
NVIDIA Driver Version: 516.01
CUDA Version: 11.7
CUDNN Version: 8.4.1.50
Operating System: Windows 11
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):

Relevant Files

Steps To Reproduce

Load any TensorRT model and compare the following loop with and without the sleep() call:

for(...)
{
    t1 = clock();
    inference();
    t2 = clock();
    print(t2-t1);
    sleep();
}
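The loop above can be sketched as a self-contained script (Python here for brevity; `inference()` and `timed_loop` are hypothetical names standing in for the real TensorRT call, since reproducing the actual effect requires a GPU and an engine):

```python
import time

def inference():
    # Hypothetical stand-in for context->executeV2(bindings);
    # sleeps ~2 ms to mimic a fast, steady inference call.
    time.sleep(0.002)

def timed_loop(iters, gap_s=0.0):
    """Time `inference` per iteration, optionally sleeping gap_s seconds
    between calls. Returns per-iteration times in milliseconds."""
    times = []
    for _ in range(iters):
        t1 = time.perf_counter()
        inference()
        t2 = time.perf_counter()
        times.append((t2 - t1) * 1000.0)
        if gap_s:
            time.sleep(gap_s)  # idle gap that lets a real GPU's clocks drop
    return times

print(timed_loop(3))             # back-to-back timings
print(timed_loop(3, gap_s=1.0))  # timings with a 1 s idle gap between calls
```

With a real engine, the second loop is the one that shows inflated times on hardware whose clocks drop while idle; the CPU stub, of course, shows no difference between the two.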

Apeiria avatar Jun 30 '22 02:06 Apeiria

This is usually caused by the GPU going idle, and is not an issue in TRT. Could you check the solution in https://github.com/NVIDIA/TensorRT/issues/2042 and see if it works?

nvpohanh avatar Jun 30 '22 02:06 nvpohanh

> This is usually caused by the GPU going idle, and is not an issue in TRT. Could you check the solution in #2042 and see if it works?

Thanks for your reply, it works! However, is there anything I can do in the C++ code to prevent the GPU from going to sleep when idle?

Apeiria avatar Jun 30 '22 03:06 Apeiria

No, it's mostly caused by the GPU driver, so you need to change the driver settings.

nvpohanh avatar Jun 30 '22 03:06 nvpohanh

> No, it's mostly caused by the GPU driver, so you need to change the driver settings.

OK, thanks a lot for your help!

Apeiria avatar Jun 30 '22 03:06 Apeiria

> No, it's mostly caused by the GPU driver, so you need to change the driver settings.

Hi again, I just did the same experiment in PyTorch, and I found the inference time is quite stable regardless of how long I make it sleep. Do you have any idea why the behavior differs between PyTorch and TensorRT?

    for j in range(10000):
        with torch.inference_mode():
            t1 = time.time()
            pred = model(img)
            print("iter", j, "time =", (time.time()-t1)*1000,"ms")
        time.sleep(1)

(screenshot: per-iteration PyTorch inference times)

Thanks!

Apeiria avatar Jun 30 '22 05:06 Apeiria
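One possible confound worth noting (an editorial guess, not something confirmed in the thread): PyTorch CUDA kernels launch asynchronously, so wrapping `model(img)` in `time.time()` without calling `torch.cuda.synchronize()` can measure little more than the launch overhead, which would look stable regardless of idle gaps. A minimal timing helper that makes the synchronization explicit (`time_call_ms` is a name introduced here; the `sync` argument is optional so the sketch runs without a GPU, and with PyTorch you would pass `torch.cuda.synchronize`):

```python
import time

def time_call_ms(fn, sync=None):
    """Wall-clock fn() in milliseconds.
    For GPU work, pass a sync callable (e.g. torch.cuda.synchronize) so that
    queued kernels actually finish before the clock stops; without it, only
    the cheap asynchronous kernel launch is measured."""
    if sync is not None:
        sync()  # drain any previously queued work before starting the clock
    t1 = time.perf_counter()
    fn()
    if sync is not None:
        sync()  # wait for this call's GPU work to complete
    return (time.perf_counter() - t1) * 1000.0

print(time_call_ms(lambda: time.sleep(0.005)))
```

If the PyTorch numbers were taken without synchronization, they would not be directly comparable to the synchronous `executeV2` timings earlier in the thread.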

@Apeiria Is PyTorch using the GPU or the CPU? Also, could you try time.sleep(5) and see if it makes any difference?

nvpohanh avatar Jun 30 '22 10:06 nvpohanh

> @Apeiria Is PyTorch using the GPU or the CPU? Also, could you try time.sleep(5) and see if it makes any difference?

My PyTorch is using the GPU. I tried time.sleep(5) and the result looks about the same; below is the output.

(screenshot: per-iteration PyTorch inference times with time.sleep(5))

Apeiria avatar Jun 30 '22 11:06 Apeiria

Interesting... we will take a look when we have the chance. Meanwhile, I still recommend that you try the approaches in https://github.com/NVIDIA/TensorRT/issues/2042 to see if any of them fixes the issue.

nvpohanh avatar Jul 01 '22 02:07 nvpohanh

> Interesting... we will take a look when we have the chance. Meanwhile, I still recommend that you try the approaches in #2042 to see if any of them fixes the issue.

Thanks a lot. I tried those approaches and nvidia-smi -lgc worked to some extent: after locking the frequency to maximum, the inference time still increases compared to the version without sleep, but far less dramatically (before locking: 2 ms -> 12 ms; after: 2 ms -> 4 ms). However, I can only change the driver settings on my own computer, not on the machines the model will be deployed to, so unfortunately those approaches don't solve my problem (T_T).

Apeiria avatar Jul 01 '22 10:07 Apeiria
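For reference, the clock-locking workaround mentioned above can be applied with nvidia-smi roughly as follows (requires administrator rights; the clock values below are illustrative examples only and must be chosen from your GPU's supported range):

```shell
# List the clock ranges this GPU supports.
nvidia-smi -q -d SUPPORTED_CLOCKS

# Lock the graphics clock to a min,max range in MHz (example values).
nvidia-smi -lgc 1500,1980

# Later, reset to defaults so idle downclocking works again.
nvidia-smi -rgc
```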

I fixed this issue, but it cost me a lot of time.

So I need the TensorRT team to give me a reward before I announce the solution.

mailbox: [email protected]

haibozhang123 avatar Jan 06 '23 05:01 haibozhang123