
Need to stabilize Paddle-TRT unit tests

jeng1220 opened this issue on Jul 27 '22

Describe the Bug

Many Paddle-TRT unit tests now include FP16 testing. However, the approach is to compare the results generated by Paddle-Inference with FP32 computation against the results generated by Paddle-TRT with FP16 computation. That is:

Paddle-Inference with FP32 vs. Paddle-TRT with FP16

This is problematic because the difference between the two results can be large (e.g. > 1e-5, 1e-4, or even 1e-3), and there is no principled way to determine an acceptable tolerance, only trial and error. We have observed many unit-test failures caused by this issue.
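
To see why a fixed tolerance such as 1e-5 cannot work here, consider the rounding error of FP16 alone (a minimal NumPy illustration, not code from the test suite):

import numpy as np

# FP16 has a 10-bit mantissa, so a single rounding step on a value of
# magnitude ~3 already introduces an absolute error on the order of 1e-3.
x = np.float32(3.14159265)
x_fp16 = np.float16(x)

print(abs(np.float32(x_fp16) - x))   # ~9.7e-4, far above a 1e-5 tolerance
print(np.finfo(np.float16).eps)      # 0.000977, the FP16 machine epsilon

Accumulated over a whole network, the FP32-vs-FP16 gap is larger still and depends on the operators involved, which is why per-test tolerance tuning degenerates into trial and error.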

For instance, at https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/unittests/ir/inference/test_trt_pool_op.py#L61

the unit test TensorRTPoolTest creates an FP32 program as the ground truth:

class TensorRTPoolTest(InferencePassTest):
    def build_network(self):
        ...
        with fluid.program_guard(self.main_program, self.startup_program):
            data = fluid.data(name='data',
                              shape=[-1, self.channel, self.height, self.width],
                              dtype='float32')
            pool_out = fluid.layers.pool2d(input=data, ...)
            out = fluid.layers.batch_norm(pool_out, is_test=True)
            self.fetch_list = [out]

Then, at https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/unittests/ir/inference/test_trt_pool_op.py#L88

it sets AnalysisConfig.Precision.Half for Paddle-TRT:

class TensorRTPoolTest(InferencePassTest):
    def test(self):
        precision_options = [
            AnalysisConfig.Precision.Float32, AnalysisConfig.Precision.Half
        ]
        ...
        for precision, ... in itertools.product(precision_options, ...):
            ...
            with self.subTest(...):
                ...
                self.run_test()
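
For context, the chosen precision is what the test harness ultimately passes to TensorRT when it builds the predictor config. A rough sketch of that step using the public AnalysisConfig API (model_dir and the workspace/batch/subgraph values are placeholders; this is not the exact InferencePassTest helper code):

from paddle.fluid.core import AnalysisConfig

# Sketch only: how a precision choice reaches the TensorRT engine.
config = AnalysisConfig(model_dir)    # model_dir: path to the saved program (placeholder)
config.enable_use_gpu(100, 0)         # 100 MB initial GPU memory pool, device 0
config.enable_tensorrt_engine(
    workspace_size=1 << 30,
    max_batch_size=1,
    min_subgraph_size=3,
    precision_mode=AnalysisConfig.Precision.Half,  # run TRT kernels in FP16
    use_static=False,
    use_calib_mode=False)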

Because run_test compares against the FP32 ground-truth program built in build_network, the Half subtest is exactly the FP32-vs-FP16 comparison described above. To improve CI stability and make the tests deterministic, FP16 unit tests should run Paddle-Inference, Paddle-TRT, and the input tensor all in FP16, as sketched below.
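
A minimal sketch of that proposal, in NumPy only (the helper names below are hypothetical, not InferencePassTest methods):

import numpy as np

# Sketch of the proposed FP16 test setup. Key points:
#   1. the input tensor itself is FP16,
#   2. the reference path and the Paddle-TRT path both compute in FP16,
#   3. the tolerance is then tied to FP16 precision instead of being hand-tuned.
def make_fp16_input(shape, seed=0):
    rng = np.random.default_rng(seed)          # fixed seed keeps CI deterministic
    return rng.random(shape, dtype=np.float32).astype(np.float16)

def assert_fp16_close(reference_out, trt_out, factor=8):
    eps = np.finfo(np.float16).eps             # 2**-10 ~= 9.77e-4
    np.testing.assert_allclose(
        np.asarray(trt_out, dtype=np.float32),
        np.asarray(reference_out, dtype=np.float32),
        rtol=factor * eps,                     # 'factor' is an illustrative margin
        atol=factor * eps)

data = make_fp16_input([1, 3, 32, 32])         # placeholder NCHW shape
# reference_out = FP16 Paddle-Inference run on `data`
# trt_out       = FP16 Paddle-TRT run on `data`
# assert_fp16_close(reference_out, trt_out)

Once both sides are quantized to FP16, a tolerance of a few FP16 ulps at least has a principled basis, whereas no fixed tolerance is defensible for an FP32-vs-FP16 comparison.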

Additional Supplementary Information

No response

jeng1220 · Jul 27 '22 07:07

Hi! We've received your issue and will arrange for technicians to answer it as soon as possible; please be patient while waiting for a response. Please double-check that you have provided a clear problem description, reproduction code, environment & version, and error messages. You can also look for answers in the official API documentation, the FAQ, historical issues, and the AI community. Have a nice day!

paddle-bot[bot] · Jul 27 '22 07:07

It should be resolved by #51554

jeng1220 · Apr 27 '23 02:04

#51554 was merged. Closing the issue.

jeng1220 · May 23 '23 03:05