EthanYe
> @yeliang2258 Please update your PR to the latest codebase. We would like to evaluate it with our CI to check the impact on all models we trace, which will take...
@jczaja Hello, I want to know how I can print out the specific data held in a oneDNN memory object.
@Silv3S Thank you for your reply. The solution in your branch does work, but when the shapes are different, what is the problem with the code I submitted? I create...
@Silv3S Hello, I ran into a user-reported problem, the same as the one here. I solved it using the solution you provided, and I made a PR to fix conv + elementwise_add...
@jczaja Please help me to review this PR, thanks.
@Silv3S Please help me to review this PR, thanks.
I see that you are computing the mean over 10 repeated runs on the same input. Could you please test again: warm up for 100 iterations first, then take the mean over 1000 iterations? Let's see how the timing looks then.
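The warmup-then-measure protocol suggested above can be sketched as follows. This is a minimal stdlib-only sketch; the `benchmark` helper and its parameter names are illustrative, not part of any Paddle or PyTorch API:

```python
import time

def benchmark(fn, warmup=100, iters=1000):
    """Run fn() `warmup` times untimed, then return the mean
    wall-clock time per call over `iters` timed runs."""
    for _ in range(warmup):   # warm up caches, allocators, lazy init, etc.
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters

# Example: time a trivial workload in place of a model forward pass
mean_s = benchmark(lambda: sum(range(1000)))
print(f"mean latency: {mean_s * 1e6:.2f} us")
```

Timing only after a warmup phase avoids counting one-off costs (first-run kernel compilation, memory allocation) in the reported mean.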
> > It would be best to provide the full set of test code (Paddle and PyTorch) to us, because I have also compared against the Huggingface models before, and my test conclusions differ considerably from the ones you provided.
>
> Is it possible that the GPU was not used on my side? For CPU inference, Paddle indeed has an advantage. But for GPU inference, I already configured CUDAExecutionProvider in the code, and the code ran successfully without errors. If possible, could you provide a docker image that supports both PaddleNLP training and onnxruntime-gpu, so that I can test again?

After creating the predictor, call predictor.get_providers() and check whether CUDAExecutionProvider is in the list. Also, while it is running, check the GPU memory usage and GPU utilization.
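The provider check suggested above can be sketched like this. A minimal sketch: `session` stands for any object exposing onnxruntime's `InferenceSession.get_providers()` interface; a mock session is used here so the snippet runs without a model file:

```python
def cuda_enabled(session):
    """Return True if the session is actually using the CUDA provider.
    onnxruntime can silently fall back to CPU when the CUDA provider
    fails to load, so inspecting the active provider list matters."""
    return "CUDAExecutionProvider" in session.get_providers()

# Mock standing in for onnxruntime.InferenceSession("model.onnx", ...)
class FakeSession:
    def get_providers(self):
        # A real GPU session reports CUDA first, then the CPU fallback.
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]

print(cuda_enabled(FakeSession()))
```

If this check returns False on a real session despite passing CUDAExecutionProvider at construction, the GPU build of onnxruntime is likely not installed or could not load its CUDA libraries.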
Please try installing it with the following command: pip install packaging
Hello, please provide the model. Thanks.