eryyes
I am using an Agent and want to stream only the final response. Is that already supported, and if so, how do I do it?
GPU memory is not released
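PaddleClasModel is started on the GPU. As in the tutorial, I call clone() when starting multiple threads, but the GPU memory is not released after the threads finish. How can I solve this?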
Container image: registry.baidubce.com/paddlepaddle/fastdeploy:1.0.7-gpu-cuda11.4-trt8.5-21.10
After 10,000 calls, GPU memory was exhausted:
W0314 04:50:46.438977 62225 memory.cc:135] Failed to allocate CUDA memory with byte size 79027200 on GPU 1: CNMEM_STATUS_OUT_OF_MEMORY, falling back to pinned system memory
0314 05:01:17.338640 62420 pb_stub.cc:402] Failed...
Can VGT recognize Chinese document layouts? If so, how can this be implemented?