pengxin233

9 issues by pengxin233

### 📚 The doc issue When I call `curl http://127.0.0.1:8082/metrics`, it always returns empty results, even when called after model inference. But there is clearly a corresponding log...

### 📚 The doc issue I want to use gRPC in a Java service to call TorchServe's model, but I can't seem to find any relevant documentation. ### Suggest...
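TorchServe ships gRPC proto definitions (e.g. inference.proto) in its repository, and any protoc-supported language, Java included, can generate client stubs from them. As a minimal sketch of the call flow, here is the Python equivalent (assumptions: stubs already generated from inference.proto, the default gRPC inference port 7070, and a placeholder model name and input file; the same request shape applies from Java stubs):

```python
# Minimal sketch, assuming inference_pb2 / inference_pb2_grpc were generated
# from TorchServe's inference.proto (e.g. via grpc_tools.protoc). Model name
# "resnet18" and file "kitten.jpg" are placeholders.
import grpc

import inference_pb2
import inference_pb2_grpc

channel = grpc.insecure_channel("localhost:7070")  # default gRPC inference port
stub = inference_pb2_grpc.InferenceAPIsServiceStub(channel)

with open("kitten.jpg", "rb") as f:
    data = f.read()

# Predictions RPC takes the model name plus a map of input name -> raw bytes
request = inference_pb2.PredictionsRequest(model_name="resnet18", input={"data": data})
response = stub.Predictions(request)
print(response.prediction.decode("utf-8"))
```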

Can I convert a TorchScript model to TensorRT format through torch_tensorrt? Is there a corresponding script you could give me for reference?

question
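On the TorchScript-to-TensorRT question above: torch_tensorrt can compile a loaded TorchScript module directly. A minimal sketch, assuming a compatible TensorRT/CUDA install and a placeholder model path and input shape:

```python
# Minimal sketch, assuming torch_tensorrt is installed against a matching
# TensorRT/CUDA stack; "model.ts" and the input shape are placeholders.
import torch
import torch_tensorrt

# Load an existing TorchScript module
model = torch.jit.load("model.ts").eval().cuda()

# Compile the TorchScript module into a TensorRT-embedded module
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224), dtype=torch.float32)],
    enabled_precisions={torch.float32},  # add torch.half to allow FP16 kernels
)

# The result is still a TorchScript module, so it saves/loads like the original
torch.jit.save(trt_model, "model_trt.ts")
```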

### 📚 The doc issue I set the batchSize of the registered model to 10, and then set the micro_batch_size to 1. So for model inference, will it wait for...

### 🐛 Describe the bug On my first request to TorchServe, the score during warm-up was inconsistent with the score after warm-up completed. But any score after the warm-up...

triaged

How do I serialize the converted engine after using onnx-tensorrt? I didn't see any relevant content in the documentation.
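One way to end up with a serialized engine on disk is to build it with TensorRT's own Python API (an alternative route to the onnx-tensorrt CLI). A minimal sketch, assuming TensorRT 8.x and a placeholder model.onnx:

```python
# Minimal sketch using TensorRT's Python ONNX parser; assumes TensorRT 8.x
# and a local "model.onnx" (placeholder path).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
# build_serialized_network returns the engine already serialized (TRT >= 8)
engine_bytes = builder.build_serialized_network(network, config)

with open("model.engine", "wb") as f:
    f.write(engine_bytes)

# Later: deserialize the saved engine and use it for inference
runtime = trt.Runtime(logger)
with open("model.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
```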

I found that GPU memory usage increases every once in a while during operation. I want to confirm whether there is a risk of a GPU memory leak in cvcuda...

question
need more info
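To tell whether device memory on the question above really grows without bound (a leak) or plateaus (allocator caching), one can sample it periodically. A minimal monitoring sketch, assuming nvidia-ml-py (pynvml) is installed; the device index and interval are arbitrary choices:

```python
# Minimal monitoring sketch, assuming nvidia-ml-py (pynvml) is installed;
# device index 0 and the 5 s sampling interval are arbitrary.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

try:
    while True:
        info = pynvml.nvmlDeviceGetMemoryInfo(handle)
        # A steadily rising "used" over many iterations suggests a leak;
        # a plateau usually just means allocator/pool caching.
        print(f"used={info.used / 2**20:.1f} MiB  free={info.free / 2**20:.1f} MiB")
        time.sleep(5)
finally:
    pynvml.nvmlShutdown()
```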

Hello, can I ask whether Qdrant can run on a GPU? Would it be faster than on a CPU?

For Qdrant, when is newly inserted data merged into the index?