biaochen issues

Results: 6 issues of biaochen

change select with poll or epoll

select is used as the I/O multiplexing tool. Can it be replaced with poll or epoll?

Serve tf-trt converted model return error: NodeDef mentions attr 'max_batch_size' not in Op: name=TRTEngineOp

I want to use tf-trt to optimize a tf2 model and then serve it with triton, but serving the optimized tf-trt model fails. The process is as follows: 1. following this...

run demo generation failed

### System Info x86_64 V100 triton server image: nvcr.io/nvidia/tritonserver:23.12-trtllm-python-py3 tensorrtllm_backend: v0.7.1 ### Who can help? _No response_ ### Information - [X] The official example scripts - [ ] My own...

bug

triaged

speculative decoding performance

I've tested the speculative decoding feature using llama3 models. I converted the draft/target models to trt engines and launched triton server with the bls model, but there seems to be no performance gain. environment settings:...

[Bug] Eagle fail on Llama3-8b

### Checklist - [ ] 1. I have searched related issues but cannot get the expected help. - [ ] 2. The bug has not been fixed in the latest...

speculative decoding not work

Hi Team, I'm testing the speculative decoding feature with trtllm, but have run into an issue. My settings are as follows: hardware: A100 80G software: nvcr.io/nvidia/tritonserver:25.01-trtllm-python-py3 model: gemma-2-2b-it / gemma-2-27b-it ``` cd /llm/tmp/trtllm/v0.17/TensorRT-LLM/examples/gemma/ ```...

triaged

stale

waiting for feedback