Piotr Marcinkiewicz
Thank you for your question. It seems like you're trying to set the priority for inference requests when using a `DynamicBatcher` in Triton Inference Server. From the code you posted,...
I will consider your request as a feature proposal.
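For context, here is a minimal sketch of the priority-related knobs that `DynamicBatcher` exposes today (the model name, tensor names, and concrete values are illustrative); setting a priority per request from the client is the part that would need the proposed feature:

```python
import numpy as np

from pytriton.decorators import batch
from pytriton.model_config import DynamicBatcher, ModelConfig, Tensor
from pytriton.triton import Triton


@batch
def infer_fn(INPUT_1: np.ndarray):
    # Passthrough callable; the batcher configuration is the point here.
    return {"OUTPUT_1": INPUT_1}


# priority_levels / default_priority_level are the priority-related fields
# DynamicBatcher exposes; in Triton's scheduler, level 1 is the highest priority.
batcher = DynamicBatcher(
    max_queue_delay_microseconds=100,
    priority_levels=2,
    default_priority_level=2,
)

with Triton() as triton:
    triton.bind(
        model_name="Prioritized",  # hypothetical model name
        infer_func=infer_fn,
        inputs=[Tensor(name="INPUT_1", dtype=np.float32, shape=(-1,))],
        outputs=[Tensor(name="OUTPUT_1", dtype=np.float32, shape=(-1,))],
        config=ModelConfig(max_batch_size=16, batcher=batcher),
    )
    triton.serve()
```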
Integrating Whisper (via TensorRT-LLM) with PyTriton for speech-to-text is an exciting challenge. However, PyTriton doesn't currently support streaming directly; for streaming audio, we might need a different setup,...
@yuekaizhang, thank you for your proposal. While I recognize the potential value of integrating your Whisper example with the PyTriton repository, there are some complexities related to the additional dependencies...
From your inquiry, it seems you are looking to run PyTriton in an environment with glibc 2.32 to serve your reward model, but are facing difficulties due to version compatibility...
Thank you for providing more details on your requirements. Since you need Python 3.10 support, but are constrained by the older glibc version, here's an adjusted approach that might work...
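One workaround sketch, assuming you can run the server inside a newer-glibc container and only need to call it from the glibc-2.32 host: the standalone `tritonclient` package does not bundle the Triton server binary, so the host-side code can stay thin. The model and tensor names below are hypothetical:

```python
# Runs on the glibc-2.32 host; the PyTriton server runs elsewhere
# (e.g., in an Ubuntu 22.04 container) and is reachable on port 8000.
# Requires: pip install "tritonclient[http]" numpy
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Hypothetical reward-model signature: one FP32 input, one FP32 output.
inp = httpclient.InferInput("INPUT_1", [1, 3], "FP32")
inp.set_data_from_numpy(np.array([[0.1, 0.2, 0.3]], dtype=np.float32))

result = client.infer("RewardModel", inputs=[inp])
print(result.as_numpy("OUTPUT_1"))
```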
@martin-liu We are considering migrating the PyTriton client to the [tritonclient](https://github.com/triton-inference-server/client) repository. It will have a new API much better aligned with Triton. We have several proposals for the new API revamp. **Synchronous interface**...
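The proposals themselves are elided above; for reference, here is a sketch of the current synchronous interface in `pytriton.client` (the model name and input are illustrative):

```python
import numpy as np

from pytriton.client import ModelClient

# ModelClient is the current blocking client; infer_batch sends one batch
# and returns a dict mapping output names to numpy arrays.
with ModelClient("localhost", "Linear") as client:
    batch = np.array([[1.0, 2.0, 3.0]], dtype=np.float32)
    outputs = client.infer_batch(batch)
```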
I appreciate your efforts in troubleshooting the issue with the Triton server. To further assist, I recommend starting by ensuring your PyTriton installation is updated to the latest version (0.5.3). To debug,...
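As one debugging step, the server can be started with verbose logging through `TritonConfig` (a sketch; `4` is just an example verbosity level):

```python
from pytriton.triton import Triton, TritonConfig

# log_verbose raises the tritonserver log level; logs go to the console
# unless log_file is also set in TritonConfig.
config = TritonConfig(log_verbose=4)
with Triton(config=config) as triton:
    ...  # bind the model here as usual, then call triton.serve()
```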
Thank you for your interest in PyTriton and stateful models. I have experimented with sequence support in the client, but I realized that PyTriton does not pass sequence information to...
Let's start with a simple Linear model, which takes a single input tensor and returns the negative of the input tensor:

```python
import numpy as np

from pytriton.decorators import batch


@batch
def infer_fn(INPUT_1: np.ndarray):
    # Negate every element of the batched input tensor.
    return {"OUTPUT_1": -INPUT_1}
```
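With the callable defined, here is a sketch of binding and serving it (the model name and tensor names are illustrative):

```python
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton

with Triton() as triton:
    # Bind the Python callable under a model name so Triton can route
    # inference requests to it.
    triton.bind(
        model_name="Linear",
        infer_func=infer_fn,
        inputs=[Tensor(name="INPUT_1", dtype=np.float32, shape=(-1,))],
        outputs=[Tensor(name="OUTPUT_1", dtype=np.float32, shape=(-1,))],
        config=ModelConfig(max_batch_size=8),
    )
    triton.serve()
```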