Results 172 comments of Rémi

That seems to rule out any of the extra optimizations we enabled (e.g. `--reduce_fusion` or `--user_buffer`, which are Llama-specific). I don't know enough about the internals of TensorRT-LLM but maybe...

Thanks, I guess we'll need someone from Nvidia to chime in here to make progress. Given that it seems to happen in pretty different setups on a very common model...

Hi, is there any update? This issue alone makes it pretty much impossible to use TensorRT-LLM for any serious production load (unless the inflight batcher is not in use).

Hi, it seems like a new (pretty big) update was released yesterday: https://github.com/triton-inference-server/tensorrtllm_backend/pull/687 + https://github.com/NVIDIA/TensorRT-LLM/pull/2725 Skimming through the diff I did not see any changes to the inflight batcher so...

@hypdeb do you have any insights on this issue by any chance? I see you have commented on similar-looking issues recently.

Hi @murenti, it is currently possible to delete a Goggle by following these steps: https://github.com/brave/goggles-quickstart/blob/main/getting-started.md#deleting-a-goggle I hope that helps.

Would you be able to share the URL of the Goggle you'd like to delete? (If it is private, we can do that via the support email instead.)

Hi @hicallmeal, in general it should be safe to use the experimental version of tldts, but it really depends on your particular use case. What would be the cost if the...

Hi @marcospassos and @samczsun, Depending on which specification is followed, it is unclear whether underscores are allowed in a hostname at all (I believe we...