Yuchao Zhang

116 comments by Yuchao Zhang

I'll collect the issues and fix them in one PR.

Does verbose use logging or just print? I want to add logging info to that.

I built a tool that gives TensorRT-LLM an OpenAI-compatible API. Feel free to give it a try: https://github.com/npuichigo/openai_trtllm

> @npuichigo has mentioned an integration that could be useful here: > > [NVIDIA/TensorRT-LLM#591](https://github.com/NVIDIA/TensorRT-LLM/discussions/591) https://github.com/npuichigo/openai_trtllm provides an OpenAI-like API for trtllm triton backend, but I think vllm in triton would...

On my side, when request traffic increases, I get a 502 error and find the same error in the model worker's log.

I think the behavior is related to these lines, which short-circuit the error handling. https://github.com/huggingface/datasets/blob/664a1cb72ea1e6ef7c47e671e2686ca4a35e8d63/src/datasets/load.py#L946-L952 So should `data_dir` be checked here, or should it still be delegated to the actual `DatasetModule`? In that...

Thanks. I used the `tower-http` middleware to attach both request and response headers to a span, and it now works, showing both `request_id` added as `x-request-id` and `trace_id` added by...

Any update on this? @sgugger by the way, what's the advantage of using dispatch_batches?