Benjamin Bartels

31 comments by Benjamin Bartels

Would https://github.com/microsoft/PowerToys/pull/32125 fix https://github.com/microsoft/PowerToys/issues/10393 as well?

EDIT: Just confirmed by compiling locally that this does indeed fix https://github.com/microsoft/PowerToys/issues/10393 🎉

@yiyeguhu Could you fix the pre-commit issues? All other tests seem to have passed! See here: https://github.com/vllm-project/vllm/actions/runs/19263038741/job/55072168601?pr=24847
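For reference, a minimal sketch of reproducing those checks locally, assuming the repo's pre-commit config is in place:

```
# Run the repo's pre-commit hooks locally before pushing
pip install pre-commit
pre-commit run --all-files          # or: pre-commit run --files <changed files>
```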

> @bbartels Could you help submit a new PR based on this one?

@chaunceyjiang Have a look at https://github.com/vllm-project/vllm/pull/28831

Out of curiosity, would this be a reasonable ticket to pick up for a community contribution? Am I correct in assuming that this wouldn't really be touching query pipeline logic?...

We send some information about a generated request body in trailer headers, as this information is only available once the full response has been generated. Sadly, neither requests nor httpx supports...
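For illustration only (not code from this project), this is roughly what a chunked HTTP/1.1 response carrying trailer headers looks like on the wire; the `X-Generation-Info` field name and body are made up:

```
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
Trailer: X-Generation-Info

b
{"ok":true}
0
X-Generation-Info: only-known-after-the-body-is-complete

```

A client that never surfaces trailers simply drops `X-Generation-Info`, which is the gap being described here.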

@tomchristie Would requiring a user to specifically opt-in to receiving trailer headers for a given request be sufficient to get this moving? That way for the 99.9% case there is...

> It's more the code overhead that's an issue. Particularly since the HTTP/2 implementation is already fiddly and hard to follow.

Fair point, I am relatively unfamiliar with the project...

`flashinfer show-config`

```
root@machine:/vllm-workspace$ flashinfer show-config
=== Version Info ===
FlashInfer version: 0.5.2
flashinfer-cubin version: 0.5.2
flashinfer-jit-cache version: 0.5.2+cu129
=== Torch Version Info ===
Torch version: 2.9.0+cu129
CUDA runtime available: ...
```

Here is the output of `find .` in `/usr/local/lib/python3.12/dist-packages/flashinfer_cubin/cubins`: https://gist.github.com/bbartels/940176ed78e8e682bf7099e94b1d7f4c

We do operate in an air-gapped environment, if that makes a difference.

@yzh119 `docker pull vllm/vllm-openai:nightly-0b25498990f01ea2553c02731d6e2ce2d550156a`

For reference, this is how flashinfer is installed in the docker image: https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile#L362

We are running things on H200 nodes.
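For completeness, a rough sketch of how an image like that gets launched on a GPU node; the flags and `<model-name>` placeholder are illustrative, not our exact command:

```
# Illustrative only: pull and launch the nightly image
docker pull vllm/vllm-openai:nightly-0b25498990f01ea2553c02731d6e2ce2d550156a
docker run --runtime nvidia --gpus all -p 8000:8000 \
    vllm/vllm-openai:nightly-0b25498990f01ea2553c02731d6e2ce2d550156a \
    --model <model-name>
```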