Sam Stoelinga

223 comments by Sam Stoelinga

Facing the same issue when trying to build a GH200 image for vLLM 0.8.1. Did any of you figure out a workaround?

I was able to get it to work by building Triton 3.2.x from source. Working GH200 image for vLLM 0.8.1: `substratusai/vllm-gh200:v0.8.1`. Example docker run: `docker run...`
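The original `docker run` command is truncated above. A minimal sketch of what such an invocation could look like follows; the model name and serving flags are illustrative placeholders, not the exact command from the comment:

```shell
# Hypothetical example: serve a model with the GH200 image on port 8000.
# The model and the flags after the image name are placeholders;
# the vLLM image forwards them to the vLLM server.
docker run --gpus all --ipc=host -p 8000:8000 \
  substratusai/vllm-gh200:v0.8.1 \
  --model Qwen/Qwen2.5-7B-Instruct \
  --max-model-len 8192
```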

Yeah, I did make changes to torch-related code, since I ran `python3 use_existing_torch.py`. Note my image seems to have an issue with FlashInfer not being installed. Still...

I ended up changing the approach and got a 0.8.2 Docker image working instead: https://github.com/vllm-project/vllm/issues/10459#issuecomment-2759853357

Do you have a specific reranker API, model, and engine in mind? OpenAI doesn't provide a reranker API, and I am not too familiar with the reranking use case. So please provide...

Edit: The docs call out that Infinity adheres to the Cohere API, which is great. I think that may be our shortest route to supporting reranker models. @michaelfeil what's the API that...

It seems the Cohere API is also supported in other solutions: https://docs.continue.dev/customize/model-types/reranking I think supporting a reranker API would be great for fully local, private code completion with continue.dev. We support all endpoints...

It's definitely on the roadmap, and I think I have all the info I would need. Right now our focus is on adding PVC support and Prefix Cache...

Development hasn't started yet. We've been too busy with other things, and this is a relatively large feature. If you need this urgently, please reach out to me on LinkedIn.

Thank you for sharing this! We've been thinking of introducing a ModelAlias so you can have multiple models backing the same model endpoint. Imagine the use case where you have...
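To illustrate the idea, here is a purely hypothetical sketch of what a ModelAlias resource could look like; this is not an implemented or agreed-upon API, and every field name below is an assumption:

```yaml
# Hypothetical ModelAlias sketch -- not an implemented API.
# Two backing models serve one stable endpoint name, e.g. to roll
# out a new model version gradually behind the alias.
apiVersion: kubeai.org/v1
kind: ModelAlias
metadata:
  name: llama-prod
spec:
  targets:
    - model: llama-3.1-8b-v1   # placeholder model name
      weight: 90               # 90% of requests
    - model: llama-3.1-8b-v2   # placeholder model name
      weight: 10               # 10% of requests
```

Clients would keep requesting `model: llama-prod` while the weights shift traffic between the backing models.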