Eric Buehler
Hi @ppmpreetham! This project isn't under active development by me, but I'm always open to PRs.
```
RING_PORT=1500 RING_RIGHT=1501 RING_RANK=0 RING_WORLD_SIZE=2 cargo run --features metal,ring --release -- -i --throughput plain -m ../hf_models/llama3.2_3b
RING_PORT=1501 RING_RIGHT=1500 RING_RANK=1 RING_WORLD_SIZE=2 cargo run --features metal,ring --release -- -i --throughput plain -m...
```
```
cargo run --features cuda -- -i plain -m kaitchup/Phi-3-mini-4k-instruct-gptq-4bit -a phi3
```
@BuildBackBuehler If you could add this, it would be amazing! I haven't seen GPTQ kernels on Mac, though; if you can find any, it shouldn't be too hard to add...
@Aveline67 can you please share the code? Phi 3.5 vision instruct can support multiple images; just add messages with the corresponding images! See the sketch below.
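A minimal sketch of a multi-image request, following the pattern of the Python vision examples in this repo; the image URLs and the prompt here are placeholders, and the `<|image_N|>` tags are the ones Phi 3.5 vision expects:

```py
from mistralrs import Runner, Which, ChatCompletionRequest, VisionArchitecture

runner = Runner(
    which=Which.VisionPlain(
        model_id="microsoft/Phi-3.5-vision-instruct",
        arch=VisionArchitecture.Phi3V,
    ),
)

res = runner.send_chat_completion_request(
    ChatCompletionRequest(
        model="phi3v",
        messages=[
            {
                "role": "user",
                "content": [
                    # Placeholder URLs: one image_url entry per image.
                    {"type": "image_url", "image_url": {"url": "https://example.com/image1.jpg"}},
                    {"type": "image_url", "image_url": {"url": "https://example.com/image2.jpg"}},
                    # Reference each image with its <|image_N|> tag in the text.
                    {"type": "text", "text": "<|image_1|>\n<|image_2|>\nWhat is shown in these two images?"},
                ],
            }
        ],
        max_tokens=256,
        temperature=0.1,
    )
)
print(res.choices[0].message.content)
```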
@Aveline67 I see! It looks like this should be fixed as well, in addition to what you mentioned. Please feel free to open a PR!
@Aveline67 have you been able to publish the PR?
Looks like you selected the `Gemma` instead of the `Gemma2` architecture, this should work:

```py
from mistralrs import Runner, Which, ChatCompletionRequest, Architecture

runner = Runner(
    Which.Plain(
        model_id="google/gemma-2-9b-it",
        repeat_last_n=64,
        tokenizer_json=None,
        arch=Architecture.Gemma2,
...
```
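For completeness, a sketch of sending a request with that runner, mirroring the repo's Python README example; the `model` name and prompt here are illustrative:

```py
# Illustrative usage, assuming `runner` was built with Architecture.Gemma2 as above.
res = runner.send_chat_completion_request(
    ChatCompletionRequest(
        model="gemma2",  # placeholder name for this request
        messages=[{"role": "user", "content": "Tell me a story about the Rust type system."}],
        max_tokens=256,
        temperature=0.1,
    )
)
print(res.choices[0].message.content)
```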
@agravier, closing this as I don't think it is a bug, but please feel free to reopen!
Glad to hear that! For questions, I would recommend our [Discussions](https://github.com/EricLBuehler/mistral.rs/discussions) page, which is separate from the issue tracker and better suited for questions. There is also a [Discord](https://discord.com/invite/SZrecqK8qw)...