
[FT] Support llama.cpp inference

Open JoelNiklaus opened this issue 11 months ago • 3 comments

Issue encountered

Currently, inference with open models on my Mac is quite slow, since vLLM does not support MPS (Apple's Metal backend).

Solution/Feature

llama.cpp does support MPS and would significantly speed up local evaluation of open models.
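
For illustration, here is a minimal sketch of what llama.cpp-backed inference looks like through the llama-cpp-python bindings, which offload layers to the GPU (Metal on Apple Silicon); the model path is hypothetical and any local GGUF checkpoint would do:

```python
from llama_cpp import Llama

# Hypothetical local GGUF checkpoint path.
llm = Llama(
    model_path="models/llama-3-8b-instruct.Q4_K_M.gguf",
    n_gpu_layers=-1,  # offload all layers to the GPU (Metal on macOS)
    n_ctx=4096,       # context window size
)

# Simple completion call; llama-cpp-python returns an OpenAI-style dict.
out = llm("Q: What is the capital of France? A:", max_tokens=16)
print(out["choices"][0]["text"])
```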

Possible alternatives

Allowing the use of the `mps` device in the other model-loading backends would also work.
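
As a point of comparison, plain transformers already runs on the `mps` device, so an alternative would be routing model loading through something like the following (a sketch of the general pattern, not lighteval's actual loading code; `gpt2` is just a stand-in model):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Fall back to CPU if the Metal backend is unavailable.
device = "mps" if torch.backends.mps.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```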

JoelNiklaus · Nov 22 '24 09:11