[FT] Support llama.cpp inference
Issue encountered
Currently, inference of open models on my Mac is quite slow because vLLM does not support MPS (Apple's Metal Performance Shaders backend).
Solution/Feature
llama.cpp does support GPU acceleration on Apple Silicon (via Metal) and would significantly speed up local evaluation of open models.
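For reference, a minimal sketch of what such a backend could wrap, using the llama-cpp-python bindings with full Metal offload. The model path, prompt, and parameters below are placeholders, not an actual lighteval API:

```python
# Minimal sketch: llama-cpp-python with Metal (Apple GPU) offload.
# Assumes `pip install llama-cpp-python` built with Metal support (the default on macOS)
# and a local GGUF model file; the path below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-3-8b-instruct.Q4_K_M.gguf",  # placeholder GGUF path
    n_gpu_layers=-1,   # offload all layers to the Apple GPU via Metal
    n_ctx=4096,        # context window
    logits_all=True,   # needed if per-token logprobs are required for loglikelihood metrics
    verbose=False,
)

# Greedy generation, roughly what a backend would do for generative tasks.
out = llm(
    "Question: What is the capital of France?\nAnswer:",
    max_tokens=16,
    temperature=0.0,
)
print(out["choices"][0]["text"])
```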
Possible alternatives
Allowing the use of the MPS device in the other model-loading paths (e.g. the transformers-based backend) would also work; a rough sketch is below.
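A minimal sketch of that alternative, running a transformers model directly on the MPS device. The model name is a placeholder, and this assumes the backend would accept `device="mps"` (or detect it automatically):

```python
# Minimal sketch: loading and generating with a transformers model on MPS.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "mps" if torch.backends.mps.is_available() else "cpu"
model_name = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder model id

tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision to fit on a Mac GPU
).to(device)

inputs = tok("The capital of France is", return_tensors="pt").to(device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=8, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```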