llama.cpp
[Feature Request] Add batch processing for input prompt data in embedding mode.
It would be nice to add batch processing for input prompt data in embedding mode.
That is, read prompts from a file and output a map from each prompt to its embedding.
It seems this could be done by extending the logic of the `-f` flag for embedding mode.
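Until something like this exists in the embedding example itself, a rough sketch of the requested flow (load the model once, read one prompt per line from a file, emit a prompt → embedding map) could look like the following. It uses the llama-cpp-python bindings as a stand-in rather than the C++ example, and the model path and `prompts.txt` file name are placeholders, not anything from the original request.

```python
# Sketch only: assumes llama-cpp-python is installed; the model path and
# prompts.txt are placeholders chosen for illustration.
import json

from llama_cpp import Llama

# Load the model a single time with embedding support enabled.
llm = Llama(model_path="models/ggml-model.gguf", embedding=True)

# One prompt per line, skipping blank lines.
with open("prompts.txt", "r", encoding="utf-8") as f:
    prompts = [line.strip() for line in f if line.strip()]

# Map each prompt to its embedding vector, reusing the loaded model.
embeddings = {prompt: llm.embed(prompt) for prompt in prompts}

print(json.dumps(embeddings))
```

The key point is that the model is loaded once and all prompts are embedded in the same process, which is what a file-based batch mode in the C++ embedding example would also provide.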
I second this. I've been working on integrating llama.cpp into LangChain, but retrieving embeddings is terribly slow, since we can only pass single strings and the model is loaded anew for every call. Batch processing of embeddings would be very helpful here, preferably by allowing a list of strings to be passed on the CLI.
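As an interim workaround for the LangChain use case, a single `LlamaCppEmbeddings` instance can embed a whole list of strings via `embed_documents`, which at least avoids reloading the model per string. This is only a sketch: the import path may be `langchain.embeddings` in older LangChain releases, and the model path is a placeholder.

```python
# Sketch only: import path and model path depend on the installed versions.
from langchain_community.embeddings import LlamaCppEmbeddings

# The model is loaded once when the embedder is constructed.
embedder = LlamaCppEmbeddings(model_path="models/ggml-model.gguf")

texts = ["first document", "second document", "third document"]

# embed_documents() returns one embedding vector per input string,
# without reloading the model between calls.
vectors = embedder.embed_documents(texts)
print(len(vectors), len(vectors[0]))
```

A CLI flag that accepts a list of strings, or a file of prompts as in the original request, would still be the cleaner solution for non-Python callers.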
This issue was closed because it has been inactive for 14 days since being marked as stale.