llama.cpp
llama : add T5 (encoder-decoder) support
I'm still not familiar with the details, but it seems it would be useful to support this architecture in llama.cpp. First, we need to decide on the API and see what changes would be necessary.
See discussion here: https://github.com/ggerganov/llama.cpp/issues/247
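To make the architectural difference concrete, here is a toy sketch of the encoder-decoder inference flow that T5 uses, in contrast to llama.cpp's current decoder-only loop. The function names and logic are purely illustrative stand-ins, not llama.cpp API:

```python
# Toy sketch of T5-style encoder-decoder inference.
# All names are illustrative; none of this is llama.cpp API.

def encode(input_tokens):
    # A real encoder runs bidirectional self-attention over the whole
    # input at once. Toy stand-in: one "hidden state" per input token.
    return [t * 2 for t in input_tokens]

def decode_step(encoder_states, decoded_so_far):
    # A real decoder step runs causal self-attention over the tokens
    # decoded so far, plus cross-attention into the encoder states.
    # Toy stand-in: a deterministic function of both inputs.
    return (sum(encoder_states) + len(decoded_so_far)) % 100

def generate(input_tokens, n_new_tokens):
    # The key structural point: the encoder runs exactly once per
    # input, then the decoder runs autoregressively, one token per step.
    states = encode(input_tokens)
    out = []
    for _ in range(n_new_tokens):
        out.append(decode_step(states, out))
    return out
```

The API question in the issue is essentially where this one-time encode pass fits: a decoder-only model only needs the autoregressive loop, while T5 needs a separate encoder call whose output is fed to every decoder step.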
@ggerganov Does this mean llama.cpp could support something like the new GritLM model, which can handle both text representation (embeddings) and text generation? I tried the embedding example with GritLM, but the resulting embeddings don't look right.
Some references: https://github.com/ContextualAI/gritlm/blob/92025b16534712b31b3c4aaaf069350e222bd5f8/gritlm/gritlm.py#L93 https://huggingface.co/GritLM/GritLM-7B
This issue is about a different architecture (encoder + decoder). GritLM looks like a decoder-only Mistral fine-tune, so it should already work. If you think the results are not OK, you can open an issue with steps to reproduce.
I am looking forward to this. How much work would be needed to implement it?
@dranger003 Probably that's because GritLM uses two prompt templates: one for text generation and one for embeddings. Can you try embedding with the template specified by the author?
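For reference, the two formats can be sketched as below. The template strings follow the gritlm.py file linked earlier (the `<|embed|>` / `<|assistant|>` markers); treat the exact strings as an assumption and verify them against the GritLM repo:

```python
# Sketch of GritLM's two prompt formats, based on the linked gritlm.py.
# The exact template strings are an assumption; check the GritLM repo.

def embed_prompt(instruction: str, text: str) -> str:
    # Embedding path: the instruction block ends with <|embed|>;
    # the text to embed follows it.
    prefix = (f"<|user|>\n{instruction}\n<|embed|>\n"
              if instruction else "<|embed|>\n")
    return prefix + text

def generation_prompt(message: str) -> str:
    # Generation path: a Zephyr-style chat turn ending with <|assistant|>.
    return f"<|user|>\n{message}\n<|assistant|>\n"
```

Feeding the generation template to the embedding path (or vice versa) would plausibly explain embeddings that "don't look right" while the model itself works fine.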
Feel free to open a dedicated issue to discuss this in detail.
@ngxson thanks, I used the proper template. I opened an issue with a sample program.
T5 support would be truly awesome, opening up many enterprise use cases.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Any update on this issue?