
The selfrag_llama2_7b model does not produce the output shown in the example

Open · leejaehoon1830 opened this issue 1 year ago · 4 comments

I experimented using the settings provided in the example at https://huggingface.co/selfrag/selfrag_llama2_7b, but every prediction came back empty ("Model prediction:" followed by a blank string). However, when using the model at https://huggingface.co/selfrag/self_rag_critic, the results come out as expected.


from transformers import AutoTokenizer, AutoModelForCausalLM
from vllm import LLM, SamplingParams

model = LLM("selfrag/selfrag_llama2_7b", download_dir="/gscratch/h2lab/akari/model_cache", dtype="half")
sampling_params = SamplingParams(temperature=0.0, top_p=1.0, max_tokens=100, skip_special_tokens=False)

def format_prompt(input, paragraph=None):
    prompt = "### Instruction:\n{0}\n\n### Response:\n".format(input)
    if paragraph is not None:
        prompt += "[Retrieval]<paragraph>{0}</paragraph>".format(paragraph)
    return prompt

query_1 = "Leave odd one out: twitter, instagram, whatsapp."
query_2 = "Can you tell me the difference between llamas and alpacas?"
queries = [query_1, query_2]

preds = model.generate([format_prompt(query) for query in queries], sampling_params)
for pred in preds:
    print("Model prediction: {0}".format(pred.outputs[0].text))

Expected result:

Model prediction: Twitter, Instagram, and WhatsApp are all social media platforms.[No Retrieval]WhatsApp is the odd one out because it is a messaging app, while Twitter and Instagram are primarily used for sharing photos and videos.[Utility:5]
(this query doesn't require factual grounding; just skip retrieval and do normal instruction-following generation)

=> But I got a blank result.

Expected result:

Model prediction: Sure![Retrieval] ...
(this query requires factual grounding; call a retriever)

=> But I got a blank result.

Generate with a retrieved passage:

prompt = format_prompt("Can you tell me the difference between llamas and alpacas?", paragraph="The alpaca (Lama pacos) is a species of South American camelid mammal. It is similar to, and often confused with, the llama. Alpacas are considerably smaller than llamas, and unlike llamas, they were not bred to be working animals, but were bred specifically for their fiber.")
preds = model.generate([prompt], sampling_params)
print([pred.outputs[0].text for pred in preds])

Expected result:

['[Relevant]Alpacas are considerably smaller than llamas, and unlike llamas, they were not bred to be working animals, but were bred specifically for their fiber.[Fully supported][Utility:5]']

=> But I got a blank result.
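For what it's worth, here is a minimal debugging sketch (my addition; it assumes vLLM's RequestOutput exposes token_ids on each completion, which it does in the versions I have used). If the model is emitting token IDs but the decoded text is blank, the problem is in detokenization rather than in the weights:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("selfrag/selfrag_llama2_7b")
preds = model.generate([format_prompt(query) for query in queries], sampling_params)
for pred in preds:
    ids = pred.outputs[0].token_ids
    # Non-empty ids together with blank text points at the detokenizer, not the model.
    print("token ids:", ids)
    print("decoded  :", tokenizer.decode(ids, skip_special_tokens=False))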


leejaehoon1830 — Jan 26 '24

I'm curious what your question is. Maybe I can provide some help~

fate-ubw — Jan 26 '24

My problem is similar to https://github.com/AkariAsai/self-rag/issues/30.

leejaehoon1830 — Jan 29 '24

Well, I got <unk>; all the text is <unk>.
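If you are seeing <unk>, a quick sketch to check whether the tokenizer you are loading actually contains the Self-RAG reflection tokens (the token strings below are copied from the expected outputs above; tokens missing from the vocabulary map to the unk id):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("selfrag/selfrag_llama2_7b")
print("unk id:", tok.unk_token_id)
for t in ["[Retrieval]", "[No Retrieval]", "[Relevant]", "[Fully supported]", "[Utility:5]"]:
    # Anything that prints the unk id is not in this tokenizer's vocabulary,
    # which would explain <unk> showing up in the generations.
    print(t, "->", tok.convert_tokens_to_ids(t))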

notoookay — Mar 03 '24

Do you mind providing the vllm version? This isn't directly about Self-RAG, but I've recently encountered similar issues when loading Mixtral models (e.g., outputs are all blank), and I wonder if this happens due to some vllm-side issue...
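To report it, something like this is enough (vllm exposes __version__ in the releases I have checked):

import vllm
print(vllm.__version__)

or, from the shell, pip show vllm.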

AkariAsai — Mar 19 '24