llama_index
[Question]: LLM completion giving bad results
Question Validation
- [X] I have searched both the documentation and discord for an answer.
Question
# Imports restored for context; `quantization_config` and `text` are defined elsewhere in the original script.
from llama_index.core import PromptTemplate
from llama_index.llms.huggingface import HuggingFaceLLM

def messages_to_prompt(messages):
    prompt = ""
    for message in messages:
        if message.role == 'system':
            prompt += f"<|system|>\n{message.content}</s>\n"
        elif message.role == 'user':
            prompt += f"<|user|>\n{message.content}</s>\n"
        elif message.role == 'assistant':
            prompt += f"<|assistant|>\n{message.content}</s>\n"
    # ensure we start with a system prompt, insert blank if needed
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt
    # add final assistant prompt
    prompt = prompt + "<|assistant|>\n"
    return prompt

llm = HuggingFaceLLM(
    model_name="HuggingFaceH4/zephyr-7b-alpha",
    tokenizer_name="HuggingFaceH4/zephyr-7b-alpha",
    query_wrapper_prompt=PromptTemplate("<|system|>\n</s>\n<|user|>\n{query_str}</s>\n<|assistant|>\n"),
    context_window=3900,
    max_new_tokens=256,
    model_kwargs={"quantization_config": quantization_config},  # quantization_config defined elsewhere
    # tokenizer_kwargs={},
    generate_kwargs={"temperature": 0.7, "top_k": 50, "top_p": 0.95},
    messages_to_prompt=messages_to_prompt,
    device_map="cuda:0",
)

llm.system_prompt = """<|system|>
You are an expert code extractor. Your job is to extract the code from a given text and enclose it like this: ```<extracted code>```. Remove all other content and return only the pure code lines.
"""

# text holds the document to extract code from (defined elsewhere)
true_res = llm.complete(f"Given the following text, if code is present extract all the code present and return it in a string format. If no code is present then say <NONE>\nText: {text}")
Using completion with LlamaIndex gives very bad results. I'm trying to extract only the code from the text, but it returns all the text back in the response. How can I improve this?
To improve the accuracy of extracting code from text using LlamaIndex and ensure it returns only the code blocks, consider the following adjustments:
- Refine the System Prompt: Ensure your system prompt clearly instructs the model to extract and return only code blocks. The current prompt might be too broad or not specific enough in guiding the model to ignore non-code text.
- Adjust `generate_kwargs` Parameters: Tweaking parameters like `temperature`, `top_p`, and `top_k` can significantly affect the output. A lower temperature can make the model's responses more deterministic and less prone to including irrelevant content (see the generation-settings sketch after this list).
- Utilize Regular Expressions: If the model still returns text with code, you can post-process the output using regular expressions to extract code blocks enclosed in triple backticks (```). This is an effective fallback to ensure you only get the code (see the regex sketch after this list).
- Review and Adjust the `query_wrapper_prompt`: This parameter is crucial for formatting the input query. Ensure it is set up to clearly communicate to the model that it needs to focus on extracting code from the provided text.
- Optimize the `context_window`: The size of the context window can influence the model's performance. Set it to balance providing enough context for accurate completions against overwhelming the model with too much information.
- Explore Similar Issues: Look into similar issues in the repository or discussions around code extraction with LLMs for additional insights or solutions that have worked for others.
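On the `generate_kwargs` point, a minimal sketch of more deterministic sampling settings for the constructor above; the specific values (temperature 0.1, etc.) are illustrative assumptions, not tuned recommendations.

# Sketch: swap the sampling settings in the HuggingFaceLLM constructor above
# for something more deterministic. Values are illustrative, not tuned.
deterministic_generate_kwargs = {
    "temperature": 0.1,   # much lower than 0.7, so the model drifts less into surrounding prose
    "do_sample": True,    # keep sampling, but tightly constrained
    "top_p": 0.9,
    "top_k": 50,
}
# e.g. HuggingFaceLLM(..., generate_kwargs=deterministic_generate_kwargs, ...)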
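For the regular-expression fallback, a sketch along these lines could be applied to the completion text; the pattern and the `<NONE>` handling are assumptions based on the prompt in the question, not part of any LlamaIndex API.

import re

def extract_code_blocks(completion_text: str) -> str:
    """Pull code fenced in triple backticks out of an LLM completion.

    Returns an empty string if the model answered <NONE> or produced no fences.
    """
    if "<NONE>" in completion_text:
        return ""
    # Match ```...``` blocks, optionally preceded by a language tag, across newlines.
    blocks = re.findall(r"```(?:[\w+-]*\n)?(.*?)```", completion_text, re.DOTALL)
    return "\n\n".join(block.strip() for block in blocks)

# Usage with the completion from the question:
# code_only = extract_code_blocks(true_res.text)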
By focusing on these areas, you should be able to enhance the model's ability to extract just the code from the provided text, improving the relevance and accuracy of the outputs. If you encounter further issues or need more detailed guidance, feel free to ask.
Sources
- [Help] Providing the full/better scope to the LLM
- llama-index-core/llama_index/core/output_parsers/utils.py
- llama-index-legacy/llama_index/legacy/llms/huggingface.py
- llama-index-integrations/llms/llama-index-llms-bedrock/llama_index/llms/bedrock/llama_utils.py
- llama-index-integrations/llms/llama-index-llms-llama-cpp/llama_index/llms/llama_cpp/llama_utils.py
- llama-index-integrations/llms/llama-index-llms-huggingface/llama_index/llms/huggingface/base.py
- llama-index-integrations/llms/llama-index-llms-huggingface/tests/test_huggingface.py
@JINO-ROHIT this has nothing to do with llama-index really, and everything to do with the LLM and what you are asking it to do.
Either
- do some iterations and prompt engineer a bit
- find a different LLM (maybe codellama or something)
- fine-tune a model
Also, I would recommend looking into using Ollama. The setup is 1000x easier (no need to worry about prompt templates).
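If you go the Ollama route, the LlamaIndex integration is a thin wrapper. A minimal sketch, assuming the `llama-index-llms-ollama` package is installed, a local Ollama server is running, and a code-oriented model such as codellama has already been pulled:

# Sketch: run `ollama pull codellama` first against a local Ollama server.
from llama_index.llms.ollama import Ollama

llm = Ollama(model="codellama", request_timeout=120.0)
response = llm.complete("Extract only the code from the following text ...")
print(response.text)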