llama_index
[Question]: LLM completion giving bad results
Question Validation
- [X] I have searched both the documentation and discord for an answer.
Question
# Imports restored for context; `quantization_config` and `text` are defined elsewhere in the original script.
from llama_index.core import PromptTemplate
from llama_index.llms.huggingface import HuggingFaceLLM

def messages_to_prompt(messages):
    prompt = ""
    for message in messages:
        if message.role == 'system':
            prompt += f"<|system|>\n{message.content}</s>\n"
        elif message.role == 'user':
            prompt += f"<|user|>\n{message.content}</s>\n"
        elif message.role == 'assistant':
            prompt += f"<|assistant|>\n{message.content}</s>\n"
    # ensure we start with a system prompt, insert blank if needed
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt
    # add final assistant prompt
    prompt = prompt + "<|assistant|>\n"
    return prompt

llm = HuggingFaceLLM(
    model_name="HuggingFaceH4/zephyr-7b-alpha",
    tokenizer_name="HuggingFaceH4/zephyr-7b-alpha",
    query_wrapper_prompt=PromptTemplate("<|system|>\n</s>\n<|user|>\n{query_str}</s>\n<|assistant|>\n"),
    context_window=3900,
    max_new_tokens=256,
    model_kwargs={"quantization_config": quantization_config},  # quantization_config defined elsewhere
    # tokenizer_kwargs={},
    generate_kwargs={"temperature": 0.7, "top_k": 50, "top_p": 0.95},
    messages_to_prompt=messages_to_prompt,
    device_map="cuda:0",
)

llm.system_prompt = """<|system|>
You are an expert code extractor. Your job is to extract the code from a given text and enclose it like this: ```<extracted code>```. Remove all other content and return only the pure code lines.
"""

# text holds the document to extract code from (defined elsewhere)
true_res = llm.complete(f"Given the following text, if code is present extract all the code present and return it in a string format. If no code is present then say <NONE>\nText: {text}")
Using completion with LlamaIndex gives very bad results. I'm trying to extract only the code from the text, but it returns all the text back in the response. How can I improve this?
To improve the accuracy of extracting code from text using LlamaIndex and ensure it returns only the code blocks, consider the following adjustments:
- Refine the System Prompt: Ensure your system prompt clearly instructs the model to extract and return only code blocks. The current prompt might be too broad or not specific enough in guiding the model to ignore non-code text.
- Adjust `generate_kwargs` Parameters: Tweaking parameters like `temperature`, `top_p`, and `top_k` can significantly affect the output. A lower temperature can make the model's responses more deterministic and less prone to including irrelevant content (see the generation-settings sketch after this list).
- Utilize Regular Expressions: If the model still returns text with code, you can post-process the output using regular expressions to extract code blocks enclosed in triple backticks (```). This is an effective fallback to ensure you only get the code (see the regex sketch after this list).
- Review and Adjust the `query_wrapper_prompt`: This parameter is crucial for formatting the input query. Ensure it is set up to clearly communicate to the model that it needs to focus on extracting code from the provided text.
- Optimize the `context_window`: The size of the context window can influence the model's performance. Set it to balance providing enough context for accurate completions against overwhelming the model with too much information.
- Explore Similar Issues: Look into similar issues in the repository or discussions around code extraction with LLMs for additional insights or solutions that have worked for others.
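On the `generate_kwargs` point, a minimal sketch of more deterministic sampling settings for the constructor above; the specific values (temperature 0.1, etc.) are illustrative assumptions, not tuned recommendations.

# Sketch: swap the sampling settings in the HuggingFaceLLM constructor above
# for something more deterministic. Values are illustrative, not tuned.
deterministic_generate_kwargs = {
    "temperature": 0.1,   # much lower than 0.7, so the model drifts less into surrounding prose
    "do_sample": True,    # keep sampling, but tightly constrained
    "top_p": 0.9,
    "top_k": 50,
}
# e.g. HuggingFaceLLM(..., generate_kwargs=deterministic_generate_kwargs, ...)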
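For the regular-expression fallback, a sketch along these lines could be applied to the completion text; the pattern and the `<NONE>` handling are assumptions based on the prompt in the question, not part of any LlamaIndex API.

import re

def extract_code_blocks(completion_text: str) -> str:
    """Pull code fenced in triple backticks out of an LLM completion.

    Returns an empty string if the model answered <NONE> or produced no fences.
    """
    if "<NONE>" in completion_text:
        return ""
    # Match ```...``` blocks, optionally preceded by a language tag, across newlines.
    blocks = re.findall(r"```(?:[\w+-]*\n)?(.*?)```", completion_text, re.DOTALL)
    return "\n\n".join(block.strip() for block in blocks)

# Usage with the completion from the question:
# code_only = extract_code_blocks(true_res.text)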
By focusing on these areas, you should be able to enhance the model's ability to extract just the code from the provided text, improving the relevance and accuracy of the outputs. If you encounter further issues or need more detailed guidance, feel free to ask.
Sources
- [Help] Providing the full/better scope to the LLM
- llama-index-core/llama_index/core/output_parsers/utils.py
- llama-index-legacy/llama_index/legacy/llms/huggingface.py
- llama-index-integrations/llms/llama-index-llms-bedrock/llama_index/llms/bedrock/llama_utils.py
- llama-index-integrations/llms/llama-index-llms-llama-cpp/llama_index/llms/llama_cpp/llama_utils.py
- llama-index-integrations/llms/llama-index-llms-huggingface/llama_index/llms/huggingface/base.py
- llama-index-integrations/llms/llama-index-llms-huggingface/tests/test_huggingface.py
@JINO-ROHIT this has nothing to do with llama-index really, and everything to do with the LLM and what you are asking it to do.
Either
- do some iterations and prompt engineer a bit
- find a different LLM (maybe codellama or something)
- fine-tune a model
Also, I would recommend looking into using Ollama. The setup is 1000x easier (no need to worry about prompt templates).
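If you go the Ollama route, the LlamaIndex integration is a thin wrapper. A minimal sketch, assuming the `llama-index-llms-ollama` package is installed, a local Ollama server is running, and a code-oriented model such as codellama has already been pulled:

# Sketch: run `ollama pull codellama` first against a local Ollama server.
from llama_index.llms.ollama import Ollama

llm = Ollama(model="codellama", request_timeout=120.0)
response = llm.complete("Extract only the code from the following text ...")
print(response.text)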