
How to instruct the model to return proper key-value pairs as JSON, without any other text

Dineshkumar-Anandan-ZS0367 opened this issue 9 months ago · 10 comments

I need to get JSON results from a paragraph that contains key-value pairs, but the Llama 3 instruct model returns the JSON along with some unwanted text. How can I get a clean answer from the Llama 3 model?

Or is there any other option in code, or a parameter, available to get that result?

If you specify the `format` parameter and set it to `"json"`, you will get your desired results.
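If you are calling the model through Ollama's REST API, that `format` request parameter looks like this. A minimal sketch, assuming a local Ollama server on the default port; the model name and prompt are placeholders:

```python
import json
import requests

# POST to a local Ollama server; "format": "json" constrains the output to valid JSON.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Extract the key-value pairs from this text as a JSON object: "
                  "Invoice INV-001, total 250 USD, due 2024-05-01.",
        "format": "json",
        "stream": False,
    },
    timeout=120,
)
data = json.loads(response.json()["response"])  # "response" holds the generated text
print(data)
```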

aqib-mirza · Apr 26 '24 15:04

For the Llama 3 8B Instruct model, how do I use this `format` parameter? Can you share an example or related prompt documentation?

Here is an example:

```python
import torch
import transformers

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.float16},
    device="cuda",
    token="HF-Token",  # placeholder for your Hugging Face access token
)

messages = [
    {
        "role": "system",
        "content": "You are a pirate chatbot who always responds in pirate speak! "
                   "and return every answer in JSON format",
    },
    {"role": "user", "content": "Who are you?"},
]

prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    format="JSON",  # note: not a standard argument; extra kwargs are only forwarded to the chat template
)

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][len(prompt):])
```

aqib-mirza · Apr 27 '24 18:04

Thanks a ton sir! I will check this.

With the same prompt and the same OCR text from an image, the LLM gives different results on each request. How can I keep the results consistent?

Is there any option for this? I understand this is an LLM.

Can you suggest some prompt ideas for extracting key-value pairs from a paragraph?

I'm getting the same result as before in spite of using:

```python
prompt = pipeline.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, format="JSON"
)
```

Having the same problem. Any update on this? Or any prompt hint?

LDelPinoNT · Jul 22 '24 15:07

> Having the same problem. Any update on this? Or any prompt hint?

You need to explicitly spell out your JSON structure in the prompt. It's the only way to get the expected JSON format. If you still get any other tokens in the output, add post-processing logic to your code (see the sketch below).
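A minimal sketch of both suggestions; the schema, field names, and the `extract_json` helper are hypothetical, not part of any library:

```python
import json
import re

# 1) Spell out the exact JSON structure the model must return.
system_prompt = (
    "Extract key-value pairs from the user's text. "
    "Respond with ONLY a JSON object of this exact shape, no other text:\n"
    '{"invoice_number": "", "total": "", "due_date": ""}'
)

# 2) Post-process: pull the first {...} block out of the raw output,
#    in case the model wrapped it in extra tokens.
def extract_json(raw_output: str) -> dict:
    match = re.search(r"\{.*\}", raw_output, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

# e.g. extract_json('Sure, here it is: {"invoice_number": "INV-001"} Hope that helps!')
# -> {'invoice_number': 'INV-001'}
```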

> With the same prompt and the same OCR text from an image, the LLM gives different results on each request. How can I keep the results consistent?
>
> Is there any option for this? I understand this is an LLM.
>
> Can you suggest some prompt ideas for extracting key-value pairs from a paragraph?

You can try lowering the temperature hyperparameter, @Dineshkumar-Anandan-ZS0367; see the sketch below.
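For example, reusing `pipeline`, `prompt`, and `terminators` from the earlier example (a sketch; greedy decoding trades diversity for repeatability):

```python
# More deterministic decoding with the same transformers pipeline.
outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=False,  # greedy decoding: same input -> same output
    # or keep sampling but make it much less random:
    # do_sample=True, temperature=0.1, top_p=0.9,
)
print(outputs[0]["generated_text"][len(prompt):])
```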

YanJiaHuan · Jul 30 '24 08:07

Thanks a lot for the response, William.