
How to instruct the model to return proper key-value pairs as JSON, without any other text

Dineshkumar-Anandan-ZS0367 opened this issue 9 months ago · 10 comments

I need to get JSON results from a paragraph that contains key-value pairs, but the Llama 3 instruct model returns the JSON along with some unwanted text. How can I get a clean answer from the Llama 3 model?

Or is there any other option in code, or a parameter, available to get that result?

If you specify the `format` parameter and set it to `"json"`, you will get your desired results.
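If you are calling the model through Ollama's REST API, that `format` request parameter looks like this. A minimal sketch, assuming a local Ollama server on the default port; the model name and prompt are placeholders:

```python
import json
import requests

# POST to a local Ollama server; "format": "json" constrains the output to valid JSON.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Extract the key-value pairs from this text as a JSON object: "
                  "Invoice INV-001, total 250 USD, due 2024-05-01.",
        "format": "json",
        "stream": False,
    },
    timeout=120,
)
data = json.loads(response.json()["response"])  # "response" holds the generated text
print(data)
```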

aqib-mirza · Apr 26 '24 15:04

For the Llama 3 8B Instruct model, how do I use this `format` parameter? Can you share an example or related prompt documentation?

Here is an example:

```python
import torch
import transformers

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.float16},
    device="cuda",
    token="HF-Token",  # placeholder for your Hugging Face access token
)

messages = [
    {
        "role": "system",
        "content": "You are a pirate chatbot who always responds in pirate speak! "
                   "and return every answer in JSON format",
    },
    {"role": "user", "content": "Who are you?"},
]

prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    format="JSON",  # note: not a standard argument; extra kwargs are only forwarded to the chat template
)

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][len(prompt):])
```

aqib-mirza · Apr 27 '24 18:04

Thanks a ton sir! I will check this.

With the same prompt and the same OCR text from an image, the LLM gives different results on each request. How can I keep the results consistent?

Is there any option for this? I understand this is an LLM.

Can you suggest some prompt ideas for extracting key-value pairs from a paragraph?

I'm getting the same result as before in spite of using:

```python
prompt = pipeline.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, format="JSON"
)
```

Having the same problem. Any update on this? Or any prompt hint?

LDelPinoNT · Jul 22 '24 15:07

> Having the same problem. Any update on this? Or any prompt hint?

You need to explicitly spell out your JSON structure in the prompt. It's the only way to get the expected JSON format. If you still get any other tokens in the output, add post-processing logic to your code (see the sketch below).
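A minimal sketch of both suggestions; the schema, field names, and the `extract_json` helper are hypothetical, not part of any library:

```python
import json
import re

# 1) Spell out the exact JSON structure the model must return.
system_prompt = (
    "Extract key-value pairs from the user's text. "
    "Respond with ONLY a JSON object of this exact shape, no other text:\n"
    '{"invoice_number": "", "total": "", "due_date": ""}'
)

# 2) Post-process: pull the first {...} block out of the raw output,
#    in case the model wrapped it in extra tokens.
def extract_json(raw_output: str) -> dict:
    match = re.search(r"\{.*\}", raw_output, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

# e.g. extract_json('Sure, here it is: {"invoice_number": "INV-001"} Hope that helps!')
# -> {'invoice_number': 'INV-001'}
```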

> With the same prompt and the same OCR text from an image, the LLM gives different results on each request. How can I keep the results consistent?
>
> Is there any option for this? I understand this is an LLM.
>
> Can you suggest some prompt ideas for extracting key-value pairs from a paragraph?

You can try lowering the temperature hyperparameter, @Dineshkumar-Anandan-ZS0367; see the sketch below.
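For example, reusing `pipeline`, `prompt`, and `terminators` from the earlier example (a sketch; greedy decoding trades diversity for repeatability):

```python
# More deterministic decoding with the same transformers pipeline.
outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=False,  # greedy decoding: same input -> same output
    # or keep sampling but make it much less random:
    # do_sample=True, temperature=0.1, top_p=0.9,
)
print(outputs[0]["generated_text"][len(prompt):])
```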

YanJiaHuan · Jul 30 '24 08:07

Thanks a lot for the response, William.