syncode
Issues with built-in Python, Java, and Go grammars
I am experiencing issues with the built-in grammars for Python, Java, and Go. The Python and Java grammars appear to be ignored: the model still emits Markdown prose around the code, which is not valid in either language. The Go grammar produces mostly gibberish (a run of backslashes). Below is a script to reproduce the issue, followed by example outputs for each grammar. I installed SynCode with: pip install git+https://github.com/uiuc-focal-lab/syncode.git. The installed syncode version is 0.1.
import torch
from syncode import SyncodeLogitsProcessor
from syncode import Grammar
from transformers import AutoModelForCausalLM, AutoTokenizer
device = 'cuda'
# model_name = "meta-llama/Llama-3.2-1B-Instruct"
model_name = "meta-llama/Llama-3.1-8B-Instruct"
cache_dir = None
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, cache_dir=cache_dir).eval().to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Initialize SynCode logits processor for the given grammar
# grammar_str = """ start: month " " day
# day: /[1-9]/ | /[1-2][0-9]/ | /3[0-1]/
# month: "January" | "February" | "March" | "April" | "May" | "June" | "July" | "August" | "September" | "October" | "November" | "December"
# """
grammar_str = "python"
# grammar_str = "go"
# grammar_str = "java"
grammar = Grammar(grammar_str)
syncode_logits_processor = SyncodeLogitsProcessor(grammar=grammar, tokenizer=tokenizer, parse_output_only=True)
prompt = f"Write a {grammar_str} function that prints 'hello world' in reverse."
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print("[PROMPT]", prompt, "\n")
syncode_logits_processor.reset(prompt)
inputs = tokenizer(prompt, return_tensors='pt').input_ids.to(device)
attention_mask = torch.ones_like(inputs)
output = model.generate(
    inputs,
    attention_mask=attention_mask,
    max_length=512,
    num_return_sequences=1,
    pad_token_id=tokenizer.eos_token_id,
    logits_processor=[syncode_logits_processor]
)
output_str = tokenizer.decode(output[0][len(inputs[0]):], skip_special_tokens=True)
print("[OUTPUT]", output_str)
Python
[PROMPT] <|begin_of_text|><|start_header_id|>system<|end_header_id|>
Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024
<|eot_id|><|start_header_id|>user<|end_header_id|>
Write a python function that prints 'hello world' in reverse.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
[OUTPUT] **Reversing 'Hello World' Function**
=====================================
Here is a simple Python function that prints 'hello world' in reverse:
```python
def print_reverse_hello_world():
    """
    Prints 'hello world' in reverse.
    """
    message = "hello world"
    reversed_message = message[::-1]
    print(reversed_message)
print_reverse_hello_world()
```
**
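To make "the grammar appears to be ignored" concrete, the output above can be checked with Python's standard-library parser. This is only a rough sketch: it assumes that anything CPython's ast.parse rejects should also be rejected by the built-in python grammar, which may not hold exactly since SynCode's grammar is not necessarily identical to CPython's. The helper name is_valid_python and the abbreviated reported_output string are mine, not part of SynCode.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as a Python module (hypothetical helper)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Abbreviated copy of the first lines of the [OUTPUT] shown above.
reported_output = (
    "**Reversing 'Hello World' Function**\n"
    "=====================================\n"
    "Here is a simple Python function that prints 'hello world' in reverse:\n"
)

# A grammar-constrained generation should never fail this check.
print("reported output parses:", is_valid_python(reported_output))
print("plain code parses:", is_valid_python("print('hello world'[::-1])"))
```

In the repro script, the same check could be applied to output_str after decoding.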
Java
[PROMPT] <|begin_of_text|><|start_header_id|>system<|end_header_id|>
Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024
<|eot_id|><|start_header_id|>user<|end_header_id|>
Write a java function that prints 'hello world' in reverse.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
[OUTPUT] interface
Here is a simple Java function that prints 'hello world' in reverse:
```java
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}
```
Explanation:
- The `System.out.println()` function is used to print the string "Hello World" to the console.
- The `public static void
Go
[PROMPT] <|begin_of_text|><|start_header_id|>system<|end_header_id|>
Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024
<|eot_id|><|start_header_id|>user<|end_header_id|>
Write a go function that prints 'hello world' in reverse.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
[OUTPUT] \
\
\
\
...