
Regex doesn't support a specified number of characters {n}

Open msharp9 opened this issue 1 year ago • 2 comments

The bug: The regex argument of gen doesn't seem to support basic regex syntax such as the {n} repetition quantifier.

To Reproduce:

from guidance import models, gen

# Load a HuggingFace Transformers model
lm = models.Transformers("gpt2")

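# Raw string so the backslashes reach the regex engine unchanged
lm = lm + 'Generate a phone number: ' + gen(regex=r'\d{3}-\d{3}-\d{4}')
print(lm)
# Observed output (the {n} quantifiers are emitted literally):
#  Generate a phone number: 1{3}-8{3}-8{4}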

System info:

  • OS: Ubuntu
  • Guidance Version (guidance.__version__): 0.1.5

msharp9 · Dec 10 '23 03:12

This is a limitation of the pyformlang library we are using right now to convert from regex patterns into guidance grammars. We don't plan to keep this dependency in the long run, so I will leave this issue open to flag that {n} support should be in the next iteration :)

slundberg · Dec 11 '23 23:12

Hi! I added {m} and {n,m} in the latest version of Pyformlang. Do not hesitate to open issues directly on Pyformlang in the future :)

Aunsiels · Mar 18 '24 13:03
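
For setups still on a Pyformlang release without {m} and {n,m} support, one possible workaround is to expand counted quantifiers by hand before passing the pattern to gen. The sketch below is only an illustration: expand_counted_quantifiers is a hypothetical helper, not part of guidance or Pyformlang, and it assumes simple atoms only (an escape such as \d, a bracketed class, or a single plain character). Groups and open-ended forms such as {n,} are left untouched.

```python
import re

# Hypothetical helper, not part of guidance or pyformlang.
# Matches one simple atom (an escape such as \d, a bracketed class with no
# literal "]" inside, or a single plain character) followed by {n} or {n,m}.
_COUNTED = re.compile(r'(\\.|\[[^\]]*\]|[^\\\[\](){}|*+?])\{(\d+)(?:,(\d+))?\}')


def expand_counted_quantifiers(pattern: str) -> str:
    """Rewrite atom{n} as n copies, and atom{n,m} as n copies plus (m - n) optional copies."""
    def _expand(match: re.Match) -> str:
        atom = match.group(1)
        low = int(match.group(2))
        high = match.group(3)
        required = atom * low
        if high is None:
            return required
        # Each extra repetition up to m becomes an optional copy of the atom.
        return required + (atom + '?') * (int(high) - low)

    return _COUNTED.sub(_expand, pattern)


if __name__ == "__main__":
    print(expand_counted_quantifiers(r'\d{3}-\d{3}-\d{4}'))
```

The expanded pattern could then be passed as gen(regex=...) in place of the original. Whether the affected guidance version handles the expanded form correctly is an assumption and has not been confirmed in this thread.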