lmql
Quotes aren't always parsed properly
For example:
import lmql

@lmql.query()
def quote():
    '''lmql
    "\"[VAL]\""
    return VAL
    '''
Fails with the error:
File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\IPython\core\interactiveshell.py:3553 in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
Cell In[16], line 1
    @lmql.query()
File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\api\queries.py:108 in wrapper
    return query(fct, input_variables=input_variables, is_async=is_async, calling_frame=calling_frame, **extra_args)
File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\api\queries.py:130 in query
    module = load(temp_lmql_file, output_writer=silent)
File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\api\queries.py:22 in load
    module = compiler.compile(filepath)
File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\language\compiler.py:924 in compile
    transformations.transform(q)
File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\language\compiler.py:789 in transform
    t = T(query).transform()
File f:\workspace\lmql-pydantic\.venv\Lib\site-packages\lmql\language\compiler.py:346 in transform
    self.query.prompt = [self.visit(p) for p in self.query.prompt]
...
File <unknown>:1
f""""[VAL]""""
^
SyntaxError: unterminated string literal (detected at line 1)
There is an ugly workaround for now:
@lmql.query()
def quote():
    '''lmql
    q = "\""
    "\"[VAL]{q}"
    return VAL
    '''
Thanks for reporting. Marking this as a good first issue.
The fix is likely somewhere close to https://github.com/eth-sri/lmql/blob/main/src/lmql/language/compiler.py#L428, where we compile LLM query strings into multi-line strings in the compiled representation of the program.
I did some investigation and here are my findings:
import lmql

@lmql.query()
def quote_at_begin():
    '''lmql
    "\"x\"=123"
    '''
# This case passes without any errors.

@lmql.query()
def quote_at_end():
    '''lmql
    "123=\"x\""
    '''
# This triggers a SyntaxError:
# SyntaxError: unterminated string literal (detected at line 1)
So only quote_at_end causes an error.
Could this be a bug in Python's ast parser? ast.parse(f""""x"==123""") works, but ast.parse(f"""123=="x"""") does not.
A temporary workaround is to check whether the string ends with a quote, add a space after it so that the parser accepts it, and then remove the space immediately after parsing.
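The padding idea above could look roughly like the following. This is only a sketch: parse_qstring_with_padding is a hypothetical helper name, not something in the lmql code base, and it assumes the query string is wrapped in an f"""...""" literal as the traceback suggests.

```python
import ast

def parse_qstring_with_padding(qstring: str) -> ast.JoinedStr:
    """Hypothetical helper: parse qstring as a triple-quoted f-string.

    If qstring ends with a double quote, a space is appended before the
    closing delimiter so the scanner does not see four quotes in a row.
    """
    padded = qstring.endswith('"')
    source = 'f"""' + qstring + (" " if padded else "") + '"""'
    node = ast.parse(source, mode="eval").body
    if padded:
        # The padding space is the final character of the last constant part,
        # so strip it again to recover the original text.
        last = node.values[-1]
        last.value = last.value[:-1]
    return node

node = parse_qstring_with_padding('123=="x"')
print(node.values[-1].value)
```

Whether patching the AST afterwards is acceptable inside the compiler is a separate question; the sketch only shows that the padding can be undone.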
I think the behavior of ast.parse is actually correct here. In Python, """"a""" is valid, whereas """a"""" is not. After reading the opening """, the parser's scanner looks for the next """ and terminates the string literal there. An extra " after that point is therefore read as the start of a new, unterminated string literal.
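The asymmetry can be checked directly with ast.parse. This is a small sketch using the two strings from the investigation above, passed as source text:

```python
import ast

# Leading quote: the scanner consumes the opening """ and treats the fourth
# quote as ordinary content, so this parses.
leading = 'f""""x"==123"""'
ast.parse(leading)
leading_value = eval(leading)  # the string content, including its quotes

# Trailing quote: the scanner closes the string at the first """ it finds,
# and the leftover " opens a new string that is never terminated.
trailing = 'f"""123=="x""""'
try:
    ast.parse(trailing)
    trailing_ok = True
except SyntaxError:
    trailing_ok = False

print(repr(leading_value), trailing_ok)
```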
To fix this issue, I think it should be enough to just add:
if compiled_qstring.endswith("\""):
    compiled_qstring = compiled_qstring[:-1] + "\\\""
This prevents a """" (four quotes) sequence in compiled_qstring, because the last quote in a qstring will always be escaped. It may need more testing to check whether it covers all cases correctly, though.
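To try the idea outside the compiler, here is a minimal standalone sketch. escape_trailing_quote and wrap are hypothetical names introduced for illustration, and wrapping with f"""...""" is an assumption based on the traceback above rather than a copy of the actual compiler code.

```python
import ast

def escape_trailing_quote(compiled_qstring: str) -> str:
    # Hypothetical helper mirroring the proposed fix: escape a final double
    # quote so it cannot merge with the closing """ delimiter.
    if compiled_qstring.endswith('"'):
        return compiled_qstring[:-1] + '\\"'
    return compiled_qstring

def wrap(qstring: str) -> str:
    # Assumed shape of the generated code: a triple-quoted f-string.
    return 'f"""' + qstring + '"""'

source = wrap(escape_trailing_quote('"[VAL]"'))
ast.parse(source)        # no SyntaxError with the trailing quote escaped
result = eval(source)    # the escape disappears when the literal is evaluated
print(result)
```

One case this sketch does not handle: if the string reaches this point with its final quote already escaped, the replacement turns \" into \\", which leaves the quote unescaped again. Whether that can happen inside the compiler is not established here, so it is worth including in the tests.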