[bug] Custom LLM_Callable for ProvenanceV1 causes ValueError: invalid syntax during validator parse_token
Hello! I'm trying out the ProvenanceV1 validator using a custom llm_callable that conforms to the required signature of input string -> output string. Under the hood it calls Google's Vertex AI text generation endpoint, which works fine for me on its own. The following is my setup:
vertexai_llm = ...

def predict_function(prompt: str) -> str:
    response = vertexai_llm.text_generative_model.predict(prompt)
    return response.text

guard_1 = Guard.from_string(
    validators=[
        ProvenanceV1(
            validation_method="sentence",
            llm_callable=predict_function,
            top_k=3,
            max_tokens=2,
            on_fail="fix",
        )
    ],
    description="testmeout",
)
When I run this I get an error that I'm having trouble breaking apart into what's actually happening and what I need to do to remedy it. It seems like somewhere around the reask handling the library is running an eval on the function?
---------------------------------------------------------------------------
SyntaxError Traceback (most recent call last)
File /usr/local/lib/python3.10/site-packages/guardrails/schema.py:144, in FormatAttr.parse_token(cls, token)
142 try:
143 # Evaluate the Python expression.
--> 144 t = eval(t)
145 except (ValueError, SyntaxError, NameError) as e:
SyntaxError: invalid syntax (<string>, line 1)
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
Cell In[41], line 6
2 response = vertexai_llm.text_generative_model.predict(prompt)
3 return response.text
----> 6 guard_1 = Guard.from_string(
7 validators=[
8 ProvenanceV1(
9 validation_method="sentence", # can be "sentence" or "full"
10 llm_callable=predict_function, # as explained above
11 # llm_callable="gpt-3.5-turbo", # as explained above
12 top_k=3, # number of chunks to retrieve
13 max_tokens=500,
14 on_fail="fix",
15 )
16 ],
17 description="testmeout",
18 )
File /usr/local/lib/python3.10/site-packages/guardrails/guard.py:203, in Guard.from_string(cls, validators, description, prompt, instructions, reask_prompt, reask_instructions, num_reasks)
180 @classmethod
181 def from_string(
182 cls,
(...)
189 num_reasks: int = None,
190 ) -> "Guard":
191 """Create a Guard instance for a string response with prompt,
192 instructions, and validations.
193
(...)
201 num_reasks (int, optional): The max times to re-ask the LLM for invalid output.
202 """ # noqa
--> 203 rail = Rail.from_string_validators(
204 validators=validators,
205 description=description,
206 prompt=prompt,
207 instructions=instructions,
208 reask_prompt=reask_prompt,
209 reask_instructions=reask_instructions,
210 )
211 return cls(rail, num_reasks=num_reasks)
File /usr/local/lib/python3.10/site-packages/guardrails/rail.py:145, in Rail.from_string_validators(cls, validators, description, prompt, instructions, reask_prompt, reask_instructions)
127 @classmethod
128 def from_string_validators(
129 cls,
(...)
135 reask_instructions: Optional[str] = None,
136 ):
137 xml = generate_xml_code(
138 prompt=prompt,
139 instructions=instructions,
(...)
143 description=description,
144 )
--> 145 return cls.from_xml(xml)
File /usr/local/lib/python3.10/site-packages/guardrails/rail.py:99, in Rail.from_xml(cls, xml)
97 if reask_instructions is not None:
98 reask_instructions = reask_instructions.text
---> 99 output_schema = cls.load_output_schema(
100 raw_output_schema,
101 reask_prompt=reask_prompt,
102 reask_instructions=reask_instructions,
103 )
105 # Parse instructions for the LLM. These are optional but if given,
106 # LLMs can use them to improve their output. Commonly these are
107 # prepended to the prompt.
108 instructions = xml.find("instructions")
File /usr/local/lib/python3.10/site-packages/guardrails/rail.py:177, in Rail.load_output_schema(root, reask_prompt, reask_instructions)
175 # If root contains a `type="string"` attribute, then it's a StringSchema
176 if "type" in root.attrib and root.attrib["type"] == "string":
--> 177 return StringSchema(
178 root,
179 reask_prompt_template=reask_prompt,
180 reask_instructions_template=reask_instructions,
181 )
182 return JsonSchema(
183 root,
184 reask_prompt_template=reask_prompt,
185 reask_instructions_template=reask_instructions,
186 )
File /usr/local/lib/python3.10/site-packages/guardrails/schema.py:785, in StringSchema.__init__(self, root, reask_prompt_template, reask_instructions_template)
778 def __init__(
779 self,
780 root: ET._Element,
781 reask_prompt_template: Optional[str] = None,
782 reask_instructions_template: Optional[str] = None,
783 ) -> None:
784 self.string_key = "string"
--> 785 super().__init__(root)
787 # Setup reask templates
788 self._reask_prompt_template = reask_prompt_template
File /usr/local/lib/python3.10/site-packages/guardrails/schema.py:293, in Schema.__init__(self, root, schema, reask_prompt_template, reask_instructions_template)
291 self.root = root
292 if root is not None:
--> 293 self.setup_schema(root)
295 # Setup reask templates
296 self.check_valid_reask_prompt(reask_prompt_template)
File /usr/local/lib/python3.10/site-packages/guardrails/schema.py:806, in StringSchema.setup_schema(self, root)
804 # make root tag into a string tag
805 root_string = ET.Element("string", root.attrib)
--> 806 self[self.string_key] = String.from_xml(root_string)
File /usr/local/lib/python3.10/site-packages/guardrails/datatypes.py:145, in DataType.from_xml(cls, element, strict)
141 # TODO: don't want to pass strict through to DataType,
142 # but need to pass it to FormatAttr.from_element
143 # how to handle this?
144 format_attr = FormatAttr.from_element(element)
--> 145 format_attr.get_validators(strict)
147 data_type = cls({}, format_attr, element)
148 data_type.set_children(element)
File /usr/local/lib/python3.10/site-packages/guardrails/schema.py:215, in FormatAttr.get_validators(self, strict)
213 _validators = []
214 _unregistered_validators = []
--> 215 parsed = self.parse().items()
216 for validator_name, args in parsed:
217 # Check if the validator is registered for this element.
218 # The validators in `format` that are not registered for this element
219 # will be ignored (with an error or warning, depending on the value of
220 # `strict`), and the registered validators will be returned.
221 if validator_name not in types_to_validators[self.element.tag]:
File /usr/local/lib/python3.10/site-packages/guardrails/schema.py:169, in FormatAttr.parse(self)
166 validators = {}
167 for token in self.tokens:
168 # Parse the token into a validator name and a list of parameters.
--> 169 validator_name, args = self.parse_token(token)
170 validators[validator_name] = args
172 return validators
File /usr/local/lib/python3.10/site-packages/guardrails/schema.py:146, in FormatAttr.parse_token(cls, token)
144 t = eval(t)
145 except (ValueError, SyntaxError, NameError) as e:
--> 146 raise ValueError(
147 f"Python expression `{t}` is not valid, "
148 f"and raised an error: {e}."
149 )
150 args.append(t)
152 return validator.strip(), args
ValueError: Python expression `<function predict_function at 0xffff4c7255a0>` is not valid, and raised an error: invalid syntax (<string>, line 1).
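If I'm reading the traceback right, FormatAttr.parse_token ends up calling eval() on the string form of each validator argument. That's fine for literals like 3 or "sentence", but the repr of a function isn't valid Python, so eval blows up. This tiny snippet (just my guess at what's happening, reproduced independently of guardrails) triggers the same SyntaxError:

def predict_function(prompt: str) -> str:
    return prompt

# The validator arg gets stringified along the way, so what reaches eval()
# looks like "<function predict_function at 0xffff4c7255a0>".
token = str(predict_function)

try:
    eval(token)
except SyntaxError as err:
    print(f"invalid syntax: {err}")  # matches the error in the traceback above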
I'm running guardrails-ai==0.2.4 and was basing my attempt off of https://docs.guardrailsai.com/examples/provenance/#provenance-v1.
The ProvenanceV1 validation works when I supply gpt-3.5-turbo as the llm_callable, as is the default. I can't immediately see what's different between the two guards I'm building - the openai_callable function defined in set_callable looks pretty much identical to my predict_function. I also can't find any examples of a custom callable.
Just to clarify: while the OpenAI string model value works for me during testing, I can't use it in my application and need to use Google's model instead!
Thanks for any help you can provide! I'll continue digging through the docs and code to figure out what's happening with the validator parsing.
Hi! Your assessment is correct - this is a bug in how the feature is documented vs. how it works. With the current sequencing of guard runs, validator params are serialized and deserialized before they're executed, and that round trip is not lossless for callables. As a result, passing a callable via validator parameters isn't supported, even though the ProvenanceV1 validator accepts a callable with that signature.
The fix here would be to rewrite the ProvenanceV1 validator to accept the llm_callable as metadata instead of as a parameter.
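Very roughly, once that change lands, usage should look something like the sketch below - the metadata key name is not final, so treat this as an illustration rather than the eventual API:

from guardrails import Guard
from guardrails.validators import ProvenanceV1

guard = Guard.from_string(
    validators=[
        ProvenanceV1(
            validation_method="sentence",
            top_k=3,
            max_tokens=2,
            on_fail="fix",
        )
    ],
    description="testmeout",
)

# The callable is supplied at call time via metadata, so it never has to
# survive the param serialization step that breaks today.
validated_output = guard.parse(
    llm_output,  # the text you want to validate
    metadata={"llm_callable": predict_function},  # key name is a placeholder;
    # predict_function is the Vertex AI wrapper from the original post
)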
This hasn't been prioritized yet, but I'll see if we can get it released in 0.2.5 (sometime next week). I'll return with a more concrete ETA once we have one. I'll also tag this issue as a "Good first issue" - I think it's something someone can pick up with limited context.
Awesome, thanks @zsimjee!
This'll be solved as part of our move away from using XML serialization internally, within the next two weeks.