guardrails `RegexMatch` not reporting failures [bug]

Describe the bug

When a validation fails with RegexMatch it is listed as successfully validating.

To Reproduce

import openai
from pydantic import BaseModel, Field
from guardrails.hub import RegexMatch
from guardrails import Guard

prompt = """
    Generate a fake user.
    
    ${gr.complete_json_suffix_v2}
"""

class User(BaseModel):
    name: str = Field(description="Name that is NOT potato", validators=[
        RegexMatch(regex='potato', match_type='search', on_fail='reask')
    ])

guard = Guard.from_pydantic(output_class=User, prompt=prompt)

result = guard(
    llm_api=openai.chat.completions.create,
    num_reasks=0
)

The LLM gives us "John Doe" as the name, which fails the potato check. Despite this, it's listed as passing validation, and the "fixed" value is provided.

ValidationOutcome(
    raw_llm_output='{"name":"John Doe"}', 
    validated_output={'name': 'uhdazcppotatouhdazcp'}, 
    reask=None, 
    validation_passed=True,
    error=None
)

Expected behavior

If you replace the RegexMatch validator with ValidChoices(choices=['potato'], on_fail='reask'), you get the following (Alice fails the potato check).

ValidationOutcome(
    raw_llm_output='{"name":"Alice"}',
    validated_output=None,
    reask=None,
    validation_passed=False,
    error=None
)

Library version:

0.4.3

Apr 11 '24 09:04 jsoma

Hi! Can you please try with v 0.4.2 and see if that yields the same issue? I did mess with related logic in 0.4.3 which came out on Tuesday

Apr 11 '24 15:04 zsimjee

I believe this is applying fix values even when the on fail action is reask

Apr 11 '24 15:04 zsimjee

Nope, no luck even back to 0.4.0 (although my env is a little wonky and guardrails doesn't have __version__, so there's like a 3% chance it's just my setup not downgrading successfully)

Apr 11 '24 19:04 jsoma

@zsimjee and @jsoma Wanted to add some context here:

This is the expected behaviour and has been since 0.3.x and likely even before. After all validation and reasks are complete, if there are any reasks left where the ValidationResult has a fix_value, that fix_value is applied. The key difference between RegexMatch and ValidChoices is that ValidChoices does not include a fix_value in its ValidationResult, hence a failure instead of a substituted result.

I agree that this behaviour may not be intuitive and should be examined for revision, but the library is acting as it is intended to.
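To make the rule above concrete, here is a minimal toy model of it. It does not use guardrails at all: `ToyFailResult` and `resolve_after_reasks` are hypothetical names invented for this sketch, and the real library's internals may differ. It only illustrates why a failure that carries a fix_value ends up reported as passing while one without a fix_value does not.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


# Hypothetical stand-in for a failed validation result.
# This is NOT the guardrails class, only a toy with the same idea.
@dataclass
class ToyFailResult:
    error_message: str
    fix_value: Optional[Any] = None


def resolve_after_reasks(
    value: Any, failure: Optional[ToyFailResult]
) -> Tuple[Any, bool]:
    """Return (validated_output, validation_passed) once no reasks remain."""
    if failure is None:
        # Validation succeeded, so the value is kept unchanged.
        return value, True
    if failure.fix_value is not None:
        # A fix_value exists, so it is substituted and the run counts as passed.
        return failure.fix_value, True
    # No fix_value available, so the failure surfaces to the caller.
    return None, False


# Mirrors the RegexMatch case: the failure carries a generated fix value.
with_fix = ToyFailResult("must match /potato/", fix_value="xxpotatoxx")
# Mirrors the ValidChoices case: the failure carries no fix value.
without_fix = ToyFailResult("must be one of ['potato']")

print(resolve_after_reasks("John Doe", with_fix))   # ('xxpotatoxx', True)
print(resolve_after_reasks("Alice", without_fix))   # (None, False)
```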

Edit: typo

Apr 16 '24 13:04 CalebCourier

Ah, got it. Guess it gives me a good opportunity to get a little more comfortable with custom validators.

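import re
from typing import Callable, Dict, Optional

# Import path assumed for guardrails 0.4.x; adjust if your version differs.
from guardrails.validators import (
    FailResult,
    PassResult,
    ValidationResult,
    Validator,
    register_validator,
)


@register_validator(name="regex-validator", data_type="string")
class RegexValidator(Validator):
    def __init__(self, regex: str, on_fail: Optional[Callable] = None):
        super().__init__(on_fail=on_fail, regex=regex)
        self._regex = regex

    def validate(self, value: str, metadata: Dict) -> ValidationResult:
        regex = re.compile(self._regex)
        # No fix_value is set on failure, so nothing can be substituted later.
        if not regex.fullmatch(value):
            return FailResult(
                error_message=f"Result must match regular expression /{self._regex}/",
            )
        return PassResult()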
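One thing worth noting for anyone copying this: the validator above uses `fullmatch`, while the original report used match_type='search', and the two accept different strings. The sketch below shows the difference using only the standard `re` module; the `matches` helper is a hypothetical name for illustration and is not part of guardrails.

```python
import re

pattern = re.compile("potato")


def matches(value: str, match_type: str) -> bool:
    """Check a value against the pattern using the chosen matching mode."""
    if match_type == "search":
        # True if the pattern appears anywhere in the value.
        return pattern.search(value) is not None
    if match_type == "fullmatch":
        # True only if the entire value matches the pattern.
        return pattern.fullmatch(value) is not None
    raise ValueError(f"unknown match_type: {match_type}")


for candidate in ["potato", "sweet potato", "John Doe"]:
    print(candidate, matches(candidate, "search"), matches(candidate, "fullmatch"))
# potato True True
# sweet potato True False
# John Doe False False
```

If the looser behaviour is what you want, swapping `fullmatch` for `search` in the validator should be the only change needed.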

Thanks for the update!

Apr 17 '24 01:04 jsoma

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 14 days.

Aug 11 '24 01:08 github-actions[bot]

This issue was closed because it has been stalled for 14 days with no activity.

Aug 25 '24 03:08 github-actions[bot]
