bug: Cannot sanitize user input

Open vrige opened this issue 1 year ago • 1 comment

Did you check docs and existing issues?

  • [x] I have read all the NeMo-Guardrails docs
  • [x] I have updated the package to the latest version before submitting this issue
  • [ ] (optional) I have used the develop branch
  • [x] I have searched the existing issues of NeMo-Guardrails

Python version (python --version)

Python 3.10.15

Operating system/version

macOS 14

NeMo-Guardrails version (if you must use a specific version and not the latest)

No response

Describe the bug

Hi, thanks for the amazing work you are doing.

I am having some problems related to input sanitization using a custom action (sanitizeInputAction). The action works just fine when looking at the logs, but the LLM receives the initial input instead of the sanitized one.

I have followed the steps from the documentation:

  • Input Rails Only

  • I also found the following test for colang.v1 here, but it doesn't seem to be helpful for colang.v2. I haven't found a corresponding case in v2 yet.

An example with fake data:

user: "Hi! Can you repeat the following name: Tom Jerry. Thanks"
llm: "Hi! Can you repeat the following name: <PERSON>. Thanks
Sure, the name is Tom Jerry. Did you want me to repeat it again?"

I am open to any suggestions.

Is there something I missed? Do I need to add more code?

Thanks for your help!

Steps To Reproduce

colang_v2 = """
    import core
    import llm

    flow main
        $ref_act = await analyse_input
        $output = ... "'{$ref_act}'"

    flow analyse_input -> $result
        user said something as $ref_use
        $result = await sanitizeInputAction(inputs=$ref_use.transcript)
        return $result
    """

config_v2_o = """
    colang_version: 2.x
    rails:
        input:
            flows:
                - main
    models:
      - type: main
        engine: openai
        model: gpt-3.5-turbo-instruct
    """

Expected Behavior

I expect:

  • the flow to be triggered for any user input text (working)
  • the input text to be sanitized (working)
  • the LLM to generate its reply using only the sanitized input text (not working)
  • the reply not to repeat the input text (not working)

Actual Behavior

What actually happens:

  • the flow is triggered for any user input text
  • the input text is sanitized
  • the LLM still receives the original input, including the personal data
  • the reply repeats the sanitized input text
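
One way to confirm the third point is to compare the text that actually reaches the LLM against the original and sanitized inputs. The following is only a rough sketch using the standard library: `leaked_terms` is a hypothetical helper (not part of NeMo-Guardrails), and the three strings are assumed to be copied by hand from the logs.

```python
def leaked_terms(original: str, sanitized: str, llm_input: str) -> list[str]:
    """List words that sanitization removed but that still appear in llm_input."""
    punctuation = ".,!?:;"
    original_words = {w.strip(punctuation) for w in original.split()}
    sanitized_words = {w.strip(punctuation) for w in sanitized.split()}
    # Words present before sanitization and gone afterwards are the masked ones.
    removed = original_words - sanitized_words
    return sorted(w for w in removed if w and w in llm_input)


original = "Hi! Can you repeat the following name: Tom Jerry. Thanks"
sanitized = "Hi! Can you repeat the following name: <PERSON>. Thanks"

# If the LLM was given the original text, the masked words show up as leaks.
print(leaked_terms(original, sanitized, llm_input=original))
# If the LLM was given the sanitized text, nothing is reported.
print(leaked_terms(original, sanitized, llm_input=sanitized))
```

A non-empty result for the logged LLM prompt would indicate that the sanitized value is computed but not the one being forwarded.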

vrige avatar Dec 02 '24 12:12 vrige

@vrige thanks for opening this issue. I cannot see your action's implementation, and I suspect the issue stems from there rather than from a bug in NeMo Guardrails, so I'd like you to try the following:

Please have a look at how similar functionality is implemented here; pay special attention to the return value of the mask_sensitive_data function.
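
To illustrate why the return value matters, here is a minimal, stdlib-only sketch. It is not the library's mask_sensitive_data implementation: `sanitize_input`, `build_prompt`, and the fixed `KNOWN_NAMES` list are hypothetical stand-ins for a real PII detector. The point is only that Python strings are immutable, so the masked text exists solely in the returned value, and whatever is sent to the LLM has to be built from that value.

```python
import re

# Hypothetical stand-in for a real PII detector: a fixed tuple of names to mask.
KNOWN_NAMES = ("Tom Jerry",)


def sanitize_input(text: str, names: tuple[str, ...] = KNOWN_NAMES) -> str:
    """Return a copy of text with every known name replaced by <PERSON>."""
    masked = text
    for name in names:
        masked = re.sub(re.escape(name), "<PERSON>", masked, flags=re.IGNORECASE)
    # The original string is left untouched; only the return value is masked.
    return masked


def build_prompt(user_text: str) -> str:
    """Build the LLM prompt from the sanitized text, never from user_text directly."""
    safe_text = sanitize_input(user_text)
    return f"User: {safe_text}\nAssistant:"


original = "Hi! Can you repeat the following name: Tom Jerry. Thanks"
prompt = build_prompt(original)
print(prompt)
```

If the action masks the text correctly but a later step still reads the original transcript, the symptom would match the one reported above.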

Regarding the Colang 2.0 implementation of such flows: the flows.co file shows how this should be integrated in Colang 2.0, and flows.v1.co shows the same for Colang 1.0. You need a flow like mask sensitive data on input.

Pouyanpi avatar Jan 06 '25 14:01 Pouyanpi