NeMo-Guardrails
bug: Cannot sanitize user input
Did you check docs and existing issues?
- [x] I have read all the NeMo-Guardrails docs
- [x] I have updated the package to the latest version before submitting this issue
- [ ] (optional) I have used the develop branch
- [x] I have searched the existing issues of NeMo-Guardrails
Python version (python --version)
Python 3.10.15
Operating system/version
macOS 14
NeMo-Guardrails version (if you must use a specific version and not the latest)
No response
Describe the bug
Hi, thanks for the amazing work you are doing.
I am having some problems related to input sanitization using a custom action (sanitizeInputAction). The action works just fine when looking at the logs, but the LLM receives the initial input instead of the sanitized one.
I have followed the steps from the documentation:
-
I also found the following test for colang.v1 here, but it doesn't seem to be helpful for colang.v2. I haven't found a corresponding case in v2 yet.
An example with fake data:
user: "Hi! Can you repeat the following name: Tom Jerry. Thanks"
llm: "Hi! Can you repeat the following name: <PERSON>. Thanks
Sure, the name is Tom Jerry. Did you want me to repeat it again?"
I am open to any suggestions.
Is there something I missed? Do I need to add more code?
Thanks for your help!
Steps To Reproduce
colang_v2 = """
import core
import llm

flow main
  $ref_act = await analyse_input
  $output = ... "'{$ref_act}'"

flow analyse_input -> $result
  user said something as $ref_use
  $result = await sanitizeInputAction(inputs=$ref_use.transcript)
  return $result
"""
config_v2_o = """
colang_version: 2.x
rails:
  input:
    flows:
      - main
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
"""
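The implementation of sanitizeInputAction is not included above. For anyone trying to reproduce this, a minimal stand-in could look like the sketch below. Everything in it is an assumption for illustration: the function name sanitize_input_action, the regex (two consecutive capitalized words, standing in for a real PII detector), and the <PERSON> placeholder taken from the example output. It is not the action used in this report.

```python
import asyncio
import re

# Hypothetical stand-in for a real PII detector:
# matches two consecutive capitalized words, e.g. "Tom Jerry".
_NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")


async def sanitize_input_action(inputs: str) -> str:
    """Return a copy of `inputs` with name-like spans replaced by <PERSON>."""
    return _NAME_PATTERN.sub("<PERSON>", inputs)


# Registering the function under the name "sanitizeInputAction" is not shown;
# how the original action was registered is unknown.

if __name__ == "__main__":
    text = "Hi! Can you repeat the following name: Tom Jerry. Thanks"
    print(asyncio.run(sanitize_input_action(text)))
```

The important detail is that the function returns the sanitized string rather than only logging it, so the calling flow has a value to work with.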
Expected Behavior
I expect:
- the flow to be triggered for any user input text (it's working)
- the sanitization of the input text (it's working)
- a reply must be generated by the llm using only the sanitized input text (not working)
- not repeating the input text (not working)
Actual Behavior
I observe:
- the flow is triggered for any user input text
- the input text is sanitized
- the LLM still receives the personal data as input
- the reply repeats the sanitized input text
@vrige thanks for opening this issue. I cannot see your action's implementation, and I guess the issue stems from there rather than from a NeMo Guardrails bug. That is why I'd like you to try the following:
Please have a look at how a similar functionality is implemented here; pay special attention to the return value of the mask_sensitive_data function.
Regarding the Colang 2.0 implementation of such flows: the flows.co file shows how this should be integrated in Colang 2.0, and flows.v1.co in Colang 1.0. You need a flow like mask sensitive data on input.
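To illustrate why the return value matters, here is a plain-Python sketch. It is not NeMo Guardrails code: the context dict, the user_message key, build_prompt, and both rail functions are hypothetical names. It assumes the prompt is built from whatever is stored as the user message, so masking only takes effect if the masked text is written back to that variable.

```python
import re

# Hypothetical masking rule: two consecutive capitalized words become <PERSON>.
_NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")


def mask_sensitive_text(text: str) -> str:
    """Return the masked text; the input string itself is not modified."""
    return _NAME_PATTERN.sub("<PERSON>", text)


def build_prompt(context: dict) -> str:
    """Toy prompt builder that reads the user message from the context."""
    return f"User: {context['user_message']}\nAssistant:"


def rail_discarding_result(context: dict) -> None:
    # The masked text is computed but never stored, so the prompt is unchanged.
    mask_sensitive_text(context["user_message"])


def rail_overwriting_message(context: dict) -> None:
    # The masked text replaces the stored user message, so the prompt sees it.
    context["user_message"] = mask_sensitive_text(context["user_message"])


if __name__ == "__main__":
    message = "Can you repeat the following name: Tom Jerry."
    for rail in (rail_discarding_result, rail_overwriting_message):
        context = {"user_message": message}
        rail(context)
        print(rail.__name__, "->", build_prompt(context))
```

In the first case the sanitization "works" in the sense that the action runs, yet the model still sees the original name, which matches the behavior described in the report.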