Move logic for detecting output defence (bot filtering)

Open pmarsh-scottlogic opened this issue 1 year ago • 2 comments

Right now we handle input defence detection in handleHigherLevelChat(), but output defence detection in chatGptSendMessage(), which strikes me as out of place. I'd like to move the output defence detection up to handleHigherLevelChat(), which will make chatGptSendMessage() responsible for one fewer thing 👍

Also move the logic that detects output defences into defence.ts.

While we're here, can we rename detectTriggeredDefences() to detectTriggeredInputDefences()?
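A rough sketch of the shape I have in mind, to make the proposal concrete. Only the function names handleHigherLevelChat(), chatGptSendMessage() and detectTriggeredInputDefences() come from this ticket; detectTriggeredOutputDefences(), the DefenceReport shape, the defence ids, the parameters and the placeholder rules are all assumptions made up for illustration, and the real functions are likely async and take more arguments.

```typescript
// Hypothetical report shape; the project's real type may differ.
interface DefenceReport {
  isBlocked: boolean;
  blockedReason: string | null;
  triggeredDefences: string[];
}

// --- would live in defence.ts ---

// Renamed from detectTriggeredDefences(). The length rule is a placeholder.
function detectTriggeredInputDefences(message: string): DefenceReport {
  const tooLong = message.length > 50;
  return {
    isBlocked: tooLong,
    blockedReason: tooLong ? "Message is too long" : null,
    triggeredDefences: tooLong ? ["CHARACTER_LIMIT"] : [],
  };
}

// Hypothetical name: the output (bot filtering) check moved out of chatGptSendMessage().
function detectTriggeredOutputDefences(
  reply: string,
  blockedPhrases: string[]
): DefenceReport {
  const lowerReply = reply.toLowerCase();
  const hit = blockedPhrases.some((p) => lowerReply.includes(p.toLowerCase()));
  return {
    isBlocked: hit,
    blockedReason: hit ? "Reply contained a blocked phrase" : null,
    triggeredDefences: hit ? ["OUTPUT_FILTER"] : [],
  };
}

// --- chat code ---

// Stand-in for the real model call: it now only produces a reply.
function chatGptSendMessage(message: string): string {
  return `echo: ${message}`;
}

// Both defence checks now sit at the same level.
function handleHigherLevelChat(
  message: string,
  blockedPhrases: string[]
): { reply: string | null; defenceReport: DefenceReport } {
  const inputReport = detectTriggeredInputDefences(message);
  if (inputReport.isBlocked) {
    return { reply: null, defenceReport: inputReport };
  }
  const reply = chatGptSendMessage(message);
  const outputReport = detectTriggeredOutputDefences(reply, blockedPhrases);
  return {
    reply: outputReport.isBlocked ? null : reply,
    defenceReport: outputReport,
  };
}
```

With this layout chatGptSendMessage() knows nothing about defences, and handleHigherLevelChat() reads as: check input, get reply, check output.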

AC

Refactor ticket, so just regression testing here.

pmarsh-scottlogic, Dec 21 '23 13:12

This is likely to conflict with #705 if both are done at the same time.

pmarsh-scottlogic, Dec 21 '23 13:12

The same comments apply here as in the testing comment for #705.

pmarsh-scottlogic, Jan 23 '24 15:01