Need better protection against prompt injection
Dupe Check
- [X] I have searched Warp bugs and there are no duplicates
Describe the bug
It's relatively easy to inject prompts. More robust protection is needed.
To reproduce
See screenshot
Expected behavior
No response
Screenshots
Operating system
MacOS
Operating system and version
13.0
Shell Version
5.8.1
Current Warp version
v0.2024.08.06.08.01.stable_00
Regression
No, this bug or issue has existed throughout my experience using Warp
Recent working Warp date
No response
Additional context
No response
Does this block you from using Warp daily?
No
Is this an issue only in Warp?
Yes, I confirmed that this only happens in Warp, not other terminals.
Thanks for letting us know, @vikramsubramanian. I'll notify the team working on this feature set. We'll post any updates on this thread.
To anyone else facing this issue, please add a 👍 to the original post at the top or comment with your details, and subscribe if you'd like to be notified.
@dannyneira What is being done to protect against prompt injection from the web search feature? What prevents content Warp finds from a website from being used for prompt injection?
According to recent research on IDEsaster, this is a widespread problem with similar products. Has Warp been tested for vulnerabilities like this? What was the result?
https://maccarita.com/posts/idesaster/
Prompt injection is an industry-wide unsolved problem at the moment, affecting all LLM providers and users. For now, I would say your best defense against it is to disable web search in Settings > AI > Agent Profile > Call web tools. In addition, you can set the agent permissions to "Always Ask" so that no commands or code edits can be made without your explicit permission. See more in the Web search docs: https://docs.warp.dev/agents/using-agents/web-search
Learn more about Agent profiles and permissions in the docs: https://docs.warp.dev/agents/using-agents/agent-profiles-permissions
That being said, there are things we can do to help with this, such as adding an option for the agent to ask permission before adding the results of a web search to the context window, so the user can inspect the results for anything malicious.
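As a rough illustration of the "ask before adding web results to the context" idea, here is a minimal Python sketch. Everything in it (`WebResult`, `gate_web_results`, `ask_on_terminal`) is a hypothetical name invented for this example, not part of Warp or any Warp API; it only shows the shape of an approval gate that sits between a search tool and the model's context.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class WebResult:
    """Hypothetical container for one fetched search result."""
    url: str
    text: str


def gate_web_results(
    results: Iterable[WebResult],
    approve: Callable[[WebResult], bool],
) -> List[WebResult]:
    """Return only the results the approval callback accepted.

    Nothing is forwarded to the context unless `approve` returns True,
    so the default outcome for an unreviewed result is rejection.
    """
    return [r for r in results if approve(r)]


def ask_on_terminal(result: WebResult) -> bool:
    """Example approval callback: show a preview and ask the user."""
    print(f"--- {result.url} ---")
    print(result.text[:500])  # preview only; a real UI would let the user scroll
    return input("Add to agent context? [y/N] ").strip().lower() == "y"
```

A caller would pass `ask_on_terminal` (or any other policy) as the `approve` argument. Note that this only gives the user a chance to inspect content; it does not detect injected instructions by itself.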
Speaking for Linux in particular, there are some straightforward things to explore:
- Since you know the app is vulnerable to prompt injection with severe consequences, only ship it on Linux in the widely-supported Flatpak format, which is sandboxed. Let other OS packagers take responsibility if they choose to repackage it in a non-sandboxed format.
- Implement a permissions system so that Warp only has access to specific directories. Users can be prompted to add more directories or maybe even completely disable those guardrails, but Warp could at least be "secure by default" with access only to ~/Documents or something.
- Linux has plenty of options to disable and restrict capabilities, the system that containers are built on. By using containers or the capabilities system directly, limit Warp's default set of capabilities to only those needed by common users. Maybe users have an option to disable those guardrails or extend the capabilities, but again Warp has an opportunity to be "secure by default".
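To make the directory-permission suggestion above a bit more concrete, here is a minimal Python sketch of an allowlist check. The function name `is_path_allowed` and the example paths are assumptions made up for illustration; this is not how Warp works today, and an application-level check like this is weaker than an OS-level sandbox such as Flatpak or a container.

```python
from pathlib import Path
from typing import Iterable


def is_path_allowed(candidate: str, allowed_dirs: Iterable[str]) -> bool:
    """Return True only if `candidate` lies inside one of `allowed_dirs`.

    Paths are resolved first so that ".." segments and symlinks cannot be
    used to step outside an allowed directory.
    """
    resolved = Path(candidate).expanduser().resolve()
    for allowed in allowed_dirs:
        root = Path(allowed).expanduser().resolve()
        # Accept the directory itself or anything nested beneath it.
        if resolved == root or root in resolved.parents:
            return True
    return False
```

With an allowlist of `["~/Documents"]`, a request for `~/Documents/../.ssh/id_rsa` resolves to a path outside the allowed root and is rejected, which is the "secure by default" behavior described above. Users could extend the list when prompted.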
The exploits happening in the wild due to this category of vulnerability are only going to get worse as more bad actors understand how to execute them. Warp has an opportunity to lead here, as most competitors have the same flaw. When a highly publicized wave of exploits hits, that's going to be very bad for those who threw up their hands and said "it's an industry-wide problem!" and much better for the vendors who pursued a secure-by-default design.